G|AI Works

Use Case

Per-User Data Access Governance for an Internal LLM API

A professional services firm built a permission-aware LLM API that enforces document-level access controls, ensuring users can only retrieve and reason over data they are authorised to see.

Professional services · Security Engineering


At a glance

Outcomes

  • No data-boundary violations in a 90-day production window
  • Compliance review passed without remediation items on first submission
  • Any retrieval event reconstructable within 2 minutes from the audit log

Stack

  • pgvector with per-row access classification columns
  • Pre-query access control filter in the application layer
  • SSO/RBAC roles mapped to document classification tiers
  • Append-only Postgres audit log with daily integrity hash

Typical timeline

3–4 weeks

kick-off to handover

Risks & guardrails

  • Permission model complexity grows as classification tiers increase
  • Pre-filter performance overhead at high retrieval volumes — requires indexing strategy
  • Access classifications must be maintained at ingest time — stale tags create gaps

Challenge

A professional services firm was building an internal knowledge assistant that let employees query project documents, client records, and internal policies. The initial architecture retrieved documents from a shared vector store without any per-user permission filtering. A junior analyst could, in principle, retrieve documents classified for senior partners or documents belonging to a different client engagement.

The firm's information security policy required strict need-to-know access. The team needed a governance layer before the tool could be approved for internal use.

Approach

Access-aware retrieval: Modified the retrieval layer to pass the authenticated user's permission set into the query. Documents are tagged with access classifications at ingest time. The retrieval step applies a pre-filter that excludes documents the user is not authorised to see before any LLM call is made.
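The pre-filter described above can be sketched as a query builder that pushes the access check into SQL, so unauthorised rows are excluded before similarity ranking ever runs. This is a minimal sketch; the table, column names, and `UserContext` shape are illustrative assumptions, not the firm's actual schema.

```python
from dataclasses import dataclass

@dataclass
class UserContext:
    user_id: str
    tiers: list[str]   # classification tiers granted via SSO/RBAC mapping

def filtered_search_query(user: UserContext, query_vec: list[float], k: int = 8):
    """Return a parameterised pgvector query scoped to the user's tiers.

    The WHERE clause runs in the database, before any document text
    reaches the application or the LLM, so out-of-scope documents
    never enter the candidate set.
    """
    sql = (
        "SELECT doc_id, chunk_text, classification "
        "FROM documents "
        "WHERE classification = ANY(%(tiers)s) "   # access pre-filter in SQL
        "ORDER BY embedding <=> %(query_vec)s "    # pgvector cosine-distance operator
        "LIMIT %(k)s"
    )
    params = {"tiers": user.tiers, "query_vec": query_vec, "k": k}
    return sql, params
```

Keeping the filter in the query (rather than post-filtering results in Python) means the authorised result set and the ranked result set are the same set by construction.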

Audit log for access decisions: Every retrieval request is logged with user identity, document IDs retrieved, access classifications checked, and a timestamp. Denials are also logged with reason codes — not silently dropped.

Prompt boundary enforcement: The assembled context passed to the LLM includes only documents that cleared the access filter. The system prompt explicitly states that the model must not reference, infer, or speculate about content outside the provided context.
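The boundary above can be sketched as a context assembler that only ever sees documents which cleared the filter, prefixed by a system prompt restricting the model to that context. The prompt wording and document shape here are illustrative assumptions.

```python
SYSTEM_PROMPT = (
    "Answer only from the documents provided below. "
    "Do not reference, infer, or speculate about content outside them. "
    "If the answer is not in the provided documents, say so."
)

def assemble_context(cleared_docs: list[dict]) -> str:
    """Build the LLM context from documents that passed the access filter.

    Because this function receives only the post-filter set, the prompt
    boundary and the permission boundary coincide.
    """
    blocks = [f"[{d['doc_id']}]\n{d['text']}" for d in cleared_docs]
    return SYSTEM_PROMPT + "\n\n" + "\n\n".join(blocks)
```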

Weekly access review: Built a lightweight dashboard surfacing access patterns, repeated denials (potential misuse signals), and documents frequently accessed across classification boundaries.
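One of the dashboard's signals, repeated denials per user, reduces to a small aggregation over the audit rows. A minimal sketch, assuming the audit-row shape above and an illustrative threshold:

```python
from collections import Counter

def denial_hotspots(audit_rows: list[dict], threshold: int = 3) -> dict[str, int]:
    """Surface users with repeated denials in the review window.

    Repeated denials are a potential misuse signal for the weekly
    review; the threshold is an assumed tuning parameter.
    """
    counts = Counter(r["user_id"] for r in audit_rows if r["decision"] == "deny")
    return {user: n for user, n in counts.items() if n >= threshold}
```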

Typical Outcomes

Outcomes observed in this engagement — not guarantees for every deployment:

  • No data-boundary violations found in a 90-day production window (access logs reviewed weekly by the information security team)
  • Compliance review passed without remediation items on the first submission
  • Any retrieval event reconstructable within 2 minutes from the audit log, meeting internal record-keeping policy

Technical Stack

  • Vector store: pgvector with per-row access classification columns
  • Retrieval filter: pre-query access control check in the application layer (not delegated to the LLM)
  • Auth: existing SSO / RBAC roles mapped to document classification tiers
  • Audit log: append-only Postgres table, daily integrity hash
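The daily integrity hash in the stack above can be sketched as a chained digest over each day's audit rows: folding in the previous day's hash means any retroactive edit changes every subsequent digest. The serialisation format is an illustrative assumption.

```python
import hashlib
import json

def daily_integrity_hash(rows: list[dict], prev_hash: str = "") -> str:
    """Compute a tamper-evidence digest over one day's audit rows.

    Chaining the previous day's hash into the input makes the log
    tamper-evident across days, not just within a single day.
    """
    h = hashlib.sha256(prev_hash.encode())
    for row in rows:
        # Canonical JSON (sorted keys) so the digest is deterministic
        h.update(json.dumps(row, sort_keys=True).encode())
    return h.hexdigest()
```

Recomputing the chain during a review and comparing against the stored digests verifies that no row was altered or removed after the fact.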

Ready to scope this?

Let's talk about your project.

Tell us what you're building. We'll respond with a clear next step: an audit, a prototype plan, or a delivery proposal.

Start a project →