G|AI Works

Use Case

Per-User Data Access Governance for an Internal LLM API

A professional services firm built a permission-aware LLM API that enforces document-level access controls, ensuring users can only retrieve and reason over data they are authorised to see.

Professional services · Security Engineering


At a glance

Outcomes

  • No data-boundary violations in a 90-day production window
  • Compliance review passed without remediation items on first submission
  • Any retrieval event reconstructable within 2 minutes from the audit log

Stack

  • pgvector with per-row access classification columns
  • Pre-query access control filter in the application layer
  • SSO/RBAC roles mapped to document classification tiers
  • Append-only Postgres audit log with daily integrity hash

Typical timeline

3–4 weeks

kick-off to handover

Risks & guardrails

  • Permission model complexity grows as classification tiers increase
  • Pre-filter performance overhead at high retrieval volumes — requires indexing strategy
  • Access classifications must be maintained at ingest time — stale tags create gaps

Challenge

A professional services firm was building an internal knowledge assistant that let employees query project documents, client records, and internal policies. The initial architecture retrieved documents from a shared vector store without any per-user permission filtering. A junior analyst could, in principle, retrieve documents classified for senior partners or documents belonging to a different client engagement.

The firm's information security policy required strict need-to-know access. The team needed a governance layer before the tool could be approved for internal use.

Approach

Access-aware retrieval: Modified the retrieval layer to pass the authenticated user's permission set into the query. Documents are tagged with access classifications at ingest time. The retrieval step applies a pre-filter that excludes documents the user is not authorised to see before any LLM call is made.
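The pre-filter described above can be sketched as a query builder that pushes the access check into SQL, so unauthorised rows are excluded before similarity ranking ever runs. This is a minimal sketch; the table, column names, and `UserContext` shape are illustrative assumptions, not the firm's actual schema.

```python
from dataclasses import dataclass

@dataclass
class UserContext:
    user_id: str
    tiers: list[str]   # classification tiers granted via SSO/RBAC mapping

def filtered_search_query(user: UserContext, query_vec: list[float], k: int = 8):
    """Return a parameterised pgvector query scoped to the user's tiers.

    The WHERE clause runs in the database, before any document text
    reaches the application or the LLM, so out-of-scope documents
    never enter the candidate set.
    """
    sql = (
        "SELECT doc_id, chunk_text, classification "
        "FROM documents "
        "WHERE classification = ANY(%(tiers)s) "   # access pre-filter in SQL
        "ORDER BY embedding <=> %(query_vec)s "    # pgvector cosine-distance operator
        "LIMIT %(k)s"
    )
    params = {"tiers": user.tiers, "query_vec": query_vec, "k": k}
    return sql, params
```

Keeping the filter in the query (rather than post-filtering results in Python) means the authorised result set and the ranked result set are the same set by construction.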

Audit log for access decisions: Every retrieval request is logged with user identity, document IDs retrieved, access classifications checked, and a timestamp. Denials are also logged with reason codes — not silently dropped.

Prompt boundary enforcement: The assembled context passed to the LLM includes only documents that cleared the access filter. The system prompt explicitly states that the model must not reference, infer, or speculate about content outside the provided context.
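The boundary above can be sketched as a context assembler that only ever sees documents which cleared the filter, prefixed by a system prompt restricting the model to that context. The prompt wording and document shape here are illustrative assumptions.

```python
SYSTEM_PROMPT = (
    "Answer only from the documents provided below. "
    "Do not reference, infer, or speculate about content outside them. "
    "If the answer is not in the provided documents, say so."
)

def assemble_context(cleared_docs: list[dict]) -> str:
    """Build the LLM context from documents that passed the access filter.

    Because this function receives only the post-filter set, the prompt
    boundary and the permission boundary coincide.
    """
    blocks = [f"[{d['doc_id']}]\n{d['text']}" for d in cleared_docs]
    return SYSTEM_PROMPT + "\n\n" + "\n\n".join(blocks)
```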

Weekly access review: Built a lightweight dashboard surfacing access patterns, repeated denials (potential misuse signals), and documents frequently accessed across classification boundaries.
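One of the dashboard's signals, repeated denials per user, reduces to a small aggregation over the audit rows. A minimal sketch, assuming the audit-row shape above and an illustrative threshold:

```python
from collections import Counter

def denial_hotspots(audit_rows: list[dict], threshold: int = 3) -> dict[str, int]:
    """Surface users with repeated denials in the review window.

    Repeated denials are a potential misuse signal for the weekly
    review; the threshold is an assumed tuning parameter.
    """
    counts = Counter(r["user_id"] for r in audit_rows if r["decision"] == "deny")
    return {user: n for user, n in counts.items() if n >= threshold}
```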

Typical Outcomes

Outcomes observed in this engagement — not guarantees for every deployment:

  • No data-boundary violations found in a 90-day production window (access logs reviewed weekly by the information security team)
  • Compliance review passed without remediation items on the first submission
  • Any retrieval event reconstructable within 2 minutes from the audit log, meeting internal record-keeping policy

Technical Stack

  • Vector store: pgvector with per-row access classification columns
  • Retrieval filter: pre-query access control check in the application layer (not delegated to the LLM)
  • Auth: existing SSO / RBAC roles mapped to document classification tiers
  • Audit log: append-only Postgres table, daily integrity hash
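The daily integrity hash in the stack above can be sketched as a chained digest over each day's audit rows: folding in the previous day's hash means any retroactive edit changes every subsequent digest. The serialisation format is an illustrative assumption.

```python
import hashlib
import json

def daily_integrity_hash(rows: list[dict], prev_hash: str = "") -> str:
    """Compute a tamper-evidence digest over one day's audit rows.

    Chaining the previous day's hash into the input makes the log
    tamper-evident across days, not just within a single day.
    """
    h = hashlib.sha256(prev_hash.encode())
    for row in rows:
        # Canonical JSON (sorted keys) so the digest is deterministic
        h.update(json.dumps(row, sort_keys=True).encode())
    return h.hexdigest()
```

Recomputing the chain during a review and comparing against the stored digests verifies that no row was altered or removed after the fact.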

Ready to scope this?

Let's talk about your project.

Tell us what you're building. We'll respond with a clear next step: an audit, a prototype plan, or a delivery proposal.

Start a project →