From Your Documents to Audit-Ready Answers

Generic RAG retrieves text. ComplianceGxP understands compliance.

01 📂

Ingest Your Documents

Upload SOPs, validation protocols, regulatory guidelines, deviations, and CAPAs. Each document is parsed (PDF/DOCX/XLSX/TXT), chunked along pharma-aware section boundaries, and embedded into a private FAISS index isolated per namespace.

SOP · Validation Plan · Deviation · CAPA · Regulatory · Audit
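To give a feel for what chunking along section boundaries can mean in practice, here is a minimal sketch that splits plain text wherever a numbered heading such as "4.1 Classification" starts a line. The heading pattern, the `chunk_by_section` name, and the `max_chars` limit are assumptions made for illustration; this is not ComplianceGxP's actual chunker.

```python
import re

# Assumed heading style: "1. Purpose" or "4.1 Classification" at the start of a line.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+(\S.*)$")


def chunk_by_section(text, max_chars=1200):
    """Split text into chunks that never cross a numbered section heading."""
    chunks = []
    section, title, buf = "0", "Preamble", []

    def flush():
        body = "\n".join(buf).strip()
        # An oversized section is cut into windows that all keep the same section id.
        for start in range(0, len(body), max_chars):
            chunks.append({
                "section": section,
                "title": title,
                "text": body[start:start + max_chars],
            })

    for line in text.splitlines():
        match = HEADING.match(line.strip())
        if match:
            flush()  # close the previous section before starting a new one
            section, title = match.group(1), match.group(2)
            buf.clear()
        else:
            buf.append(line)
    flush()
    return chunks
```

Keeping the section id on every chunk is what later allows a citation to point at a specific section rather than at a whole document.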
02 🧠

25 Years of Pharma Expertise, Built In

Each of the four modes (QA, Deviation, CAPA, CSV) is driven by specialised system prompts written from deep GxP experience. They encode how a real validation engineer thinks: which sections of a deviation matter, what a CAPA must contain to survive an FDA inspection, and what “risk-based” means in the context of GAMP 5 Category 4 software.

Domain knowledge you cannot prompt-engineer in an afternoon
03 🎯

Choose Your Compliance Workflow

Each mode runs a structured workflow, not a generic “chat with your docs” interface. QA surfaces sourced policy answers. Deviation runs a 5M analysis with classification. CAPA drafts corrective actions with regulatory traceability. CSV (computerised system validation) produces protocol outlines aligned to IQ/OQ/PQ.

QA Deviation CAPA CSV
04

Structured, Sourced, Audit-Ready

Every answer cites the exact document and section it came from, along with its similarity score. Every query is logged as JSONL with timestamp, user, mode, and sources. Outputs match the structure your QA team already knows and are ready to paste into your QMS.

[Major Deviation] → 5M Root Cause → Actions → Owner
Source: SOP-DEV-042 §4.1 · 21 CFR Part 11 §11.10(e)
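The JSONL audit trail described above can be pictured as one JSON object per line, appended for every query. The sketch below assumes the four fields named in the text (timestamp, user, mode, sources); the function name, the exact key names, and the UTC ISO 8601 timestamp format are illustrative guesses rather than the product's real schema.

```python
import io
import json
from datetime import datetime, timezone


def append_audit_record(stream, user, mode, sources):
    """Write one query as a single JSON line and return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # assumed format
        "user": user,
        "mode": mode,
        "sources": sources,
    }
    # One object per line keeps the log append-only and easy to parse line by line.
    stream.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


log = io.StringIO()  # stand-in for a file opened in append mode
append_audit_record(log, "qa-lead", "deviation", ["SOP-DEV-042 §4.1"])
```

Because each line is self-contained, a reviewer can filter the log by user or mode with ordinary line-based tools.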

Architecture at a Glance

FastAPI service. FAISS vector store. Claude Sonnet for generation. SHA-256-hashed keys. JSONL audit. Docker + Caddy for deploy.
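As an illustration of the SHA-256-hashed keys item, the sketch below stores only the digest of an API key and checks a presented key against the stored digests. The helper names are hypothetical, and whether ComplianceGxP adds a salt or a key-derivation step is not stated here, so treat this as the simplest possible reading.

```python
import hashlib
import hmac
import secrets


def hash_key(api_key):
    """Return the hex SHA-256 digest that would be stored instead of the raw key."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def verify_key(presented, stored_hashes):
    """Check a presented key against stored digests using constant-time comparison."""
    digest = hash_key(presented)
    return any(hmac.compare_digest(digest, stored) for stored in stored_hashes)


# The "cgxp-" prefix follows the example key shown in the query step.
new_key = "cgxp-" + secrets.token_urlsafe(32)
stored = {hash_key(new_key)}  # only the digest is persisted
```

With this approach a leaked key store exposes digests, not usable keys, and the raw key is shown to the client only once at creation time.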

🔗

RAG Pipeline

Documents → pharma-aware chunker → local embeddings (all-MiniLM-L6-v2, 384-dim) or OpenAI embeddings → FAISS IndexFlatIP (inner product over normalised vectors, which equals cosine similarity; threshold 0.3) → Claude Sonnet generation with mode-specific system prompts.
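An inner-product index only returns cosine similarity when every vector has unit length, so the retrieval step can be sketched in plain Python as follows. The toy dictionary stands in for the FAISS index, and the `search` name, the `top_k` parameter, and the choice of `>=` for the 0.3 threshold are assumptions.

```python
import math


def normalise(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def search(query, index, threshold=0.3, top_k=5):
    """Return (chunk_id, score) pairs whose cosine similarity meets the threshold."""
    q = normalise(query)
    hits = []
    for chunk_id, vec in index.items():
        # Inner product of two unit vectors is their cosine similarity.
        score = sum(a * b for a, b in zip(q, vec))
        if score >= threshold:
            hits.append((chunk_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)[:top_k]


# Toy 3-dimensional "embeddings"; real ones would be 384-dimensional.
toy_index = {
    "sop-a": normalise([1.0, 0.0, 0.0]),
    "sop-b": normalise([1.0, 1.0, 0.0]),
    "sop-c": normalise([0.0, 0.0, 1.0]),
}
hits = search([1.0, 0.0, 0.0], toy_index)
```

Chunks that fall below the threshold are dropped before generation, which keeps weakly related text out of the prompt and out of the cited sources.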

🛡

Multi-Tenant Isolation

Each client gets a separate FAISS index, metadata file, and JSONL audit log under data/clients/<namespace>/. Per-key tier flags (can_ingest, can_read_logs) enforce least-privilege access.
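The isolation model can be sketched as two small checks: resolve a namespace to its own directory under data/clients/, and gate each action on the key's tier flags. The `KeyTier` class, the `authorise` helper, the namespace pattern, and the action names are hypothetical; only the directory layout and the `can_ingest` and `can_read_logs` flags come from the description above.

```python
import re
from dataclasses import dataclass
from pathlib import PurePosixPath

DATA_ROOT = PurePosixPath("data/clients")
# Assumed naming rule: lowercase letters, digits, and hyphens only.
NAMESPACE = re.compile(r"[a-z0-9][a-z0-9-]*")


@dataclass(frozen=True)
class KeyTier:
    namespace: str
    can_ingest: bool = False
    can_read_logs: bool = False


def client_dir(namespace):
    """Map a namespace to its directory, rejecting anything that could escape it."""
    if not NAMESPACE.fullmatch(namespace):
        raise ValueError(f"invalid namespace: {namespace!r}")
    return DATA_ROOT / namespace


def authorise(tier, namespace, action):
    """Allow an action only inside the key's own namespace and tier."""
    if tier.namespace != namespace:
        return False
    if action == "ingest":
        return tier.can_ingest
    if action == "read_logs":
        return tier.can_read_logs
    return action == "query"  # querying is the baseline permission in this sketch
```

Validating the namespace before building the path is what stops a request such as "../sponsor-b" from reaching another client's data.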

📋

Validation Package

IQ-CGXP-001 (16 test cases: env, deps, file I/O), OQ-CGXP-001 (33 test cases: auth, ingestion, all 4 modes, audit log), and a PQ-CGXP-001 template per client. Run the OQ suite with python tests/validation/run_oq.py.

🤖

Agent-Ready Surfaces

HTTP API with auto-generated Swagger UI at /docs. MCP server endpoint at /mcp/ for direct connection from Claude AI & Claude Code. SHARP headers for healthcare agent interop.

Self-Host in 5 Minutes

Run ComplianceGxP on your own infrastructure — your documents, your keys, your data boundary.

# 1. Clone the repository
git clone https://github.com/llmops-pro/ComplianceGxP
cd ComplianceGxP

# 2. Configure environment
cp .env.example .env
#    → set ANTHROPIC_API_KEY=sk-ant-…
#    → set ADMIN_API_KEY to a random value; generate one with:
#        python -c "import secrets; print(secrets.token_urlsafe(32))"
#    → set DOMAIN=compliance.yourcompany.com

# 3. Start the stack (FastAPI + Caddy HTTPS)
docker compose up -d

# 4. Open the web UI
#    → https://compliance.yourcompany.com/ui/
1

Ingest Your Documents

Use the CLI or the Upload tab in the web UI. Supports PDF, DOCX, XLSX, and TXT, with automatic chunking on section boundaries.

compliancegxp ingest \
  --client my-company \
  --path ./docs/
2

Query the API

Every answer returns source citations with similarity scores and a timestamp for the audit trail. Use the web UI or call the API directly.

curl -X POST https://your-host/api/v1/query \
  -H "X-API-Key: cgxp-…" \
  -H "Content-Type: application/json" \
  -d '{"query":"CSV policy?",
       "mode":"qa"}'
3

Four Compliance Modes

Switch mode per query — each mode runs a different system prompt and output structure.

mode: "qa"        # sourced Q&A
mode: "deviation" # 5M analysis
mode: "capa"      # CAPA drafting
mode: "csv"       # IQ/OQ/PQ outlines
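For callers working from Python rather than a shell, the query call and the per-query mode switch can be combined in one small helper. This sketch uses only the standard library and mirrors the endpoint, header, and body fields shown in the curl example; the `build_query_request` name and the JSON content type are assumptions, and the response format is not shown because it is not specified here.

```python
import json
import urllib.request

MODES = {"qa", "deviation", "capa", "csv"}


def build_query_request(base_url, api_key, query, mode="qa"):
    """Build (but do not send) a POST request for the /api/v1/query endpoint."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    body = json.dumps({"query": query, "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/query",
        data=body,
        headers={
            "X-API-Key": api_key,
            "Content-Type": "application/json",  # assumed: the API expects JSON
        },
        method="POST",
    )


request = build_query_request("https://your-host", "cgxp-…", "CSV policy?", mode="qa")
```

Passing the result to `urllib.request.urlopen(request)` would send it. Rejecting unknown modes on the client side surfaces typos before a request ever leaves the machine.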
4

Multi-Tenant & Isolated

Each client gets an isolated FAISS index, metadata, and audit log. Well suited to CDMOs that need a separate namespace per sponsor.

data/clients/
  ├── sponsor-a/   # isolated
  ├── sponsor-b/   # isolated
  └── internal/    # your docs

What ships in the box:

QA Mode
Sourced Q&A · citations
Deviation Mode
5M RCA · classification
CAPA Mode
Corrective action drafts
CSV Mode
IQ / OQ / PQ outlines
GAMP 5 Validation
IQ/OQ/PQ test suite
Audit Trail
JSONL · per query
MCP Server
Claude AI & Code interop
SHARP
Healthcare agent protocol

Need a managed instance, validation evidence, or your own SOP library ingested?
See the Private Pilot Programme →

📝

Interactive API Docs

The full API surface is auto-documented via Swagger UI at /docs and ReDoc at /redoc. Raw OpenAPI 3.1 spec at /openapi.json.

Ready to try this with your own documents?

We’re accepting a limited number of qualifying pharma, biotech, and CDMO teams for a free 90-day pilot, with ComplianceGxP running on your actual SOPs, validation protocols, and regulatory guidelines.

Apply for Free Pilot