Architecture Blueprint

AI-Powered SNF Referral Management Platform

End-to-end architecture: from faxed PDFs to admit/deny recommendations

📠 Faxed Referrals
🏥 Hospital Portals (Epic, Allscripts, NaviHealth)
📄 Direct PDF Uploads
📧 E-Fax (Twilio)
🔗 HL7 v2 / FHIR Feeds
1
📥 Document Ingestion & OCR
300+ page referral packets → clean structured text · <3 min per packet
📝 Marker
PDF → Markdown with 96% accuracy. Preserves tables, multi-column layouts, and headers. LLM-enhanced mode.
PRIMARY OCR · GPL-3.0
🔍 Surya
Transformer-based OCR. 97% similarity vs Google Cloud Vision. 90+ languages, 0.62s/page.
SCANNED DOCS · GPL-3.0
✍️ TrOCR
Microsoft's handwriting recognition. 94.6% accuracy on medical handwritten forms.
HANDWRITING · MIT
👁️ Qwen2.5-VL
Vision-language model. Reads complex pages visually: tables, charts, mixed content. 7B params.
COMPLEX DOCS · Apache 2.0
📐 Docling
IBM layout analysis. Document structure understanding, reading order, section detection.
LAYOUT · MIT
🔧 OpenCV
Preprocessing: deskew, denoise, contrast enhance, upscale to 300 DPI. +5-10% accuracy on faxes.
PREPROCESSING · Apache 2.0
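The cards above assign each OCR tool a page type (primary, scanned, handwriting, complex). A minimal sketch of how a per-page router could encode that split, assuming an upstream classifier supplies the flags; the `PageTraits` fields, the engine labels, and the precedence order are all hypothetical, not part of the blueprint:

```python
from dataclasses import dataclass


@dataclass
class PageTraits:
    """Hypothetical per-page flags from an upstream page classifier."""
    has_text_layer: bool = False   # born-digital PDF page with embedded text
    handwritten: bool = False      # mostly handwritten content
    complex_layout: bool = False   # charts, dense tables, mixed content


def choose_engine(page: PageTraits) -> str:
    """Pick an engine label for one page (precedence order is an assumption)."""
    if page.handwritten:
        return "trocr"        # HANDWRITING card
    if page.complex_layout:
        return "qwen2.5-vl"   # COMPLEX DOCS card
    if not page.has_text_layer:
        return "surya"        # SCANNED DOCS card
    return "marker"           # PRIMARY OCR card


def needs_preprocessing(page: PageTraits) -> bool:
    """Only image-based pages (faxes, scans) go through the OpenCV cleanup step."""
    return not page.has_text_layer
```

Routing per page rather than per packet lets a 300+ page packet mix engines, so the slower vision-language model only sees the pages that need it.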
2
🧬 Clinical NLP & Entity Extraction
Raw text → structured medical entities (diagnoses, meds, insurance, demographics)
🏥 scispaCy
Allen AI clinical NLP. Medical NER: diagnoses, medications, procedures, anatomy. Battle-tested.
CORE NLP · Apache 2.0
🧠 BioBERT
Fine-tuned on n2c2/i2b2 datasets. ICD-10 code extraction, clinical relation detection.
NER MODEL · Apache 2.0
💊 MedXN
Mayo Clinic medication extractor. Drug names → RxNorm normalized codes. Interaction checking.
MEDICATIONS · Open Source
🔒 Presidio
Microsoft PHI de-identification. 94% recall on patient names. HIPAA compliance layer.
DE-ID · MIT
🏗️ FHIR Resources
Map entities to FHIR Patient, Condition, MedicationStatement, Coverage. Interoperable output.
STANDARDS · Open Source
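To illustrate the entity-to-FHIR mapping named above, here is a minimal sketch that turns one extracted diagnosis into a FHIR Condition as a plain dictionary. The function name and the example patient ID are hypothetical; only a small subset of Condition fields is shown, and a real pipeline would validate the result against the FHIR schema:

```python
import json

# Standard FHIR code-system URI for ICD-10-CM.
ICD10_CM_SYSTEM = "http://hl7.org/fhir/sid/icd-10-cm"


def to_fhir_condition(patient_id: str, icd10_code: str, display: str) -> dict:
    """Build a minimal FHIR Condition for one extracted diagnosis."""
    return {
        "resourceType": "Condition",
        "subject": {"reference": f"Patient/{patient_id}"},
        "code": {
            "coding": [
                {"system": ICD10_CM_SYSTEM, "code": icd10_code, "display": display}
            ],
            "text": display,
        },
    }


if __name__ == "__main__":
    # "example-123" is a made-up identifier used only for illustration.
    condition = to_fhir_condition("example-123", "I50.9", "Heart failure, unspecified")
    print(json.dumps(condition, indent=2))
```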
3
🧠 Intelligence Engine: Fine-Tuned LLM + RAG
Clinical reasoning, facility matching, financial analysis · Custom-trained on healthcare data
⚡ Qwen 2.5 (32B)
Base model fine-tuned with QLoRA on MIMIC-III/IV clinical data. Top clinical reasoning scores.
BASE MODEL · Apache 2.0
🔧 LLaMA-Factory
Fine-tuning framework. QLoRA r=64, 72% MedQA accuracy. Trains 32B model on single A100 in ~12hrs.
FINE-TUNING · Apache 2.0
📚 LlamaIndex + LangChain
Hybrid RAG. Facility criteria, payer rules, drug DBs, ICD-10 codes injected per query.
RAG FRAMEWORK · MIT/Apache
🗄️ Weaviate
Vector database. Multi-tenant (per facility), hybrid search, on-prem HIPAA deployment.
VECTOR DB · BSD-3
📋 SGLang / Outlines
Guaranteed valid JSON extraction. FHIR-compatible schemas with per-field confidence scores.
STRUCTURED OUTPUT · Apache 2.0
📊 RAGAS + DeepEval
Evaluate faithfulness, relevancy, hallucination rate. Custom clinical accuracy metrics.
EVALUATION · Apache 2.0
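The hybrid search mentioned above blends keyword and vector relevance. The sketch below shows one generic way to fuse the two score sets (min-max normalisation plus a weighted sum); it is not Weaviate's internal algorithm, and the function names and document IDs are made up for illustration. The `alpha` convention (1.0 = vector only, 0.0 = keyword only) follows the one Weaviate documents for its hybrid queries:

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so keyword and vector scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(keyword: Dict[str, float], vector: Dict[str, float],
                alpha: float = 0.5) -> List[str]:
    """Rank document IDs by alpha * vector + (1 - alpha) * keyword score."""
    kw, vec = min_max(keyword), min_max(vector)
    fused = {
        doc_id: alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(kw) | set(vec)
    }
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))
```

Exact-match lookups such as payer rule IDs or ICD-10 codes tend to favour a lower `alpha`, while free-text facility criteria tend to favour a higher one; the right value is something to tune with the evaluation tools listed above.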
4
🤖 Multi-Agent Decision System (LangGraph)
Specialized agents collaborate to produce admit/consider/deny with transparent reasoning
🔀
Triage Agent
Classify urgency & route
→
🩺
Clinical Agent
Risk assessment & care needs
→
💰
Financial Agent
PDPM, insurance, med costs
→
✅
Criteria Agent
Facility match via RAG
→
📝
Explanation Agent
Reasoning + page citations
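The five-agent chain above can be pictured as functions passing a shared state from left to right. The sketch below uses plain Python rather than the LangGraph API, and every rule and field name in it (`discharge_within_24h`, `capabilities`, `accepted_payers`, and so on) is a hypothetical placeholder standing in for model calls and RAG lookups:

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


def triage_agent(state: State) -> State:
    # Placeholder rule: a discharge within 24 hours is treated as urgent.
    state["urgency"] = "urgent" if state["referral"]["discharge_within_24h"] else "routine"
    return state


def clinical_agent(state: State) -> State:
    # care_needs maps each need to the packet page where it was found.
    state["care_needs"] = dict(state["referral"]["care_needs"])
    return state


def financial_agent(state: State) -> State:
    state["payer_ok"] = state["referral"]["payer"] in state["facility"]["accepted_payers"]
    return state


def criteria_agent(state: State) -> State:
    capabilities = state["facility"]["capabilities"]
    state["unmet"] = {n: p for n, p in state["care_needs"].items() if n not in capabilities}
    return state


def explanation_agent(state: State) -> State:
    if state["unmet"]:
        decision = "deny"
        reasons = [f"{need} (packet p.{page}) is not offered by the facility"
                   for need, page in sorted(state["unmet"].items())]
    elif not state["payer_ok"]:
        decision, reasons = "consider", ["payer is not on the facility's accepted list"]
    else:
        decision, reasons = "admit", ["all care needs and payer criteria are met"]
    state["decision"], state["reasons"] = decision, reasons
    return state


PIPELINE: List[Callable[[State], State]] = [
    triage_agent, clinical_agent, financial_agent, criteria_agent, explanation_agent,
]


def run_pipeline(referral: Dict[str, Any], facility: Dict[str, Any]) -> State:
    """Run the agents in the order shown in the diagram."""
    state: State = {"referral": referral, "facility": facility}
    for agent in PIPELINE:
        state = agent(state)
    return state
```

Keeping each reason tied to a packet page number is what makes the final recommendation auditable by a clinician.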
5
📤 Decision Output & Integrations
Structured recommendations pushed to EHRs, dashboards, and clinical review queues
⚖️ Admit / Consider / Deny
Transparent recommendation with confidence score, reasoning chain, and source page citations from the referral packet.
PRIMARY OUTPUT
📊 Clinical Summary
Single-page patient overview: diagnoses, medications, risks, care needs, financial projections.
SUMMARY
🔌 EHR Push
PointClickCare + MatrixCare integration. FHIR R4 resources. Bidirectional sync.
INTEGRATION
👨‍⚕️ Human Review Queue
Low-confidence items routed to clinicians. Override feedback loops back into model training.
HUMAN-IN-THE-LOOP
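A minimal sketch of the human-in-the-loop routing described above, assuming each recommendation carries an overall confidence and per-field confidences. The 0.85 threshold, the queue names, and the record layout are assumptions for illustration, not values from the blueprint:

```python
from typing import Any, Dict

REVIEW_THRESHOLD = 0.85  # assumed cut-off; would be tuned per facility


def route_recommendation(rec: Dict[str, Any]) -> Dict[str, Any]:
    """Send a recommendation to human review if any confidence is below threshold."""
    low_fields = sorted(
        name for name, field in rec["fields"].items()
        if field["confidence"] < REVIEW_THRESHOLD
    )
    needs_review = bool(low_fields) or rec["confidence"] < REVIEW_THRESHOLD
    return {
        "queue": "human_review" if needs_review else "auto_release",
        "low_confidence_fields": low_fields,
    }


def record_override(rec: Dict[str, Any], clinician_decision: str) -> Dict[str, Any]:
    """Capture the clinician's decision as a labelled example for later training."""
    return {
        "model_decision": rec["decision"],
        "clinician_decision": clinician_decision,
        "overridden": rec["decision"] != clinician_decision,
    }
```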
⚙️
🏭 Production Infrastructure
HIPAA-compliant, scalable, and monitored · 240+ packets/day on a single GPU
🚀 vLLM
Model serving. Up to 24x throughput vs Hugging Face Transformers. AWQ quantization. OpenAI-compatible API.
SERVING · Apache 2.0
☁️ HealthStack
Open-source IaC for AWS. HIPAA Terraform modules: VPC, encryption, audit logging, BAA-ready.
INFRASTRUCTURE · OSS
📈 MLflow + W&B
Experiment tracking, model registry, A/B testing. W&B has HIPAA BAA for enterprise.
MLOPS · Apache/Comm
🖥️ NVIDIA L4 GPU
24GB VRAM, $1.50/hr on AWS. Single GPU handles full pipeline. Scale to multi-GPU as needed.
COMPUTE · $800/mo
🌐 FastAPI
Async API layer. OAuth2/OIDC auth. Webhook callbacks. Rate limiting via Celery + Redis.
API · MIT
📊 Grafana + Prometheus
Observability: latency, accuracy drift, error rates, GPU utilization, model performance.
MONITORING · OSS
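The capacity and cost figures quoted in this blueprint (<3 min per packet, 240+ packets/day, $1.50/hr, $800/mo) can be sanity-checked with simple arithmetic. The sketch below assumes packets are processed one at a time on a single GPU and that an always-on month is 730 hours; both are assumptions, as are the helper names:

```python
MINUTES_PER_DAY = 24 * 60
HOURS_PER_MONTH = 730.0  # assumption: 8760 hours per year / 12


def max_packets_per_day(minutes_per_packet: float) -> float:
    """Upper bound on daily packets for a strictly serial pipeline."""
    return MINUTES_PER_DAY / minutes_per_packet


def busy_fraction(packets_per_day: float, minutes_per_packet: float) -> float:
    """Share of the day the pipeline is busy at a given load."""
    return packets_per_day * minutes_per_packet / MINUTES_PER_DAY


def monthly_gpu_cost(hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of running one GPU for the given number of hours."""
    return hourly_rate * hours


def hours_covered(monthly_budget: float, hourly_rate: float) -> float:
    """GPU-hours a monthly budget buys at the given hourly rate."""
    return monthly_budget / hourly_rate


if __name__ == "__main__":
    print(max_packets_per_day(3))     # 480.0
    print(busy_fraction(240, 3))      # 0.5
    print(monthly_gpu_cost(1.50))     # 1095.0
    print(round(hours_covered(800, 1.50), 1))
```

Under these assumptions, 240 packets/day at 3 minutes each keeps a serial pipeline busy for half the day, leaving headroom for bursts. An always-on GPU at $1.50/hr would cost about $1,095/mo, so the $800/mo figure presumably reflects part-time usage (roughly 533 GPU-hours) or discounted pricing.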
18
Weeks to Build
96%
OCR Accuracy
<3m
Per Packet
240+
Packets/Day
$2.4K
Monthly Cost
100%
Open Source Core