Architecture Blueprint
AI-Powered SNF Referral
Management Platform
End-to-end architecture: from faxed PDFs to admit/deny recommendations
Faxed Referrals
Hospital Portals (Epic, AllScripts, NaviHealth)
Direct PDF Uploads
E-Fax (Twilio)
HL7v2 / FHIR Feeds
1
Document Ingestion & OCR
300+ page referral packets → clean structured text · <3 min per packet
Marker
PDF → Markdown with 96% accuracy. Tables, multi-column, headers preserved. LLM-enhanced mode.
PRIMARY OCR · GPL-3.0
Surya
Transformer-based OCR. 97% similarity vs Google Cloud Vision. 90+ languages, 0.62s/page.
SCANNED DOCS · GPL-3.0
TrOCR
Microsoft's handwriting recognition. 94.6% accuracy on medical handwritten forms.
HANDWRITING · MIT
Qwen2.5-VL
Vision-language model. Reads complex pages visually: tables, charts, mixed content. 7B params.
COMPLEX DOCS · Apache 2.0
Docling
IBM layout analysis. Document structure understanding, reading order, section detection.
LAYOUT · MIT
OpenCV
Preprocessing: deskew, denoise, contrast enhance, upscale to 300 DPI. +5-10% accuracy on faxes.
PREPROCESSING · Apache 2.0
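The division of labor among the OCR engines above can be read as a per-page routing step. The following is a minimal sketch of that idea, not the platform's actual router: the `PageProfile` fields, the 0.3 handwriting threshold, and the order of precedence are all assumptions made for illustration, and the engine names are plain labels rather than library calls.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PageProfile:
    """Hypothetical per-page features from a layout/preprocessing pass."""
    page_number: int
    is_scanned: bool            # image-only page (fax or scan), no text layer
    handwriting_ratio: float    # estimated handwritten share of the page, 0..1
    has_complex_layout: bool    # dense tables, charts, or mixed content


def choose_ocr_engine(page: PageProfile, handwriting_threshold: float = 0.3) -> str:
    """Pick an engine label for one page, following the roles on the cards."""
    if page.handwriting_ratio >= handwriting_threshold:
        return "trocr"          # handwriting
    if page.has_complex_layout:
        return "qwen2.5-vl"     # complex docs
    if page.is_scanned:
        return "surya"          # scanned docs
    return "marker"             # primary OCR for everything else


def plan_packet(pages: List[PageProfile]) -> Dict[str, List[int]]:
    """Group page numbers by the engine chosen for them."""
    plan: Dict[str, List[int]] = {}
    for page in pages:
        plan.setdefault(choose_ocr_engine(page), []).append(page.page_number)
    return plan
```

Grouping pages by engine before running OCR would let each model process its pages in one batch, which matters when a packet runs to 300+ pages.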
2
Clinical NLP & Entity Extraction
Raw text → structured medical entities (diagnoses, meds, insurance, demographics)
scispaCy
Allen AI biomedical NLP. Medical NER: diagnoses, medications, procedures, anatomy. Battle-tested.
CORE NLP · Apache 2.0
BioBERT
Fine-tuned on n2c2/i2b2 datasets. ICD-10 code extraction, clinical relation detection.
NER MODEL · Apache 2.0
MedXN
Mayo Clinic medication extractor. Drug names → normalized RxNorm codes. Interaction checking.
MEDICATIONS · Open Source
Presidio
Microsoft PHI de-identification. 94% recall on patient names. HIPAA compliance layer.
DE-ID · MIT
FHIR Resources
Map entities to FHIR Patient, Condition, MedicationStatement, Coverage. Interoperable output.
STANDARDS · Open Source
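To make the entity-to-FHIR mapping concrete, here is a minimal sketch that builds FHIR R4 `Condition` and `MedicationStatement` resources as plain dictionaries. The helper names and the sample patient ID are hypothetical, and the example codes are illustrative inputs that a real pipeline would take from the extraction step and verify against a terminology service. Only a small subset of each resource's fields is shown.

```python
import json
from typing import Any, Dict

# Standard FHIR code system URIs for ICD-10-CM and RxNorm.
ICD10_CM_SYSTEM = "http://hl7.org/fhir/sid/icd-10-cm"
RXNORM_SYSTEM = "http://www.nlm.nih.gov/research/umls/rxnorm"


def to_condition(patient_id: str, icd10_code: str, display: str) -> Dict[str, Any]:
    """Wrap one extracted diagnosis in a minimal FHIR Condition."""
    return {
        "resourceType": "Condition",
        "subject": {"reference": f"Patient/{patient_id}"},
        "code": {
            "coding": [{"system": ICD10_CM_SYSTEM, "code": icd10_code, "display": display}],
            "text": display,
        },
    }


def to_medication_statement(
    patient_id: str, rxnorm_code: str, display: str, status: str = "active"
) -> Dict[str, Any]:
    """Wrap one extracted medication in a minimal FHIR MedicationStatement."""
    return {
        "resourceType": "MedicationStatement",
        "status": status,
        "subject": {"reference": f"Patient/{patient_id}"},
        "medicationCodeableConcept": {
            "coding": [{"system": RXNORM_SYSTEM, "code": rxnorm_code, "display": display}],
            "text": display,
        },
    }


if __name__ == "__main__":
    # Illustrative values only; "example-123" is a made-up patient ID.
    condition = to_condition("example-123", "I50.9", "Heart failure, unspecified")
    medication = to_medication_statement("example-123", "4603", "furosemide")
    print(json.dumps([condition, medication], indent=2))
```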
3
Intelligence Engine: Fine-Tuned LLM + RAG
Clinical reasoning, facility matching, financial analysis · Custom-trained on healthcare data
Qwen 2.5 (32B)
Base model fine-tuned with QLoRA on MIMIC-III/IV clinical data. Top clinical reasoning scores.
BASE MODEL · Apache 2.0
LLaMA-Factory
Fine-tuning framework. QLoRA r=64, 72% MedQA accuracy. Trains 32B model on single A100 in ~12hrs.
FINE-TUNING · Apache 2.0
LlamaIndex + LangChain
Hybrid RAG. Facility criteria, payer rules, drug DBs, ICD-10 codes injected per query.
RAG FRAMEWORK · MIT
Weaviate
Vector database. Multi-tenant (per facility), hybrid search, on-prem HIPAA deployment.
VECTOR DB · BSD-3
SGLang / Outlines
Guaranteed valid JSON extraction. FHIR-compatible schemas with per-field confidence scores.
STRUCTURED OUTPUT · Apache 2.0
RAGAS + DeepEval
Evaluate faithfulness, relevancy, hallucination rate. Custom clinical accuracy metrics.
EVALUATION · Apache 2.0
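The per-facility hybrid search mentioned above combines a vector similarity score with a keyword score. The sketch below illustrates that idea in pure Python and is not Weaviate's API or ranking algorithm. The document layout (`id`, `facility_id`, `text`, `vector`), the word-overlap keyword score, and the `alpha` default are assumptions chosen to keep the example self-contained.

```python
import math
from typing import Any, Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of distinct query words that also appear in the text."""
    q_terms = set(query.lower().split())
    t_terms = set(text.lower().split())
    return len(q_terms & t_terms) / len(q_terms) if q_terms else 0.0


def hybrid_search(
    query: str,
    query_vec: Sequence[float],
    docs: List[Dict[str, Any]],
    facility_id: str,
    alpha: float = 0.5,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank one facility's documents by a weighted blend of both scores.

    alpha=1.0 uses only the vector score; alpha=0.0 uses only keywords.
    """
    scored = []
    for doc in docs:
        if doc["facility_id"] != facility_id:
            continue  # tenant isolation: never score another facility's data
        score = alpha * cosine(query_vec, doc["vector"]) + (1 - alpha) * keyword_overlap(
            query, doc["text"]
        )
        scored.append((doc["id"], score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Filtering by `facility_id` before scoring mirrors the multi-tenant requirement: one facility's admission criteria should never be retrieved for another facility's query.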
4
Multi-Agent Decision System (LangGraph)
Specialized agents collaborate to produce an admit/consider/deny recommendation with transparent reasoning
Triage Agent
Classify urgency & route
→
Clinical Agent
Risk assessment & care needs
→
Financial Agent
PDPM, insurance, med costs
→
Criteria Agent
Facility match via RAG
→
Explanation Agent
Reasoning + page citations
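The agent chain above can be sketched as a sequence of functions that each read and extend a shared state. This is a plain-Python stand-in for the flow, not LangGraph code. Every rule inside the agents (the 24-hour urgency cutoff, the payer allow-list, the capability lookup) is a placeholder invented for illustration, and page citations are left out for brevity.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


def triage_agent(state: State) -> State:
    # Placeholder rule: discharge within 24 hours counts as urgent.
    hours = state["referral"]["discharge_in_hours"]
    state["urgency"] = "urgent" if hours <= 24 else "routine"
    return state


def clinical_agent(state: State) -> State:
    # Stand-in for risk assessment: only normalizes the listed care needs.
    state["care_needs"] = sorted(set(state["referral"]["care_needs"]))
    return state


def financial_agent(state: State) -> State:
    # Stand-in for PDPM and insurance analysis: a payer allow-list check.
    payer = state["referral"]["payer"]
    state["payer_accepted"] = payer in state["facility"]["accepted_payers"]
    return state


def criteria_agent(state: State) -> State:
    # Stand-in for RAG over facility criteria: a capability set lookup.
    capabilities = state["facility"]["capabilities"]
    state["unmet_needs"] = [n for n in state["care_needs"] if n not in capabilities]
    return state


def explanation_agent(state: State) -> State:
    # Turns the accumulated findings into a decision plus readable reasons.
    reasons: List[str] = [f"urgency: {state['urgency']}"]
    if state["unmet_needs"]:
        decision = "deny"
        reasons.append("unmet care needs: " + ", ".join(state["unmet_needs"]))
    elif not state["payer_accepted"]:
        decision = "consider"
        reasons.append("payer not on the facility's accepted list")
    else:
        decision = "admit"
        reasons.append("all care needs met and payer accepted")
    state["decision"] = decision
    state["reasons"] = reasons
    return state


PIPELINE: List[Callable[[State], State]] = [
    triage_agent, clinical_agent, financial_agent, criteria_agent, explanation_agent,
]


def run_pipeline(referral: Dict[str, Any], facility: Dict[str, Any]) -> State:
    """Run the agents in order, passing the shared state along."""
    state: State = {"referral": referral, "facility": facility}
    for agent in PIPELINE:
        state = agent(state)
    return state
```

Keeping every intermediate finding in the state is what makes the final recommendation explainable: the explanation step only reports what earlier agents recorded.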
5
Decision Output & Integrations
Structured recommendations pushed to EHRs, dashboards, and clinical review queues
Admit / Consider / Deny
Transparent recommendation with confidence score, reasoning chain, and source page citations from the referral packet.
PRIMARY OUTPUT
Clinical Summary
Single-page patient overview: diagnoses, medications, risks, care needs, financial projections.
SUMMARY
EHR Push
PointClickCare + MatrixCare integration. FHIR R4 resources. Bidirectional sync.
INTEGRATION
Human Review Queue
Low-confidence items routed to clinicians. Override feedback loops back into model training.
HUMAN-IN-THE-LOOP
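The human-in-the-loop step can be pictured as a confidence gate plus a feedback record. The sketch below assumes a single confidence value per recommendation and a 0.85 cutoff. Both are illustrative choices rather than figures from the blueprint, and the destination names are hypothetical labels.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Recommendation:
    referral_id: str
    decision: str       # "admit", "consider", or "deny"
    confidence: float   # 0.0 to 1.0, as reported by the decision system


def route(rec: Recommendation, min_confidence: float = 0.85) -> str:
    """Send low-confidence recommendations to clinicians, the rest onward."""
    if rec.confidence < min_confidence:
        return "human_review"
    return "ehr_push"


def override_record(rec: Recommendation, clinician_decision: str) -> Dict[str, Any]:
    """Build a feedback row for later training once a clinician has reviewed."""
    return {
        "referral_id": rec.referral_id,
        "model_decision": rec.decision,
        "model_confidence": rec.confidence,
        "clinician_decision": clinician_decision,
        "overridden": clinician_decision != rec.decision,
    }
```

Storing agreement as well as disagreement keeps the feedback set balanced, so later training does not see only the cases where the model was wrong.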
Production Infrastructure
HIPAA-compliant, scalable, monitored: 240+ packets/day on a single GPU
vLLM
Model serving. Up to 24x throughput vs. Hugging Face Transformers. AWQ quantization. OpenAI-compatible API.
SERVING · Apache 2.0
HealthStack
Open-source IaC for AWS. HIPAA Terraform modules: VPC, encryption, audit logging, BAA-ready.
INFRASTRUCTURE · OSS
MLflow + W&B
Experiment tracking, model registry, A/B testing. W&B offers a HIPAA BAA on enterprise plans.
MLOPS · Apache 2.0 / Commercial
NVIDIA L4 GPU
24GB VRAM, $1.50/hr on AWS. Single GPU handles full pipeline. Scale to multi-GPU as needed.
COMPUTE · $800/mo
FastAPI
Async API layer. OAuth2/OIDC auth. Webhook callbacks. Job queuing and rate limiting via Celery + Redis.
API · MIT
Grafana + Prometheus
Observability: latency, accuracy drift, error rates, GPU utilization, model performance.
MONITORING · OSS
18
Weeks to Build
96%
OCR Accuracy
<3m
Per Packet
240+
Packets/Day
$2.4K
Monthly Cost
100%
Open Source Core
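The throughput and compute figures quoted in this blueprint can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated above (3 minutes per packet as the upper bound, 240 packets/day, $1.50/hr) plus two assumptions: packets are processed one at a time, and a month has 730 hours.

```python
MINUTES_PER_DAY = 24 * 60
HOURS_PER_MONTH = 730  # assumed average month length in hours (8760 / 12)


def max_packets_per_day(minutes_per_packet: float) -> float:
    """Upper bound on daily packets if they are processed one at a time."""
    return MINUTES_PER_DAY / minutes_per_packet


def gpu_utilization(packets_per_day: float, minutes_per_packet: float) -> float:
    """Share of the day the pipeline is busy at a given load."""
    return packets_per_day * minutes_per_packet / MINUTES_PER_DAY


def monthly_gpu_cost(hourly_usd: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of running one GPU for the given number of hours."""
    return hourly_usd * hours


if __name__ == "__main__":
    print(max_packets_per_day(3))      # sequential ceiling at 3 min/packet
    print(gpu_utilization(240, 3))     # busy fraction at 240 packets/day
    print(monthly_gpu_cost(1.50))      # always-on cost at $1.50/hr
    print(800 / 1.50)                  # hours per month that $800 buys
```

Under these assumptions, 240 packets/day keeps a sequential pipeline busy about half the time, which leaves headroom for peaks. Running at $1.50/hr around the clock comes to about $1,095 a month, so the $800/mo compute figure would correspond to roughly 533 GPU-hours at that rate or to a discounted rate. The blueprint does not say which.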