Question Categorization & Labeling System
Medical Coding Categories:
- CPT Questions: Procedure codes (Current Procedural Terminology)
- HCPCS Questions: Healthcare Common Procedure Coding System
- ICD-10-CM Questions: International Classification of Diseases diagnosis codes
- Medical Knowledge: General medical terminology and concepts
- Anatomy & Physiology: Body systems and medical fundamentals
Implementation:
- Metadata Tags: Questions tagged with secondary attributes (complexity, body system, specialty)
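
One way to represent the categories and metadata tags above is sketched here. This is a minimal illustration, not the system's actual data model: the `Category`, `Question`, and `filter_questions` names, the tag keys, and the sample questions are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    """Primary categories, taken from the list above."""
    CPT = "CPT"
    HCPCS = "HCPCS"
    ICD_10_CM = "ICD-10-CM"
    MEDICAL_KNOWLEDGE = "Medical Knowledge"
    ANATOMY_PHYSIOLOGY = "Anatomy & Physiology"


@dataclass
class Question:
    question_id: str
    text: str
    category: Category
    # Secondary attributes; the key names are illustrative assumptions.
    tags: dict[str, str] = field(default_factory=dict)


def filter_questions(questions, category=None, **tag_filters):
    """Return questions matching the category (if given) and every tag value."""
    selected = []
    for q in questions:
        if category is not None and q.category is not category:
            continue
        if all(q.tags.get(key) == value for key, value in tag_filters.items()):
            selected.append(q)
    return selected


# Hypothetical sample data, for illustration only.
questions = [
    Question("q_001", "Which code describes an MRI of the brain without contrast?",
             Category.CPT,
             {"complexity": "medium", "body_system": "nervous", "specialty": "radiology"}),
    Question("q_002", "Which bone is the longest in the human body?",
             Category.ANATOMY_PHYSIOLOGY,
             {"complexity": "low", "body_system": "skeletal"}),
]
cpt_radiology = filter_questions(questions, Category.CPT, specialty="radiology")
```

Keeping the primary category as an enum and the secondary attributes as free-form tags lets new tags be added without changing the category list.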
Evaluation Metrics Framework
Following established practice for evaluating question answering systems, the system uses:
Core Performance Metrics:
- Exact Match (EM): Binary scoring for perfect answer matches
  - Technology: String comparison algorithms
  - Benchmark: Target >85% for high-confidence answers
  - Usage: Strict evaluation for multiple choice questions
- Semantic Answer Similarity (SAS): AI-based semantic matching
  - Technology: Transformer-based cross-encoder architecture
  - Purpose: Handles equivalent answers with different wording (e.g., "100%" vs "one hundred percent")
  - Integration: Available through Haystack NLP framework
- Confidence Calibration: Reliability of confidence scores
  - Metric: Correlation between predicted confidence and actual accuracy
  - Target: Answers with confidence above 90% should be correct more than 95% of the time
  - Tool: Statistical analysis using scipy.stats
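
The Exact Match metric and the high-confidence accuracy target can be sketched with the standard library alone. This is a sketch under stated assumptions rather than the system's implementation: the function names, the normalization rule (case and whitespace only), and the sample data are hypothetical. SAS is left out because it needs a trained model. The correlation itself could be computed with a function such as `scipy.stats.pointbiserialr`, which is not used here so that the example stays dependency-free.

```python
def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(answer.lower().split())


def exact_match(predicted: str, expected: str) -> int:
    """Binary score: 1 if the normalized strings are identical, else 0."""
    return int(normalize(predicted) == normalize(expected))


def exact_match_rate(pairs) -> float:
    """Mean exact-match score over (predicted, expected) pairs."""
    return sum(exact_match(p, e) for p, e in pairs) / len(pairs)


def high_confidence_accuracy(results, threshold=0.90):
    """Accuracy among answers whose confidence exceeds the threshold.

    `results` holds (confidence, correct) pairs. Returns None when no
    answer clears the threshold.
    """
    hits = [correct for confidence, correct in results if confidence > threshold]
    return sum(hits) / len(hits) if hits else None


# Hypothetical sample data, for illustration only.
pairs = [("70551", "70551"), (" e11.9 ", "E11.9"), ("99213", "99214")]
results = [(0.95, True), (0.92, True), (0.97, False), (0.60, False)]

em_rate = exact_match_rate(pairs)                # 2 of 3 pairs match
hc_accuracy = high_confidence_accuracy(results)  # 2 of 3 high-confidence answers correct
```

Comparing `hc_accuracy` against the 95% target gives a quick check on the high-confidence band, while `em_rate` can be compared against the 85% benchmark.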
Session Management & Analytics
Individual Question Tracking:
- Technology: loguru structured logging with JSON format
- Captured Metrics:
{
"question_id": "q_001",
"category": "CPT",
"processing_time": 3.2,
"confidence_score": 0.87,
"tools_used": ["cpt_specialist", "procedures_api"],
"correct": true,
"agent_reasoning": "CPT 70551 for MRI brain without contrast"
}
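
A record like the one above could be built and serialized as follows. This is a sketch that uses only the standard library rather than loguru itself; with loguru, a sink added with `serialize=True` emits JSON-formatted records. The `QuestionRecord` class and the `accuracy_by_category` helper are hypothetical names, and the unit of `processing_time` is assumed to be seconds.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class QuestionRecord:
    question_id: str
    category: str
    processing_time: float  # assumed to be seconds; the source does not state the unit
    confidence_score: float
    tools_used: list[str]
    correct: bool
    agent_reasoning: str

    def to_json_line(self) -> str:
        """Serialize the record as a single JSON line."""
        return json.dumps(asdict(self))


def accuracy_by_category(lines):
    """Aggregate JSON-line records into per-category accuracy."""
    totals = {}
    for line in lines:
        rec = json.loads(line)
        seen, right = totals.get(rec["category"], (0, 0))
        totals[rec["category"]] = (seen + 1, right + int(rec["correct"]))
    return {cat: right / seen for cat, (seen, right) in totals.items()}


# Same values as the sample record above.
record = QuestionRecord(
    question_id="q_001",
    category="CPT",
    processing_time=3.2,
    confidence_score=0.87,
    tools_used=["cpt_specialist", "procedures_api"],
    correct=True,
    agent_reasoning="CPT 70551 for MRI brain without contrast",
)
line = record.to_json_line()
session_accuracy = accuracy_by_category([line])
```

Writing one JSON object per line keeps each question's record independently parseable, which makes session-level aggregation a simple pass over the log.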