med-routing · 4 uncertainty metrics × 3 tiers

same prompt, same tiers (nano → mini → base) — four DIFFERENT uncertainty metrics decide when to escalate. rows = metric, columns = tier. shared per-tier work (1 main answer + N samples + logprobs) feeds all four scores. free-form medical Q&A; LLM-as-judge grades each row's committed answer against a reference when available.

Batch run — sweep N random questions through the full grid; reports cumulative cost & tier distribution.

Audit log — GDPR Art. 30 record. Prompt-hash only, no PHI. Refreshes after each run.

time	prompt sha	router	score	final tier	chain → regions	cost	latency