Observability for LLM systems
Metrics that matter for LLM serving
TTFT, TPOT, throughput, and why CPU/memory miss the point
Framing

Old dashboards do not tell the new story

Standard infrastructure dashboards report that the GPU is busy. They do not show whether users are waiting for a first token, whether tokens stream smoothly, or whether requests are stuck in a queue. LLM-specific metrics close that gap.
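To make the three metrics concrete, here is a minimal sketch of how they could be derived from per-request timestamps. It assumes the common readings of the acronyms: TTFT as time to first token and TPOT as time per output token. The `RequestTrace` type and its fields are hypothetical names chosen for illustration, not part of any particular serving stack.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTrace:
    """Hypothetical per-request record; all times are in seconds."""
    arrival: float            # when the request reached the server
    token_times: List[float]  # timestamp of each streamed output token


def ttft(trace: RequestTrace) -> float:
    """Time to first token: how long the user waits before anything appears.
    Includes queueing delay as well as prompt processing."""
    return trace.token_times[0] - trace.arrival


def tpot(trace: RequestTrace) -> float:
    """Time per output token: average gap between tokens after the first.
    Reflects how smoothly the response streams."""
    n = len(trace.token_times)
    if n < 2:
        return 0.0  # a single token has no inter-token gap
    return (trace.token_times[-1] - trace.token_times[0]) / (n - 1)


def throughput(traces: List[RequestTrace]) -> float:
    """Output tokens per second across all requests in the window."""
    total_tokens = sum(len(t.token_times) for t in traces)
    start = min(t.arrival for t in traces)
    end = max(t.token_times[-1] for t in traces)
    return total_tokens / (end - start)


if __name__ == "__main__":
    traces = [
        RequestTrace(arrival=0.0, token_times=[0.5, 1.0, 1.5, 2.0]),
        RequestTrace(arrival=1.0, token_times=[2.0, 3.0]),
    ]
    for t in traces:
        print(f"TTFT={ttft(t):.2f}s  TPOT={tpot(t):.2f}s")
    print(f"throughput={throughput(traces):.2f} tokens/s")
```

None of these values can be read off a GPU utilization graph: a server can show a fully busy GPU while TTFT climbs because requests are waiting in the queue.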