Model serving on Kubernetes
Model data: weights, formats, and Modelcars
Where do 30 GB of weights actually live, and how do they get to the GPU?
Problem

Weights are not application data

Model weights are large, immutable, versioned, and hot-read by every replica on cold start. Treating them like a config file does not scale. Treat them like binaries with their own supply chain.
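
A rough back-of-envelope sketch of why the config-file approach breaks down. Only the 30 GB figure comes from the deck; the replica count and the 10 Gbit/s link are illustrative assumptions, and the function name is hypothetical. It also assumes each replica pulls its own full copy at full link speed, which is optimistic.

```python
def cold_start_pull(weights_gb: float, replicas: int, link_gbit_per_s: float) -> tuple[float, float]:
    """Return (total GB transferred, seconds per replica) for a naive pull.

    Assumes every replica downloads its own full copy of the weights and
    gets the whole link to itself (an idealized, assumed setup).
    """
    total_gb = weights_gb * replicas
    # GB -> Gbit is a factor of 8; divide by link speed in Gbit/s.
    seconds_per_replica = weights_gb * 8 / link_gbit_per_s
    return total_gb, seconds_per_replica


# 30 GB is from the slide; 8 replicas and 10 Gbit/s are assumed values.
total_gb, seconds = cold_start_pull(weights_gb=30, replicas=8, link_gbit_per_s=10)
print(f"{total_gb:.0f} GB pulled in total, about {seconds:.0f} s per replica")
```

Even under these generous assumptions, every scale-out event moves the full weight set again, which is the cost a dedicated distribution path for weights is meant to control.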