The tension
GPUs are scarce, so sharing matters
An H100 is too big for a small model and too small for a 70B model. The platform has to decide: hand out whole GPUs, slice them, or coordinate several. Each choice has a different fairness, isolation, and performance shape.