gen-ai · k8s
Kube + LLM, made readable
Your dashboard

Pick up where you left off.

A microlearning curriculum for running generative AI workloads on Kubernetes.

Lessons completed: 0 of 21
Saved insights: — (save what matters from any card)
Concepts: 22 (browse the full library)

All paths

  • Path 01 · Foundation

    Why Kubernetes for generative AI

    You will be able to defend the choice of Kubernetes for an LLM workload — and explain what changes when the workload is a 30 GB model.

    0/3
  • Path 02 · Practical

    Model serving on Kubernetes

    You will know how to pick a model server, declare it with KServe, and deliver weights without baking them into your image.

    0/3
  • Path 03 · Practical

    GPU scheduling and resource management

    You will know how Kubernetes discovers GPUs, when to share them, and how to plan tensor and pipeline parallelism.

    0/3
  • Path 04 · Advanced

    Scaling, routing, and disaggregated serving

    You will be able to design an autoscaling, cache-aware, cost-aware inference plane that survives bursty traffic.

    0/3
  • Path 05 · Practical

    Observability for LLM systems

    You will know which metrics actually matter (TTFT, TPOT, goodput) and how to wire logs, metrics, and traces for streaming workloads.

    0/3
  • Path 06 · Advanced

    Tuning at scale: LoRA and HPC scheduling

    You will know when to fine-tune, how LoRA changes the serving story, and what gang and topology-aware scheduling buy you.

    0/3
  • Path 07 · Advanced

    AI-driven apps: RAG and agents

    You will be able to architect a RAG pipeline and a safe agentic system on Kubernetes, with state, identity, and failure domains in mind.

    0/3
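The Path 02 outcome above ("declare it with KServe, and deliver weights without baking them into your image") can be sketched as a minimal KServe `InferenceService`. Everything specific here — the service name, the storage bucket, the `huggingface` model format, the single-GPU limit — is an illustrative assumption, not part of the curriculum:

```yaml
# Hypothetical sketch: a KServe InferenceService that pulls weights from
# object storage at pod start (storageUri) instead of baking them into the
# container image. Names, bucket, and sizes are illustrative assumptions.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llama-demo                        # hypothetical service name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface                 # assumes a Hugging Face runtime is installed
      storageUri: s3://models/llama-demo/ # weights fetched at startup, not in the image
      resources:
        limits:
          nvidia.com/gpu: "1"             # one GPU per replica (see Path 03)
```

Keeping weights behind a `storageUri` is what lets a 30 GB model ship without a 30 GB image: the image stays small and cacheable, and the weights are staged by a storage initializer before the server starts.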
Concept library

22 ideas, one diagram each. The fastest way to look something up.

Review mode

Spaced flashcards built from definitions, decisions, and failure modes.

Architecture playground

Compose a real platform. See where it leaks before users do.
