🧠 AI System Design

📋 Full Curriculum

All 30 days at a glance. Status key: Ready = content complete; Stub = structure in place, content still needs research expansion.

Week 1: Foundations

7 days
Day | Title | Topic | Status
Day 1 | ML System Blueprint | The three pillars: compute, storage, orchestration | Ready
Day 2 | Sync vs Async Inference | Request-response, batch, streaming: tradeoffs in latency, throughput, cost | Ready
Day 3 | Caching Strategies | Semantic caching, KV-cache reuse, prompt caching | Ready
Day 4 | Load Balancing & Routing | Round-robin, least-connections, semantic routers | Ready
Day 5 | Stateless vs Stateful Inference | KV cache, conversation history, managing state at scale | Ready
Day 6 | Rate Limiting & Quotas | Token bucket, sliding window, user-tier enforcement | Ready
Day 7 | Mini-Project: AI Gateway | Containerise your proxy, cache, and rate limiter into a unified gateway | Ready

Week 2: Data & Training

7 days
Day | Title | Topic | Status
Day 8 | Data Pipelines | Extract, transform, embed, store; batch vs streaming | Stub
Day 9 | Vector Databases | Index types (IVF, HNSW), tradeoffs, hybrid search | Stub
Day 10 | RAG Architecture | Ingestion pipeline, retriever, reranker, generator | Stub
Day 11 | Distributed Training 101 | Data parallelism, model parallelism, pipeline parallelism | Stub
Day 12 | Checkpointing & Fault Tolerance | Save/restore training state, preemption handling, spot instances | Stub
Day 13 | Experiment Tracking | Structured logging, hyperparameter sweeps, model registry | Stub
Day 14 | Mini-Project: End-to-End RAG | Integrate pipeline + vector DB + LLM into a complete RAG system | Stub

Week 3: Serving & Inference

7 days
Day | Title | Topic | Status
Day 15 | Inference Optimization | Quantization (GGUF, GPTQ, AWQ), throughput, quality tradeoffs | Stub
Day 16 | Continuous Batching & Speculative Decoding | How vLLM/TGI achieve high throughput | Stub
Day 17 | Prefill vs Decode | The two phases of transformer inference | Stub
Day 18 | GPU vs CPU Offloading | Layer placement, PCIe bottlenecks, memory hierarchy | Stub
Day 19 | Streaming & SSE | Token streaming fan-out, head-of-line blocking prevention | Stub
Day 20 | Model Adapters & LoRA | Adapter swapping, multi-task serving, parameter-efficient fine-tuning | Stub
Day 21 | Mini-Project: Inference Benchmark Suite | Script that sweeps parameters and produces a comparison table | Stub

Week 4: Production & Case Studies

9 days
Day | Title | Topic | Status
Day 22 | Observability | Metrics, traces, logs: the three pillars for AI systems | Stub
Day 23 | Guardrails & Safety | Input/output filtering, PII detection, prompt injection defense | Stub
Day 24 | A/B Testing & Canary Deployments | Shadow traffic, gradual rollout, automated rollback | Stub
Day 25 | Case Study: ChatGPT | The infrastructure behind a global chat product | Stub
Day 26 | Case Study: Perplexity | Real-time web search + RAG at scale | Stub
Day 27 | Case Study: GitHub Copilot | Context window management, code-specific embeddings, fast completion | Stub
Day 28 | Cost Engineering | Token economics, cache hit rates, model selection by task difficulty | Stub
Day 29 | Scaling Law Intuition | How data and compute affect system cost: hardware vs optimisation | Stub
Day 30 | Final Project: Design a Production AI System | End-to-end architecture design, from concept to deployment | Stub