Learn

Things I've worked with, written down.

Structured series — read in order, or jump to what you need.

LLMs from Zero to Hero

LLM Engineering

17 parts · 17 published

A complete guide to understanding, building, and deploying large language models — from tokenization and attention to fine-tuning, alignment, and the frontier. Written for practitioners who want to go deep.

✓ 01 · A Brief History of Language Models · Jan 2026
✓ 02 · Tokens, Embeddings, and Attention · Jan 2026
✓ 03 · Attention: A Deep Dive with Intuition · Feb 2026
✓ 04 · From GPT to LLaMA: Modern LLM Architectures · Feb 2026
✓ 05 · Architecture Improvements That Actually Matter · Feb 2026
✓ 06 · Pre-training Data and Tokenization at Scale · Mar 2026
✓ 07 · Scaling Laws: How Models Get Better · Mar 2026
✓ 08 · Supervised Fine-tuning: From Model to Assistant · Mar 2026
✓ 09 · LoRA and QLoRA: Fine-tuning on a Budget · Mar 2026
✓ 10 · RLHF and DPO: Alignment from Human Feedback · Apr 2026
✓ 11 · Quantization: Smaller, Faster, Still Good · Apr 2026
✓ 12 · Inference Engines and Serving LLMs in Production · Apr 2026
✓ 13 · Retrieval-Augmented Generation (RAG) · Apr 2026
✓ 14 · Mixture of Experts: Scaling Without the Compute · Apr 2026
✓ 15 · State Space Models and Mamba · Apr 2026
✓ 16 · What's Next: The Frontier of LLM Research · Apr 2026
✓ 17 · LLM Interview Cheatsheet · Apr 2026

Building AI Agents That Work

AI Engineering

6 parts · 6 published

From the agent loop to multi-agent orchestration — how to build agents that survive production, not just demos. Covers tool use, memory, MCP, A2A, and the hardest part: knowing if your agent actually works.

✓ 01 · How AI Agents Actually Work · Apr 2026
✓ 02 · Why Your Agent Works in the Demo and Fails at Work · Apr 2026
✓ 03 · Agent Memory: How Agents Remember Across Conversations · Apr 2026
✓ 04 · MCP: The Protocol That Standardized How Agents Use Tools · Apr 2026
✓ 05 · Multi-Agent Systems: Orchestration Patterns That Work · Apr 2026
✓ 06 · Evaluating Agents: How to Know If Your Agent Actually Works · Apr 2026

LLMOps: Running AI Systems in Production

AI Engineering

4 parts · 4 published

The operational side of LLMs that nobody talks about until something breaks — monitoring, cost optimization, model routing, and managing prompts like production code.

✓ 01 · Monitoring LLMs in Production: What Actually Matters · Apr 2026
✓ 02 · The Economics of Running LLMs: How to Cut Costs Without Killing Quality · Apr 2026
✓ 03 · Model Routing: Sending the Right Query to the Right Model · Apr 2026
✓ 04 · Prompt Versioning: Managing Prompts Like Production Code · Apr 2026
