Chapter · AI

Training Large Models

How a general-purpose model is actually built: pre-training at scale, the scaling laws that predict what going bigger gets you, and the fine-tuning and RL steps that turn raw next-token prediction into a useful assistant.

Topics
Topic 1

Pre-training

Learning the world from a next-token-prediction objective at scale.

Planned
Topic 2

Scaling Laws

The clean power laws that predict what bigger models, more data, and more compute will buy you.

Planned
Topic 3

Training Data

Where the trillions of tokens come from, and why curation matters as much as quantity.

Planned
Topic 4

Fine-Tuning

Specializing a pre-trained model on a downstream task or domain.

Planned
Topic 5

PEFT & LoRA

Tuning a fraction of the parameters and getting most of the gain.

Planned
Topic 6

RLHF

Turning human preferences into a reward signal, and then into a better model.

Planned
Topic 7

DPO & Preference Optimization

Skipping the reward model and optimizing on preferences directly.

Planned
Topic 8

RLVR & Verifiable Rewards

When you can grade the answer automatically, reinforcement learning gets a lot simpler, and more powerful.

Planned
Topic 9

Constitutional AI

Replacing much of the human feedback with a model critiquing its own outputs against a written constitution.

Planned