Chapter · AI

Training Large Models

How a general-purpose model is actually built. Pre-training at scale, the scaling laws that predict what a bigger model gets you, and the fine-tuning and RL steps that turn raw next-token prediction into a useful assistant.

Topics

Topic 1

Pre-training

Learning the world from a next-token-prediction objective at scale.

Planned

Topic 2

Scaling Laws

The clean power laws that predict what bigger models, more data, and more compute will buy you.

Planned

Topic 3

Training Data

Where the trillions of tokens come from, and why curation matters as much as quantity.

Planned

Topic 4

Fine-Tuning

Specializing a pre-trained model on a downstream task or domain.

Planned

Topic 5

PEFT & LoRA

Tuning a small fraction of the parameters and getting most of the gain.

Planned

Topic 6

RLHF

Turning human preferences into a reward signal, and then into a better model.

Planned

Topic 7

DPO & Preference Optimization

Skipping the separate reward model and optimizing the policy on preference pairs directly.

Planned

Topic 8

RLVR & Verifiable Rewards

When you can grade the answer, reinforcement learning gets a lot simpler — and more powerful.

Planned

Topic 9

Constitutional AI

Replacing much of the human feedback with a model critiquing and revising its own outputs against a written constitution.

Planned