Reiner Pope – The math behind how LLMs are trained and served
Dwarkesh Podcast
1 DAY AGO
Shownote
Did a very different format with Reiner Pope - a blackboard lecture where he walks through how frontier LLMs are trained and served. It’s shocking how much you can deduce about what the labs are doing from a handful of equations, public API prices, and so...
Highlights
In this episode, Reiner Pope delivers an insightful blackboard-style lecture unpacking the hardware-aware realities of training and serving frontier large language models, using first-principles reasoning, public pricing data, and the fundamental constraints of modern GPU architecture.
Chapters
00:00 How batch size affects token cost and speed
32:09 How MoE models are laid out across GPU racks
47:12 How pipeline parallelism spreads model layers across racks
1:03:37 Why Ilya said, “As we now know, pipelining is not wise.”
1:18:59 Because of RL, models may be 100x over-trained beyond Chinchilla-optimal
1:33:02 Deducing long context memory costs from API pricing
2:04:02 Convergent evolution between neural nets and cryptography
Transcript
Dwarkesh Patel: Today, I'm interviewing Reiner Pope, who is the CEO of MatX, a new chip startup. Previously, he worked on TPU architecture and many other things at Google. This is a very different format from my usual interviews. This is going to be a...