Reiner Pope – Chip design from the bottom up

Dwarkesh Podcast

Shownote

New blackboard lecture with Reiner Pope on how chips actually work, starting with basic logic gates and working up to why GPUs, TPUs, FPGAs, and the human brain each look the way they do. Reiner is CEO of MatX, a new chip startup (full disclosure - I’m...

Highlights

This episode is a detailed technical discussion of how computer chips are designed and how they function, from basic logic gates up to advanced AI accelerators. The conversation explores the fundamental trade-offs in chip architecture, focusing on the balance between computation and data movement, and compares processor types including CPUs, GPUs, TPUs, and FPGAs.
09:54
Multiply-accumulate is the core primitive in AI chips.
22:39
Register file costs motivated the shift to tensor cores.
26:10
Compute grows quadratically while communication costs grow only linearly.
48:23
Adding pipeline registers increases clock speed but consumes area.
1:01:25
FPGAs are 10x slower than ASICs.
1:03:32
Deterministic latency is possible in CPUs but avoided for market reasons.
1:10:47
Branch prediction enables high clock speeds.
1:12:05
Slowing a chip to MHz rates reduces energy linearly, but not by a full 1000x, because of idle circuits.
1:18:54
GPUs have higher data movement bandwidth than TPUs.
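
Two of the highlights above, multiply-accumulate as the core primitive and compute growing quadratically against linear communication, can be made concrete with a small sketch. This is not taken from the episode. The names mac, matmul_with_mac, and array_costs are hypothetical, and the cost model is a deliberate simplification: it assumes an n x n grid of multiply-accumulate units in which one value enters each row and one result leaves each column per cycle.

```python
def mac(acc, a, b):
    """Multiply-accumulate: add the product a * b to a running sum."""
    return acc + a * b


def matmul_with_mac(A, B):
    """Naive matrix multiply built only from mac().

    Returns the product matrix and the number of mac() calls made.
    """
    n, k, m = len(A), len(B), len(B[0])
    C = [[0] * m for _ in range(n)]
    count = 0
    for i in range(n):
        for j in range(m):
            for t in range(k):
                C[i][j] = mac(C[i][j], A[i][t], B[t][j])
                count += 1
    return C, count


def array_costs(n):
    """Toy cost model for an n x n grid of MAC units (an assumption).

    Every unit does one MAC per cycle, so compute scales as n * n.
    Data crosses only the edges: n values in plus n results out per cycle,
    so communication scales as 2 * n.
    """
    macs_per_cycle = n * n
    edge_values_per_cycle = 2 * n
    return macs_per_cycle, edge_values_per_cycle


if __name__ == "__main__":
    C, count = matmul_with_mac([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(C, count)
    for n in (4, 8, 16):
        print(n, array_costs(n))
```

Under this toy model, doubling n quadruples the MACs per cycle while only doubling the values crossing the edge of the grid, which is one way to read the 26:10 highlight. Real hardware has costs the model ignores, such as loading weights and clock distribution.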

Chapters

Building a multiply-accumulate from logic gates
00:00
Muxes and the cost of data movement
16:31
How systolic arrays work
26:10
Clock cycles and pipeline registers
39:11
FPGAs vs ASICs
51:51
Cache vs scratchpad
1:03:25
Why CPU cores are much bigger than GPU cores
1:07:27
Brains vs chips
1:12:00
A GPU is just a bunch of tiny TPUs
1:15:33

Transcript

Dwarkesh Patel: I'm back with Reiner Pope, who is the CEO of MatX, which is a new AI chip company. Last time we were talking about what happens inside a data center. Now I want to understand what happens inside an AI chip. How does a chip actually work? Full discl...