Eric Jang – Building AlphaGo from scratch

Dwarkesh Podcast

May 15

Shownote

Eric Jang walks through how to build AlphaGo from scratch, but with modern AI tools. Sometimes you understand the future better by stepping backward. AlphaGo is still the cleanest worked example of the primitives of intelligence: search, learning from exp...

Highlights

In this episode, Eric Jang revisits AlphaGo not as a historical artifact, but as a pedagogical and architectural blueprint for understanding intelligence—particularly how search, learning from experience, and self-play interact to solve problems with vast combinatorial spaces.

02:43

Players can intentionally let opponents capture stones to gain greater advantage elsewhere on the board

11:14

AlphaGo's breakthrough was using neural nets to make the search problem tractable

54:07

MCTS recursively improves its own neural predictions by updating node values and visit counts through backup

1:17:32

Neural networks amortize computation to solve NP-hard problems, challenging traditional hardness assumptions

1:42:24

MCTS and Q-learning share a recursive dynamic programming property that enables value estimation without explicit search

2:00:02

Strong initialization against KataGo reduces the need for architectural tricks and auxiliary supervision objectives

2:07:23

MCTS relabeling replaces target network computation and has a stabilizing effect while better saturating the GPU

2:21:33

In local minima with flat signals, the win rate curve of an MCTS policy versus the raw network provides a clean supervision signal

2:25:22

Mythos-class models and Go-inspired RL environments offer promising paths toward verifiable AI self-improvement

Chapters

Basics of Go

00:00

Monte Carlo Tree Search

08:17

What the neural network does

32:04

Self-play

1:00:33

Alternative RL approaches

1:25:38

Why doesn't MCTS work for LLMs?

1:45:47

Off-policy training

2:01:09

RL is even more information inefficient than you thought

2:12:02

Automated AI researchers

2:22:16

Transcript

Dwarkesh Patel: Today, I'm here with Eric Jang, who was most recently vice president of AI at 1X Technologies, and before that senior research scientist at what is now Google DeepMind Robotics. And you've been on sabbatical for the last few months. One of the ...