#490 – State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI

Show Notes

Nathan Lambert and Sebastian Raschka are machine learning researchers, engineers, and educators. Nathan is the post-training lead at the Allen Institute for AI (Ai2) and the author of The RLHF Book. Sebastian is the author of Build a Large Language Model (From Scratch).

Highlights

This episode features a deep, wide-ranging conversation with AI researchers Nathan Lambert and Sebastian Raschka, exploring the technical, cultural, and philosophical dimensions of artificial intelligence as it stands in 2026.
00:00
Sebastian Raschka has written two recommended books on building models from scratch
15:10
During a jungle journey with Paul Rosolie, severe dehydration created an intense craving for electrolyte drinks.
16:29
DeepSeek R1 surprised the AI world with high performance at low cost in January 2025
35:26
Chinese models are often served with fewer GPUs per replica, making them slower and prone to different failure modes
41:28
Claude Code compares favorably to Codex for interacting with AI while avoiding low-level work
48:36
Chinese open-weight models are popular because of their permissive open-source licenses, unlike Llama or Gemma, which impose usage restrictions
1:00:24
Faster training via FP8 and improved tokens-per-second-per-GPU enable rapid experimentation but do not by themselves yield new model capabilities
1:02:44
Most low-hanging fruit in reinforcement learning with verifiable rewards and inference time scaling has already been taken.
1:24:01
Claude 3 achieved better performance with less data, highlighting the importance of data quality
1:51:51
RLVR enables iterative generate-grade loops where model behavior is learned through accuracy on verifiable tasks like math and coding
2:20:17
RLHF is arguably unsolvable at its core because it assumes human preferences can be quantified, an assumption tied to the von Neumann-Morgenstern utility theorem
2:35:36
The '996' work culture that originated in China is now being adopted by AI companies in Silicon Valley
2:42:33
'Season of the Witch' reveals pivotal San Francisco history, from the hippie revolution to the HIV/AIDS crisis, that many locals, including the speaker, were unaware of.
2:48:31
Lack of tool use hinders models from being general-purpose, and it's unclear how to interrupt generation for external tool calls in a diffusion setup the way one can pause an autoregressive chain
2:51:35
Solving open-model tooling could lead to more flexible and innovative models
2:53:17
Continual learning is essential because rising model training costs make frequent full retraining unsustainable
3:03:58
Sliding window attention is currently considered the safest and most cost-effective efficiency technique because, within the window, exact attention is preserved and no information is compressed away
3:04:59
World models in the LLM space are getting more attention and will be useful in the coming year
3:14:11
AI is 'jagged': excelling in some areas and lacking in others, even in domains close to automated software engineering
3:21:20
LLMs will eventually solve coding like calculators solve calculating
3:45:35
LLMs deliver unique value when timely, customized synthesis is needed and no dense authoritative source exists
3:49:22
Starting the advertising flywheel in AI apps is a long-term and risky bet
3:54:07
The speaker wishes more big US AI startups would go public to show how they spend money and give people investment access
4:04:06
The Atom Project is a US-based initiative to build and host high-quality open-weight AI models to compete with China's open-source AI ecosystem
4:13:35
A 'Manhattan Project' for open-source AI is unlikely and low-risk because open-source models pose no civilizational threat comparable to nuclear weapons
4:17:28
Without Jensen Huang, the deep learning revolution could have been significantly delayed
4:34:54
Humans retain agency over AI—it is a tool, not an autonomous adversary; in any human-machine conflict, humans would win
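The RLVR highlight above describes an iterative generate-grade loop rewarded by accuracy on verifiable tasks. A minimal toy sketch of that loop (all names and the stand-in "policy" are illustrative, not from the episode; a real pipeline would sample from an LLM and update it with a policy-gradient step):

```python
import random

def verify(problem, answer):
    """Verifiable reward: exact match against the ground-truth answer."""
    return 1.0 if answer == problem["answer"] else 0.0

def toy_policy(problem, rng):
    """Stand-in for an LLM: guesses the sum, occasionally off by one."""
    return problem["a"] + problem["b"] + rng.choice([-1, 0, 0, 0, 1])

def rlvr_step(problems, num_samples=4, seed=0):
    """One generate-grade iteration: sample several answers per problem,
    grade each with the verifier, and return (problem, answer, reward)
    tuples that a trainer could then learn from."""
    rng = random.Random(seed)
    graded = []
    for p in problems:
        for _ in range(num_samples):
            ans = toy_policy(p, rng)
            graded.append((p, ans, verify(p, ans)))
    return graded

problems = [{"a": 2, "b": 3, "answer": 5}, {"a": 10, "b": 7, "answer": 17}]
graded = rlvr_step(problems)
accuracy = sum(r for _, _, r in graded) / len(graded)
```

The point of the sketch is that the reward comes from a cheap, deterministic check rather than a learned preference model, which is what makes math and coding tasks such convenient RLVR targets.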

Chapters

Introduction
00:00
Sponsors, Comments, and Reflections
01:39
China vs US: Who wins the AI race?
16:29
ChatGPT vs Claude vs Gemini vs Grok: Who is winning?
25:11
Best AI for coding
36:11
Open Source vs Closed Source LLMs
43:02
Transformers: Evolution of LLMs since 2019
54:41
AI Scaling Laws: Are they dead or still holding?
1:02:38
How AI is trained: Pre-training, Mid-training, and Post-training
1:18:45
Post-training explained: Exciting new research directions in LLMs
1:51:51
Advice for beginners on how to get into AI development & research
2:12:43
Work culture in AI (72+ hour weeks)
2:35:36
Silicon Valley bubble
2:39:22
Text diffusion models and other new research directions
2:43:19
Tool use
2:49:01
Continual learning
2:53:17
Long context
2:58:39
Robotics
3:04:54
Timeline to AGI
3:14:04
Will AI replace programmers?
3:21:20
Is the dream of AGI dying?
3:39:51
How will AI make money?
3:46:40
Big acquisitions in 2026
3:51:02
Future of OpenAI, Anthropic, Google DeepMind, xAI, Meta
3:55:34
Manhattan Project for AI
4:08:08
Future of NVIDIA, GPUs, and AI compute clusters
4:14:42
Future of human civilization
4:22:48

Transcript

Lex Fridman: The following is a conversation all about the state of the art in artificial intelligence, including some of the exciting technical breakthroughs and developments in AI that happened over the past year, and some of the interesting things we t...