#490 – State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI

Show Notes

Nathan Lambert and Sebastian Raschka are machine learning researchers, engineers, and educators. Nathan is the post-training lead at the Allen Institute for AI (Ai2) and the author of The RLHF Book. Sebastian is the author of Build a Large Language Model (From Scratch).

Highlights

This episode features a deep, wide-ranging conversation with AI researchers Nathan Lambert and Sebastian Raschka, exploring the technical, cultural, and philosophical dimensions of artificial intelligence as it stands in 2026.
00:00
Sebastian Raschka has written two recommended books on building models from scratch
15:10
During a jungle journey with Paul Rosolie, severe dehydration created an intense craving for electrolyte drinks.
16:29
DeepSeek R1 surprised the AI world with high performance at low cost in January 2025
35:26
Chinese models are often served with fewer GPUs per replica, making them slower and prone to different failure modes
41:28
Claude Code compares favorably to Codex for interacting with AI while avoiding low-level work
48:36
Chinese open-weight models are popular because of their permissive open-source licenses, unlike Llama or Gemma, which impose usage restrictions
1:00:24
Faster training via FP8 and improved tokens-per-second-per-GPU enable rapid experimentation but do not by themselves yield new model capabilities
1:02:44
Most low-hanging fruit in reinforcement learning with verifiable rewards and inference time scaling has already been taken.
1:24:01
Claude 3 achieved better performance with less data, highlighting the importance of data quality
1:51:51
RLVR enables iterative generate-grade loops where model behavior is learned through accuracy on verifiable tasks like math and coding
2:20:17
RLHF is arguably unsolvable at its core because it assumes human preferences can be quantified, an assumption tied to the von Neumann-Morgenstern utility theorem
2:35:36
The '996' work culture that originated in China is now being adopted by AI companies in Silicon Valley
2:42:33
'Season of the Witch' reveals pivotal San Francisco history, from the hippie revolution to the HIV/AIDS crisis, that many locals, including the speaker, were unaware of.
2:48:31
Lack of tool use hinders models from being general-purpose, and it's unclear how to interrupt generation for external tool calls in a diffusion setup the way one can pause an autoregressive chain
2:51:35
Solving open-model tooling could lead to more flexible and innovative models
2:53:17
Continual learning is essential because rising model training costs make frequent full retraining unsustainable
3:03:58
Sliding window attention is currently considered the safest and most cost-effective efficiency technique because, within the window, exact attention is preserved and no information is compressed away
3:04:59
World models in the LLM space are getting more attention and will be useful in the coming year
3:14:11
AI is 'jagged': excelling in some areas and lacking in others, even in domains close to automated software engineering
3:21:20
LLMs will eventually solve coding like calculators solve calculating
3:45:35
LLMs deliver unique value when timely, customized synthesis is needed and no dense authoritative source exists
3:49:22
Starting the advertising flywheel in AI apps is a long-term and risky bet
3:54:07
The speaker wishes more big US AI startups would go public to show how they spend money and give people investment access
4:04:06
The Atom Project is a US-based initiative to build and host high-quality open-weight AI models to compete with China's open-source AI ecosystem
4:13:35
A 'Manhattan Project' for open-source AI is unlikely and low-risk because open-source models pose no civilizational threat comparable to nuclear weapons
4:17:28
Without Jensen Huang, the deep learning revolution could have been significantly delayed
4:34:54
Humans retain agency over AI—it is a tool, not an autonomous adversary; in any human-machine conflict, humans would win
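The RLVR highlight above describes an iterative generate-grade loop rewarded by accuracy on verifiable tasks. A minimal toy sketch of that loop (all names and the stand-in "policy" are illustrative, not from the episode; a real pipeline would sample from an LLM and update it with a policy-gradient step):

```python
import random

def verify(problem, answer):
    """Verifiable reward: exact match against the ground-truth answer."""
    return 1.0 if answer == problem["answer"] else 0.0

def toy_policy(problem, rng):
    """Stand-in for an LLM: guesses the sum, occasionally off by one."""
    return problem["a"] + problem["b"] + rng.choice([-1, 0, 0, 0, 1])

def rlvr_step(problems, num_samples=4, seed=0):
    """One generate-grade iteration: sample several answers per problem,
    grade each with the verifier, and return (problem, answer, reward)
    tuples that a trainer could then learn from."""
    rng = random.Random(seed)
    graded = []
    for p in problems:
        for _ in range(num_samples):
            ans = toy_policy(p, rng)
            graded.append((p, ans, verify(p, ans)))
    return graded

problems = [{"a": 2, "b": 3, "answer": 5}, {"a": 10, "b": 7, "answer": 17}]
graded = rlvr_step(problems)
accuracy = sum(r for _, _, r in graded) / len(graded)
```

The point of the sketch is that the reward comes from a cheap, deterministic check rather than a learned preference model, which is what makes math and coding tasks such convenient RLVR targets.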

Chapters

Introduction
00:00
Sponsors, Comments, and Reflections
01:39
China vs US: Who wins the AI race?
16:29
ChatGPT vs Claude vs Gemini vs Grok: Who is winning?
25:11
Best AI for coding
36:11
Open Source vs Closed Source LLMs
43:02
Transformers: Evolution of LLMs since 2019
54:41
AI Scaling Laws: Are they dead or still holding?
1:02:38
How AI is trained: Pre-training, Mid-training, and Post-training
1:18:45
Post-training explained: Exciting new research directions in LLMs
1:51:51
Advice for beginners on how to get into AI development & research
2:12:43
Work culture in AI (72+ hour weeks)
2:35:36
Silicon Valley bubble
2:39:22
Text diffusion models and other new research directions
2:43:19
Tool use
2:49:01
Continual learning
2:53:17
Long context
2:58:39
Robotics
3:04:54
Timeline to AGI
3:14:04
Will AI replace programmers?
3:21:20
Is the dream of AGI dying?
3:39:51
How will AI make money?
3:46:40
Big acquisitions in 2026
3:51:02
Future of OpenAI, Anthropic, Google DeepMind, xAI, Meta
3:55:34
Manhattan Project for AI
4:08:08
Future of NVIDIA, GPUs, and AI compute clusters
4:14:42
Future of human civilization
4:22:48

Transcript

Lex Fridman: The following is a conversation all about the state of the art in artificial intelligence, including some of the exciting technical breakthroughs and developments in AI that happened over the past year, and some of the interesting things we t...