scripod.com

Dylan Patel — Deep dive on the 3 big bottlenecks to scaling AI compute

Dwarkesh Podcast
Dylan Patel of SemiAnalysis unpacks the fundamental physical and economic constraints shaping the future of AI compute scaling.
The podcast identifies three core bottlenecks: logic (driven by EUV lithography scarcity, with ASML becoming the top constraint by 2030), memory (a severe HBM/DRAM crunch pushing memory to ~33% of big-tech CapEx, with supply relief delayed until late 2027–2028), and power (not a US bottleneck due to rapid innovation in generation, modular data centers, and behind-the-meter solutions). NVIDIA’s early TSMC allocation secured its dominance, while Google lagged in planning. China is advancing indigenously—projected to match or exceed Western output by 2035—but currently faces a widening compute gap that amplifies US AI leadership. Older fabs can’t substitute for EUV without major efficiency losses, and space-based GPUs remain infeasible this decade. Meanwhile, hedge funds increasingly trade on infrastructure constraints, and geopolitical risk—especially Taiwan’s centrality to advanced chipmaking—poses a systemic threat to global AI progress.
14:22
GPT-5.4 is cheaper to run, has fewer active parameters, and is higher quality than GPT-4
30:15
Anthropic saw AI compute demand first and acted before Google realized its own shortfall
34:34
ASML's EUV tools are the biggest bottleneck towards 2030, as there's no more capacity to shift from mobile and PC industries to AI
1:03:54
As advanced packages scale up, constraints like networking, memory, and cooling emerge—not just transistor density
1:11:03
US labs like OpenAI and Anthropic are scaling compute capacity rapidly, while China is not
1:40:16
Silicon Valley should consider paying deposits for future EUV tool purchase rights to arbitrage AI compute capacity
1:47:27
Half of new power capacity for AI by decade-end may be behind-the-meter, despite higher costs
2:12:21
Smaller models enable faster RL feedback loops and increase compute efficiency
2:16:23
Dylan Patel is noted for his conviction on AGI takeoff and successful trades in memory companies
2:20:53
Huawei, if it had access to 3nm, could potentially build a better accelerator than NVIDIA
2:29:33
Replicating TSMC's capacity elsewhere would take so long that it would yield near-zero incremental ability to add compute