scripod.com

Cerebras IPO

Semi Doped
This podcast discusses the recent Cerebras IPO, which saw its stock price surge nearly 70% on the first day. The conversation explores the unique technology behind Cerebras' wafer-scale chip, its engineering challenges, and its market position in the AI inference space.
The hosts explain that Cerebras keeps an entire silicon wafer intact as a single, massive chip, unlike traditional chips that are diced into smaller pieces. This design provides 44 GB of on-chip SRAM and immense memory bandwidth, enabling fast LLM inference for models that fit within that memory. However, the 23 kW power draw and thermal expansion of the wafer present significant cooling and alignment challenges. The discussion contrasts Cerebras' success with the failed attempt by Trilogy Systems in the 1980s to build a similar wafer-scale chip. Cerebras has evolved from supercomputing to focus on inference, where its low latency is a key advantage for applications like coding and trading. The hosts also analyze the OpenAI deal, noting that Cerebras is paid for tokens rather than hardware, and highlight the risks of its novel supply chain as it scales to compete with Nvidia and Groq in the 'Wild West' of AI inference.
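As a rough illustration of what "models that fit within that memory" means, the sketch below checks whether a model's weights alone fit in the 44 GB of on-chip SRAM cited above. The function name, the example model sizes, and the bytes-per-parameter values are assumptions chosen for illustration, not figures from the episode. The check also ignores the KV cache, activations, and any memory reserved by the system, so treat it as a back-of-the-envelope estimate.

```python
# 44 GB of on-chip SRAM, the figure quoted in the summary (decimal gigabytes assumed).
SRAM_BYTES = 44 * 10**9


def weights_fit_on_wafer(num_params: int, bytes_per_param: float,
                         sram_bytes: int = SRAM_BYTES) -> bool:
    """Return True if the model weights alone fit in on-chip SRAM.

    Hypothetical helper: ignores KV cache, activations, and runtime overhead.
    """
    return num_params * bytes_per_param <= sram_bytes


if __name__ == "__main__":
    # Hypothetical model sizes, stored at 2 bytes per parameter (16-bit weights).
    for params in (8 * 10**9, 70 * 10**9):
        size_gb = params * 2 / 10**9
        fits = weights_fit_on_wafer(params, 2)
        print(f"{params / 10**9:.0f}B params at 16-bit: {size_gb:.0f} GB, fits: {fits}")
```

Under these assumptions, a model of a few billion parameters fits on one wafer, while a much larger one would have to be quantized or split across several wafers.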
00:00
Each Cerebras wafer consumes 23 kilowatts of power
00:20
Small steps lead to big changes
05:23
They route around defective cores to achieve ~900,000 working ones.
10:32
23 kW power draw requires vertical connectors across hundreds of points
18:12
Small models on a single wafer achieve unmatched token speeds
29:09
Cerebras succeeded roughly 40 years after Trilogy Systems' failed wafer-scale attempt.
34:48
Low latency is key for coding and trading.
47:17
The AI inference market is a 'Wild West'.