He Co-Invented the Transformer. Now: Continuous Thought Machines - Llion Jones and Luke Darlow [Sakana AI]
Machine Learning Street Talk (MLST)
2025/11/23
The current trajectory of AI development, dominated by Transformer-based models, may be limiting the field's potential for achieving genuine reasoning capabilities. In this discussion, pioneers Llion Jones and Luke Darlow challenge the status quo, arguing that the industry’s reliance on a single architecture risks overlooking more biologically grounded and adaptive forms of intelligence.
The conversation critiques the dominance of Transformers, suggesting the field is trapped in a local minimum where scaling masks fundamental flaws in representation and reasoning. Using the 'spiral' analogy, the speakers illustrate how models mimic patterns without understanding them. They introduce the Continuous Thought Machine (CTM), a biologically inspired architecture that enables step-by-step reasoning, adaptive computation time, and natural backtracking—behaviors absent in standard LLMs. Unlike fixed-step models, CTM allocates thinking time dynamically, mimicking human-like problem solving. The model’s design supports uncertainty awareness and iterative correction, demonstrated through maze-solving and Sudoku Bench, a new benchmark emphasizing meta-reasoning and real-time strategy adaptation. The discussion advocates for research freedom and long-term exploration over benchmark chasing, positioning CTM as a path toward more authentic machine cognition.
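The adaptive computation time described here can be illustrated with a toy sketch. This is not Sakana AI's actual CTM implementation; the `think`, `softmax`, and `entropy` helpers, the tick-wise sharpening update, and the certainty threshold are all hypothetical stand-ins for the model's internal recurrent dynamics. The point is only the control flow: the model keeps "ticking" internally and halts once its output distribution is confident enough, so ambiguous inputs naturally consume more ticks than easy ones.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Shannon entropy (nats); low entropy means a confident prediction.
    return -sum(p * math.log(p) for p in probs if p > 0)

def think(logits, certainty_threshold=0.2, max_ticks=50):
    """Run internal 'ticks' until the prediction is certain enough.

    The multiplicative sharpening below is a toy stand-in for one step
    of recurrent internal computation; it is NOT the CTM's update rule.
    Returns (ticks_used, final_probs).
    """
    for tick in range(1, max_ticks + 1):
        logits = [x * 1.1 for x in logits]  # hypothetical per-tick update
        probs = softmax(logits)
        if entropy(probs) < certainty_threshold:
            return tick, probs  # halt early: confident enough
    return max_ticks, softmax(logits)  # hit the computation budget

easy_ticks, _ = think([2.0, 0.1, 0.1])    # clear-cut input: halts quickly
hard_ticks, _ = think([0.5, 0.45, 0.4])   # ambiguous input: thinks longer
```

Under this toy dynamic, the ambiguous input takes several times as many ticks as the clear-cut one, mirroring the discussion's point that fixed-step architectures cannot reallocate computation this way.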
04:13
Research freedom is essential for breakthrough innovation in AI.
17:30
Current AI shows jagged intelligence, revealing fundamental flaws in its architecture.
22:24
Scaling masks AI's inability to truly understand patterns like spirals.
35:36
No trained model can predict 100–200 steps down the maze path in one shot.
36:01
CTM naturally allocates more time to difficult problems without forced computation limits.
57:45
The model shows leapfrogging behavior when its thinking time is constrained.
1:04:22
Progress on Sudoku Bench would signify meaningful AI advancement.