[State of RL/Reasoning] IMO/IOI Gold, OpenAI o3/GPT-5, and Cursor Composer — Ashvin Nair, Cursor

Show Notes

From Berkeley robotics and OpenAI's 2017 Dota-era internship to shipping RL breakthroughs on GPT-4o, o1, and o3, and now leading model development at Cursor, Ashvin Nair has done it all. We caught up with Ashvin at NeurIPS 2025 to dig into the inside story...

Highlights

In this conversation, Ashvin Nair traces his evolution from robotics to the forefront of language model development, offering a candid look at the shifting landscape of AI research and deployment. He reflects on pivotal moments in his career and the broader industry, revealing how expectations, methodologies, and organizational structures have adapted—or failed to adapt—to rapid technological change.
00:00
Robotics people are more grounded than other AI researchers.
14:27
OpenAI has abandoned the 'one model fits all' approach as of this year.
17:08
The current reasoning paradigm in AI may stem from organizational misalignment rather than technical limits.
29:32
Human-level intelligence might be reached around 2030.
39:23
Learning from a small number of deployment tokens doesn't overload a model trained on trillions.
42:10
"Why is off-policy RL unstable?" is suggested as a strong interview question.

Chapters

From robots to language models: what changed in AI’s trajectory?
00:00
Why did reinforcement learning fall short—and what’s different now?
09:09
How leadership turmoil shaped OpenAI’s technical path
17:08
The quiet comeback of reinforcement learning in 2023
21:58
How Cursor builds smarter models through real-world feedback
32:17
Are neural networks processors or storage? And does it even matter?
42:10

Transcript

Speaker 1: Okay, we're here at NeurIPS. We're recording a special Latent Space coverage of the folks at NeurIPS. And we're here with Ashvin Nair from Cursor. Welcome.
Speaker 2: Hi. Yeah, thanks for having me.
Speaker 1: So, I guess the, like, Ashvin Nai...