How Braintrust uses AI agents, evals, and CI to ship better software | Ankur Goyal
How I AI
23 HOURS AGO
Shownote
In this episode, I sit down with Ankur Goyal, founder and CEO of Braintrust, the AI evals and observability platform used by teams like Notion, Stripe, Vercel, and Zapier. This one is for the senior engineers, staff engineers, VPs of engineering, and CTOs ...
Highlights
This podcast explores how AI agents are transforming the work of senior engineers and technical leaders, moving beyond simple code generation to tackle complex infrastructure and architecture problems. The conversation demystifies the concept of evals, presenting them as a modern, quantifiable version of a product requirements document that allows models to determine the 'how' while engineers focus on the 'what'. The discussion also provides a practical framework for deciding which tasks to delegate to AI agents and how to capture subjective human taste into repeatable, scalable evaluation systems.
Chapters
00:00 Introduction to Ankur Goyal
03:00 Using AI agents for database optimization
06:10 Running exhaustive benchmarks with coding agents
09:03 Why staff engineers are wrong about AI limitations
11:30 The “agent line” framework for delegation
14:00 Ankur’s workflow: running 4 to 6 concurrent agents
17:16 Technical setup: foreground agents, background agents, and cloud environments
20:32 Spending time with AI tools
23:06 Demystifying evals
26:02 Live demo: Building an eval for documentation answers
30:20 The alternative to evals: vibe checks and whack-a-mole
32:09 Capturing designer taste in scoring functions
33:13 Quick recap
33:44 Managing velocity and throughput
35:40 Why CI/CD investment is critical for AI-accelerated teams
37:30 Ankur’s prompting strategy when agents fail
39:10 Closing thoughts and how to connect
Transcript
Claire Vo: And still, as I say, the year of our Claude, 2026, I still talk to engineers that say AI on our most complicated things cannot do a good job.
Ankur Goyal: I so viscerally disagree with it. There's no staff engineer who is running as many rigorou...