SAM 3: The Eyes for AI — Nikhila & Pengchuan (Meta Superintelligence), ft. Joseph Nelson (Roboflow)
Latent Space: The AI Engineer Podcast
2025/12/18
Shownote
As with all demo-heavy and especially vision AI podcasts, we encourage watching along on our YouTube (and tossing us an upvote/subscribe if you like!)
From SAM 1's 11-million-image data engine to SAM 2's memory-based video tracking, MSL’s Segment Anything...
Highlights
The latest evolution in computer vision has arrived with SAM 3, a model that transforms how machines understand visual content through natural language. In this discussion, leading researchers and practitioners explore how this technology achieves unprecedented precision and speed in identifying and tracking objects across images and video—without relying on traditional annotation methods.
Chapters
00:00 What makes SAM 3 a game-changer in real-time visual understanding?
11:45 How does teaching AI to recognize 200,000+ everyday concepts change segmentation forever?
26:42 Why is combining SAM 3 with large language models the key to smarter vision?
39:01 How did AI cut annotation time from minutes to seconds, and what’s still hard?
54:39 Should vision models do everything themselves, or work with tools?
1:00:30 Can SAM 3 help robots see and reason like humans?
1:11:45 What tools are needed to turn breakthrough research into real-world impact?
Transcript
Joseph Nelson: Okay, we're here in the remote studio with the grand return of the Roboflow and Latent Space and SAM combo. Welcome to Joseph, my sort of vision co-host, I guess. Thanks. Great to be here. Welcome back. We also have, welcome back, Nikhila Ra...