The Future of Audio AI: Insights from Mati Staniszewski of ElevenLabs

Shownote

In this episode of AI Native Dev, host Guy Podjarny sits down with Mati Staniszewski, the visionary CEO and co-founder of ElevenLabs, a leader in AI audio technology. Mati shares the origin story of ElevenLabs, detailing how a frustration with subpar dubb...

Highlights

In this episode, Guy Podjarny welcomes Mati Staniszewski, CEO and co-founder of ElevenLabs, a company transforming the landscape of AI audio technology. Mati recounts how a simple frustration with poor movie dubbing in his native Poland led to the creation of a powerful platform that now serves millions of users worldwide. The conversation explores the technical evolution of ElevenLabs, the challenges of building emotionally expressive AI voices, and the company’s vision for the future of audio AI.
00:00
ElevenLabs raised $180 million and serves millions of users, who have generated a thousand years of audio.
14:44
ElevenLabs introduced text-to-speech and voice cloning, with the model itself deciding the components of a voice.
32:35
Conversational AI is seen as the future of digital interactions, with potential in healthcare, 911 training, and customer support.
35:25
ElevenLabs simultaneously releases interfaces and APIs, serving as a horizontal layer for audio AI while offering end-to-end solutions in conversational AI and media entertainment.
45:52
Real-time cross-language communication is the holy grail for audio AI applications.

Chapters

From Frustration to Innovation: The Birth of ElevenLabs
00:00
Building Emotion into AI Voices: Technical Breakthroughs
14:44
Research Meets Real-World Use: Balancing Innovation and Practicality
25:17
Serving Diverse Industries: Platform Design and Strategic Applications
35:25
Voice of the Future: ElevenLabs’ Vision and Cultural Blueprint
43:19

Transcript

Mati Staniszewski: What you would try to do is effectively map a set of characteristics of that voice in the model. So, try to predict what's the gender of the voice, what's the potential age group of the voice, what's the emotionality of the voice. And yo...