A Technical History of Generative Media
Latent Space: The AI Engineer Podcast
2025/09/05
Shownote
Today we are joined by Gorkem and Batuhan from Fal.ai, the fastest-growing generative media inference provider. They recently raised a $125M Series C and crossed $100M ARR. We covered how they pivoted from dbt pipelines to diffusion model inference, which models really changed the trajectory of image generation, and the future of AI video. Enjoy!
00:00 - Introductions
04:58 - History of Major AI Models and Their Impact on Fal.ai
07:06 - Pivoting to Generative Media and Strategic Business Decisions
10:46 - Technical discussion on CUDA optimization and kernel development
12:42 - Inference Engine Architecture and Kernel Reusability
14:59 - Performance Gains and Latency Trade-offs
15:50 - Discussion of model latency importance and performance optimization
17:56 - Importance of Latency and User Engagement
18:46 - Impact of Open Source Model Releases and Competitive Advantage
19:00 - Partnerships with closed source model developers
20:06 - Collaborations with Closed-Source Model Providers
21:28 - Serving Audio Models and Infrastructure Scalability
22:29 - Serverless GPU infrastructure and technical stack
23:52 - GPU Prioritization: H100s and Blackwell Optimization
25:00 - Discussion on ASICs vs. General Purpose GPUs
26:10 - Architectural Trends: MMDiTs and Model Innovation
27:35 - Rise and Decline of Distillation and Consistency Models
28:15 - Draft Mode and Streaming in Image Generation Workflows
29:46 - Generative Video Models and the Role of Latency
30:14 - Auto-Regressive Image Models and Industry Reactions
31:35 - Discussion of OpenAI's Sora and competition in video generation
34:44 - World Models and Creative Applications in Games and Movies
35:27 - Video Models’ Revenue Share and Open-Source Contributions
36:40 - Rise of Chinese Labs and Partnerships
38:03 - Top Trending Models on Hugging Face and ByteDance's Role
39:29 - Monetization Strategies for Open Models
40:48 - Usage Distribution and Model Turnover on FAL
42:11 - Revenue Share vs. Open Model Usage Optimization
42:47 - Moderation and NSFW Content on the Platform
44:03 - Advertising as a key use case for generative media
45:37 - Generative Video in Startup Marketing and Virality
46:56 - LoRA Usage and Fine-Tuning Popularity
47:17 - LoRA ecosystem and fine-tuning discussion
49:25 - Post-Training of Video Models and Future of Fine-Tuning
50:21 - ComfyUI Pipelines and Workflow Complexity
52:31 - Requests for startups and future opportunities in the space
53:33 - Data Collection and RedPajama-Style Initiatives for Media Models
53:46 - RL for Image and Video Models: Unknown Potential
55:11 - Requests for Models: Editing and Conversational Video Models
57:12 - Veo 3 Capabilities: Lip Sync, TTS, and Timing
58:23 - Bitter Lesson and the Future of Model Workflows
58:44 - FAL's hiring approach and team structure
59:29 - Team Structure and Scaling Applied ML and Performance Teams
1:01:41 - Developer Experience Tools and Low-Code/No-Code Integration
1:03:04 - Improving Hiring Process with Public Challenges and Benchmarks
1:04:02 - Closing Remarks and Culture at FAL
Highlights
In this episode, we hear from Gorkem and Batuhan of Fal.ai, a leading generative media inference platform that has rapidly scaled to serve 2 million developers, host 350 models, and reach $100M ARR, and that recently raised a $125M Series C. The conversation centers on their technical evolution, strategic pivots, and vision for the future of AI-generated images and video.
Transcript
Alessio: Hey everyone, welcome to the Latent Space podcast. This is Alessio, founder of Kernel Labs, and I'm joined by swyx, founder of Smol AI.
swyx: Hello, hello. Today, we're so excited to be in the studio with Gorkem and Batuhan of Fal.ai. Welcome.
...