Memory and Continual Learning: Engram's Dan Biderman and Jessy Lin

Training Data

1 day ago

Shownote

Dan Biderman and Jessy Lin, co-founders of Engram, are building a neolab around memory and continual learning, which they call two sides of the same coin. Their contrarian premise: instead of stuffing ever-larger prompts into the context window or bolting ...

Highlights

This podcast explores a contrarian approach to AI, focusing on memory and continual learning as the key to making models truly useful for specific teams and companies. Instead of relying on ever-larger context windows or retrieval systems, the discussion centers on baking knowledge directly into a model's weights, allowing it to learn and improve over time like a seasoned employee.

00:00

Memory and continual learning are two sides of the same coin

08:17

Internalizing facts enables deeper reasoning

10:55

Deep learning merged knowledge and processing, but they are separating again

18:49

Human memory's fuzzy representations inspire new AI approaches

21:28

Scaling compute on novel contexts over emergent capabilities

27:06

The key challenge is not storage but knowing how to query the right information

Chapters

Memory and Learning: Two Sides of the Same Coin

00:00

The 100x Token Advantage: Fine-Tuning for Efficiency

05:45

The Great Divide: Factual Knowledge vs. Algorithmic Processing

10:55

Why Frontier Labs Aren't Solving the Memory Problem

16:04

Is Memory an Emergent Property or a Distinct Component?

21:28

The Unsolved Problem: What to Internalize vs. What to Retrieve

27:06

Transcript

Jessy Lin: What about pre-training, or even post-training, makes it possible for the models to generalize in these magical, emergent ways, and controlling that process so that a company has a set of private data? How do we make the models learn that just as...