
Information Theory for Language Models: Jack Morris

In this episode, we sit down with Jack Morris, a PhD student at Cornell Tech whose research focuses on the information-theoretic foundations of large language models. Unlike many of his peers who focus on trending topics like AI agents or benchmarking, Jack delves into the deeper mechanics of how models store and process information. His work spans embeddings, model inversion, and the surprising role of datasets in driving AI innovation. This conversation offers a unique window into some of the most underappreciated yet critical aspects of modern AI research.
Jack Morris discusses his research journey and key contributions to understanding large language models from an information-theoretic perspective. He explores how information is stored and compressed within models, highlighting that GPT-style architectures store around 3.6 bits per parameter. The conversation also covers embedding inversion, where text can be reconstructed from vector representations with high accuracy, revealing the surprising richness of embedded information. Jack introduces the idea that major AI breakthroughs often stem from new datasets rather than novel methods, citing examples like BERT and transformers. He also touches on model universality, drawing parallels to computer vision techniques like CycleGAN, and emphasizes the importance of adapting to evolving AI education and engineering practices.
10:54
Mojo, developed by Chris Lattner, is positioned as a faster alternative to CUDA.
22:25
Training a model on a text and measuring the resulting drop in the text's code length quantifies how much information the model stores (see the capacity sketch after this list).
27:49
Iterative refinement of embedding inversion ultimately recovered the original text with 97% accuracy (the inversion loop is sketched after this list).
47:34
Gemma 3n enables stackable and swappable capabilities in language models.
53:04
GPT-style models store around 3.6–3.9 bits of information per parameter (a back-of-envelope capacity calculation follows this list).
1:06:49
In AI, paradigm shifts often come from unlocking new data to train on, not just from new methods.
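
The capacity measurement at 22:25 can be made concrete: train or fine-tune a model on a fixed text, then treat the drop in the text's code length (its negative log2-likelihood) relative to an untrained baseline as the number of bits the model has absorbed. The sketch below is a minimal illustration of that idea, not Jack's exact protocol; the gpt2 checkpoints and the local fine-tuned path are assumptions.

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def total_bits(model, tokenizer, text: str) -> float:
    """Code length of `text` under `model`, in bits (negative log2-likelihood)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # `loss` is the mean cross-entropy per predicted token, in nats.
        loss = model(ids, labels=ids).loss.item()
    n_predicted = ids.shape[1] - 1
    return loss * n_predicted / math.log(2)

# Assumed setup: a baseline checkpoint and the same model after training on `sample.txt`.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
baseline = AutoModelForCausalLM.from_pretrained("gpt2")
trained = AutoModelForCausalLM.from_pretrained("./gpt2-trained-on-sample")  # hypothetical local path

sample = open("sample.txt").read()  # the text the trained model saw
bits_stored = total_bits(baseline, tokenizer, sample) - total_bits(trained, tokenizer, sample)
n_params = sum(p.numel() for p in trained.parameters())
print(f"~{bits_stored:.0f} bits stored, {bits_stored / n_params:.3f} bits per parameter")
```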
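
The 97% figure at 27:49 comes from closing a feedback loop: guess a text, re-embed the guess, compare it with the target embedding, and let a trained corrector propose a better guess. Below is a minimal sketch of that loop; `embed_fn` and the `corrector_model` with its `initial_guess`/`refine` methods are hypothetical stand-ins for a real inversion model, not an actual API.

```python
import torch

def invert_embedding(target_emb, embed_fn, corrector_model, max_rounds=50, tol=1e-4):
    """Recover text whose embedding is close to `target_emb` by iterative correction."""
    # The first hypothesis is generated from the target embedding alone.
    hypothesis = corrector_model.initial_guess(target_emb)
    for _ in range(max_rounds):
        hyp_emb = embed_fn(hypothesis)                   # re-embed the current hypothesis
        if torch.norm(target_emb - hyp_emb) < tol:
            break                                        # close enough: stop refining
        # Propose a new hypothesis conditioned on the old text and both embeddings.
        hypothesis = corrector_model.refine(hypothesis, hyp_emb, target_emb)
    return hypothesis
```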
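
Finally, the 3.6–3.9 bits-per-parameter estimate at 53:04 turns into a quick back-of-envelope capacity figure; the parameter count below is chosen purely for illustration.

```python
BITS_PER_PARAM = 3.6           # lower end of the 3.6–3.9 estimate
n_params = 1_000_000_000       # illustrative 1B-parameter model

capacity_bits = BITS_PER_PARAM * n_params
capacity_megabytes = capacity_bits / 8 / 1e6
print(f"~{capacity_bits:.2e} bits, about {capacity_megabytes:.0f} MB of raw storage capacity")
```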