Claude Opus 4.8 is here. Is it as good as they say?

How I AI

May 28

In this episode, Claire Vo shares her early hands-on experience with Anthropic's newly released Claude Opus 4.8 model. She walks through a series of real-world tests in coding, design, and business strategy, offering an unfiltered look at where the model shines and where it falls short.

Claire found Opus 4.8 impressive for greenfield prototypes and one-shot feature implementation: it autonomously coded a working prototype in about 20 minutes. However, it consistently struggled with the 'last 10%' of tasks (edge cases, finishing touches, and bug hunting), where it often hallucinated. On existing codebases, it needed multiple cycles to fix bugs, and it lacked ambition in creative coding tasks such as building a game. In business strategy tests, Opus 4.7 outperformed 4.8 by using data in context and providing grounded insights, while 4.8 over-rotated on small data points and produced hand-wavy roadmaps. Claire recommends Opus 4.8 for fast, one-shot tasks and new projects but advises caution with existing codebases and strategic work. The episode also highlights new features such as dynamic workflows and effort control.