DeepMind Founder Interview: AGI Architecture, Agent Status, and Scientific Breakthroughs in the Next Decade

Original video title: Demis Hassabis: Agents, AGI & The Next Big Scientific Breakthrough

Original source: Y Combinator
Original compilation: 深潮 TechFlow

Editor’s Introduction

Google DeepMind CEO and Nobel Prize in Chemistry winner Demis Hassabis visits Y Combinator to discuss key advancements toward AGI, advice for entrepreneurs on how to stay ahead, and where the next major scientific breakthrough might occur.

A practical takeaway for deep tech entrepreneurs: if you start a ten-year deep tech project today, you must factor the emergence of AGI into your planning. He also revealed that Isomorphic Labs (DeepMind’s biotech AI spin-off) will have major news soon.

Key Quotes

Roadmap and Timeline for AGI

· “Almost certainly, these existing technological components will become part of the final AGI architecture.”

· “Problems like continual learning, long-term reasoning, and certain aspects of memory haven’t been solved yet; AGI needs to get all of these right.”

· “If your AGI timeline is around 2030, like mine, and you start a deep tech project today, you must consider that AGI might appear midway.”

Memory and Context Windows

· “The context window roughly corresponds to working memory. Humans have an average of about seven items in working memory, while we have hundreds of thousands or even millions of tokens in our context window. But the problem is, we stuff everything in—including unimportant or incorrect information, which is quite crude.”

· “Processing real-time video streams and storing all tokens would mean a million tokens are only enough for about 20 minutes.”

Limitations of Reasoning

· “I like to test Gemini by playing chess. Sometimes it realizes it’s making a bad move but can’t find a better one, so it circles around and ends up making that bad move. A precise reasoning system shouldn’t have this kind of issue.”

· “On one hand, it can solve IMO gold medal-level problems; on the other, ask it differently and it makes elementary math mistakes. It seems to lack something in self-reflection on its own thinking process.”

Agent and Creativity

· “To achieve AGI, you need a system that can proactively solve problems for you. Agents are the way forward, and I think we’re just getting started.”

· “I haven’t seen anyone use vibe coding to create a top-ranked AAA game. With current effort, it should be possible, but it hasn’t happened yet. That suggests something is missing in tools or workflows.”

Distillation and Small Models

· “Our hypothesis is that within six months to a year of release, a cutting-edge Pro model’s capabilities can be compressed into a very small model that runs on edge devices. We haven’t yet hit the theoretical information-density limit.”

Scientific Discovery and the “Einstein Test”

· “Sometimes I call it the ‘Einstein Test’—can you train a system with knowledge from 1901 and have it independently derive Einstein’s 1905 results, including special relativity? If it can, these systems are not far from inventing truly new things.”

· “Solving a Millennium Prize problem is already impressive. But even harder is proposing a new set of Millennium Prize problems that top mathematicians consider equally profound and worth a lifetime of research.”

Deep Tech Entrepreneurship Advice

· “Chasing hard problems and chasing easy problems take similar amounts of effort; only the approach differs. Life is short, so better to focus your energy on things that no one else will do if you don’t.”

Pathways to AGI

Garry Tan: You’ve been thinking about AGI longer than almost anyone. Based on current paradigms, how much of the final AGI architecture do you think we already have? What is fundamentally missing right now?

Demis Hassabis: Large-scale pretraining, RLHF, chain-of-thought—I’m quite sure these will be part of the final AGI architecture. These techniques have proven themselves too thoroughly by now. I can’t imagine that in two years we’ll find they were dead ends—that doesn’t make sense to me. But on top of what we have, maybe one or two more things are needed. Continual learning, long-term reasoning, certain aspects of memory—some problems remain unsolved.

AGI needs all of these fully solved. Maybe existing techniques plus incremental innovation can be extended that far, but one or two critical breakthroughs may still be needed; I don’t think it’s more than that. My personal estimate puts the probability that such unresolved key breakthroughs remain at about fifty-fifty. So at DeepMind, we’re pushing both lines.

Garry Tan: I deal with many agent systems, and what shocks me most is that the underlying weights are often the same across different runs. So the concept of continual learning is very interesting because right now we’re basically patching things together with tape, like those “dream cycle” ideas.

Demis Hassabis: Exactly, those dream cycles are pretty cool. We’ve thought about this in the context of integrating episodic memory. My PhD research was on how the hippocampus elegantly integrates new knowledge into existing schemas. The brain does this extremely well.

It does this during sleep, especially during REM sleep, replaying important experiences to learn from them. Our earliest Atari program, DQN (DeepMind’s 2013 deep Q-network that first used deep reinforcement learning to reach human-level performance on Atari games), mastered Atari by using experience replay.

That concept came from neuroscience: repeatedly replaying successful trajectories. That was 2013, quite ancient in AI terms, but it was crucial at the time.
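The experience-replay idea behind DQN can be sketched in a few lines. This is a toy buffer for illustration, not DeepMind’s actual implementation:

```python
import random
from collections import deque

class ReplayBuffer:
    """Toy experience-replay buffer in the spirit of DQN (2013).

    Stores (state, action, reward, next_state, done) transitions and
    samples uniformly at random, which breaks the temporal correlation
    of consecutive experiences during training.
    """
    def __init__(self, capacity: int):
        self.buffer = deque(maxlen=capacity)  # oldest transitions fall off

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        # Uniform random sampling over stored transitions.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=1000)
for t in range(10):
    buf.push(t, 0, 1.0, t + 1, False)
batch = buf.sample(4)  # 4 random past transitions to learn from
```

In a full DQN loop, batches like this would be used to update the Q-network, so the agent keeps learning from important past experiences rather than only the most recent one.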

I agree with you; right now, we’re basically patching things together with tape—stuffing everything into the context window. It doesn’t feel quite right. Even if we’re talking about machines rather than biological brains, theoretically, we could have a million or ten million tokens in context, and perfect memory, but retrieval costs still exist. Finding truly relevant information at the moment of specific decision-making isn’t easy, even if you can store everything. So I believe there’s huge room for innovation in memory systems.

Garry Tan: Honestly, a million-token context window is already bigger than I expected and can do a lot.

Demis Hassabis: Yes, for most use cases, it’s enough. But think of the context window as roughly equivalent to working memory. Humans have an average of about seven items, while we have hundreds of thousands or even millions of tokens in our context window. The problem is, we stuff everything in—including unimportant or incorrect info, which is quite crude. And if you process real-time video streams and naively record all tokens, a million tokens only last about 20 minutes. But if you want the system to understand your life over a month or two, that’s still far from enough.
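The video-token arithmetic above can be checked with a quick back-of-the-envelope calculation. The tokens-per-frame and frame-rate figures below are illustrative assumptions chosen to match the stated 20-minute budget, not Gemini’s actual tokenizer numbers:

```python
# Rough budget check: how long does a 1M-token context last on raw video?
# Assumed figures (illustrative, not Gemini's real numbers):
TOKENS_PER_FRAME = 830    # tokens to encode one sampled video frame
FRAMES_PER_SECOND = 1     # sampling one frame per second

def context_minutes(context_tokens: int) -> float:
    """Minutes of video a context window can hold at the assumed rates."""
    tokens_per_minute = TOKENS_PER_FRAME * FRAMES_PER_SECOND * 60
    return context_tokens / tokens_per_minute

print(round(context_minutes(1_000_000)))  # prints 20
```

Even at this coarse sampling rate, a month of waking hours would need on the order of a billion tokens, which is why naive token storage does not scale to life-long context.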

Garry Tan: DeepMind has always invested heavily in reinforcement learning and search. How deeply is this philosophy embedded in your current development of Gemini? Is RL still underestimated?

Demis Hassabis: It might indeed be underestimated. Attention to RL has fluctuated over time. From day one at DeepMind, we’ve been working on agent systems. All the work on Atari and AlphaGo essentially belongs to reinforcement learning agents—systems that can autonomously achieve goals, make decisions, and plan. Of course, we started in games because the complexity was manageable, then gradually moved to more complex games, like AlphaStar after AlphaGo, covering most of the games we could.

The next question is whether these models can generalize into world models or language models, not just game models. We’ve been working on this for years. Today’s leading models and chain-of-thought reasoning are essentially a re-derivation of what AlphaGo pioneered.

I think much of what we did back then is highly relevant today. We’re revisiting those old ideas, scaling them up, making them more general, including Monte Carlo tree search and other reinforcement learning methods. The ideas behind AlphaGo and AlphaZero are extremely related to foundational models today, and I believe much of the progress in the next few years will come from here.
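The selection rule at the heart of Monte Carlo tree search trades off exploiting moves that have scored well against exploring moves tried less often. A minimal sketch using the generic UCB1 formula (AlphaGo’s actual rule is a PUCT variant that also weighs policy-network priors):

```python
import math

def ucb_score(child_value_sum: float, child_visits: int,
              parent_visits: int, c: float = 1.4) -> float:
    """UCB1 score: mean value plus an exploration bonus that shrinks
    as a child node is visited more often."""
    if child_visits == 0:
        return float("inf")  # always try unvisited children first
    exploit = child_value_sum / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore

# During tree descent, pick the child with the highest score.
children = [(10.0, 20), (3.0, 4), (0.0, 0)]  # (value_sum, visits) per child
best = max(range(len(children)),
           key=lambda i: ucb_score(*children[i], parent_visits=24))
```

Here the unvisited third child is selected first; once every child has statistics, the bonus term steers search toward moves that are promising but under-explored.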

Distillation and Small Models

Garry Tan: To be smarter now, you need bigger models, but at the same time, distillation techniques are improving, making small models quite fast. Your Flash models are very strong, reaching about 95% of the state-of-the-art, but at only a tenth of the cost. Is that right?

Demis Hassabis: I think that’s one of our core advantages. You need to build the largest models first to get cutting-edge capabilities. One of our biggest strengths is quickly distilling and compressing those capabilities into smaller models. We invented the distillation approach ourselves, and we’re still among the best in the world. Plus, we have strong business incentives to do so. We’re probably the largest AI application platform globally.

With AI Overviews, AI Mode, and Gemini, every Google product—Maps, YouTube, etc.—is integrating Gemini or related tech. This involves billions of users and products serving hundreds of millions or billions of users. They need to be extremely fast, efficient, low-cost, and low-latency. This drives us to optimize Flash and smaller Flash-Lite models to be highly efficient, ultimately serving various user needs.

Garry Tan: I’m curious how smart these small models can get. Is there an upper limit to distillation? Can 50B or 400B models be as smart as today’s largest frontier models?

Demis Hassabis: I don’t think we’ve hit the information-theoretic limit yet—at least, no one knows if we have. Maybe someday we’ll reach a density ceiling, but our current assumption is that after a Pro model is released, its capabilities can be compressed into a very small model that runs on edge devices within six months to a year.

You can see this in Gemma models; our Gemma 4 performs very strongly at similar sizes. This relies heavily on distillation and efficiency optimization techniques. So I really see no fundamental theoretical limit yet—I believe we’re still far from that.
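Distillation in the sense discussed here typically means training the small model to match the large model’s softened output distribution. A minimal sketch of the soft-label loss in plain Python; the temperature and logits are illustrative, and real pipelines compute this over batches with a framework like JAX or PyTorch:

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature; higher temperature softens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student's softened predictions against the
    teacher's softened targets (the soft-label part of distillation)."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(t, s))

# A student whose logits match the teacher's incurs a lower loss
# than one that disagrees.
teacher = [4.0, 1.0, 0.5]
loss_match = distillation_loss(teacher, [4.0, 1.0, 0.5])
loss_off = distillation_loss(teacher, [0.5, 1.0, 4.0])
```

The softened targets carry more information than hard labels (how wrong each alternative is, not just which answer is right), which is part of why a small student can recover so much of the teacher’s capability.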

Garry Tan: There’s a very astonishing phenomenon now: engineers are doing 500 to 1,000 times the work they did six months ago. Some people here are doing the equivalent of what a Google engineer in the 2000s did, but a thousand times more. Steve Yegge mentioned this.

Demis Hassabis: I find it exciting. Small models have many uses. One is cost and speed—faster iteration, better collaboration. Even if they’re not the absolute cutting-edge, say 90-95%, that’s enough
