The most important thing today is NVIDIA’s GTC conference—basically an AI version of A Brief History of Humankind.

Jensen Huang hasn’t even stepped onto the stage yet, and the details leaked in advance are already enough to fill a book.

Wanwan has pulled together the key takeaways. Come on, friends, follow along.

  1. AI compute costs cut to one-tenth

The previous generation Blackwell was already impressive, right? Next up, the new-generation chip Vera Rubin will be ready for mass production.

What’s so powerful about Vera Rubin? To put it plainly, it comes down to one word: cheap.

Running the same AI model:
the number of chips drops to a quarter, and inference compute costs fall by 90%.
A 90% cut, friends.
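The two claimed ratios are easy to sanity-check side by side. A minimal sketch, where the baseline figures are made-up placeholders; only the ratios (a quarter of the chips, 90% lower cost) come from the article’s claims:

```python
# Hypothetical baseline for some fixed AI workload (placeholder numbers).
blackwell_chips = 100   # assumed chip count on the previous generation
blackwell_cost = 1.00   # normalized inference cost per run

# Apply the article's claimed ratios for Vera Rubin.
rubin_chips = blackwell_chips / 4    # "number of chips drops to a quarter"
rubin_cost = blackwell_cost * 0.10   # "inference compute costs fall by 90%"

print(rubin_chips, rubin_cost)  # 25.0 0.1
```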
AWS, Microsoft, and Google—the three biggest cloud providers—are all jumping on board with the first batch.

  2. Groq, acquired for $20 billion last year, turns in its homework today

Previously, on an earnings call, Jensen Huang said that Groq would be folded into NVIDIA’s ecosystem as an extension architecture, much as NVIDIA once acquired Mellanox to round out its networking capabilities.

Groq’s LPU and NVIDIA’s GPU sit in the same data center: GPUs understand the problem, while LPUs rapidly spit out the answers.

With the two chips dividing the labor, latency in agent scenarios drops sharply.

AI agents do work on a person’s behalf: a single task may require dozens of back-and-forth model calls. Each round burns inference compute while the user waits; if it’s even a bit slow, the experience falls apart.

Inference happens in two steps: first, understand your question; then output the answer word by word.

GPUs are strong at the first step, but for the speed and stability of outputting words in the second step, Groq’s LPU is stronger.
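The two-step split above can be sketched as a toy latency model. All throughput numbers here are invented for illustration; the only idea taken from the text is that the “understand the question” step (prefill) and the word-by-word output step (decode) have very different speed profiles:

```python
# Toy model: total response latency = prefill time + decode time.
def response_latency(prompt_tokens, output_tokens,
                     prefill_tok_per_s, decode_tok_per_s):
    prefill = prompt_tokens / prefill_tok_per_s   # "understand your question"
    decode = output_tokens / decode_tok_per_s     # "output the answer word by word"
    return prefill + decode

# A GPU-like profile: fast prefill, moderate decode (assumed figures).
gpu = response_latency(1000, 500, prefill_tok_per_s=20000, decode_tok_per_s=50)
# An LPU-like profile: same prefill, much faster decode (assumed figures).
lpu = response_latency(1000, 500, prefill_tok_per_s=20000, decode_tok_per_s=300)

print(round(gpu, 2), round(lpu, 2))  # 10.05 1.72
```

Even with these made-up numbers, the point is visible: decode dominates total latency, so speeding up the second step is where the user-facing win is.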

Is $20 billion expensive?

Think about it: in the future, every company will run hundreds of agents, and each agent will adjust models thousands of times a day.
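A back-of-envelope scale check for that sentence. Both figures are the article’s hypotheticals (“hundreds”, “thousands”), not measurements:

```python
# Placeholder values standing in for "hundreds" and "thousands".
agents_per_company = 300          # "hundreds of agents"
calls_per_agent_per_day = 2000    # "thousands of model adjustments a day"

daily_inference_calls = agents_per_company * calls_per_agent_per_day
print(daily_inference_calls)  # 600000
```

At hundreds of thousands of inference calls per company per day, even small per-call savings compound quickly.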

  3. NVIDIA launches its own version of OpenClaw, called NemoClaw

It’s a fully open-source platform: once enterprises install it, they can deploy AI employees to run workflows for humans, handle data, and manage projects.
It’s said to already be in talks with Salesforce and Adobe.

What’s interesting is that NemoClaw doesn’t require you to use NVIDIA chips.
Think about the logic here:
selling chips earns money only at the hardware layer; setting the rules earns money across the whole chain. Jensen Huang clearly has this figured out.

  4. Jensen Huang says he’ll showcase “a chip the world has never seen before”

Most likely, it’ll be the next-next-generation architecture Feynman making its first appearance, with mass production in 2028 using TSMC’s most advanced 1.6nm process.

And there’s one more niche item I think is pretty interesting.

NVIDIA has also released two laptop processors aimed at gaming.
The graphics-card company is coming for the CPU makers’ lunch.

Wanwan has a feeling that Jensen Huang will go down as one of the great figures of this era.
