Inside the NVIDIA GTC 2025 Keynote – the Super Bowl of AI
The energy and anticipation were evident even before entering the arena; attendees had been lining up since 6 AM for the 10 AM keynote. At 8 AM, I found a spot in the line a few miles from the arena, and it felt like I was heading to a rock concert. Eventually, we made it inside and found seats close enough to the NVIDIA GTC25 keynote stage. Jensen Huang took the stage right on time and quickly captured the attention of every tech enthusiast in attendance.
He laid out NVIDIA’s latest AI advancements, clarifying critical technologies shaping enterprise strategy and infrastructure decisions.
In this blog, I’ll share my key insights from the event, focusing on what matters most to business and technology leaders today.
Understanding tokens as the language and currency of AI
The keynote started with an opening video (AV) asking, “How do tokens help us explore the universe?” It reframed my thinking about AI’s foundation.
Tokens are small data segments created by splitting larger inputs into manageable parts. Typically, one token equates to around four characters of English text or a 512×512 pixel section of an image.
Tokens are the language of AI: they transform images into scientific data and convert raw data into foresight. They help us decode the laws of physics and see a disease before it takes hold.
Tokens are also the currency of AI: going forward, we will measure an AI infrastructure’s capacity by its token throughput, i.e., the number of tokens it can generate and process.
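To make that framing tangible, here is a minimal sketch in Python. It assumes the rough four-characters-per-token rule of thumb mentioned above, and the workload numbers (users, tokens per response, response time) are entirely hypothetical; real tokenizers and workloads vary by model and use case.

```python
# Back-of-the-envelope token math. Assumes the ~4 characters-per-token rule of
# thumb; real tokenizers vary by model and language.

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Estimate how many tokens a piece of English text will consume."""
    return max(1, round(len(text) / chars_per_token))

def required_token_throughput(concurrent_users: int, tokens_per_response: int,
                              target_response_seconds: float) -> float:
    """Token throughput (tokens/sec) an AI factory must sustain for a workload."""
    return concurrent_users * tokens_per_response / target_response_seconds

prompt = "How do tokens help us explore the universe?"
print(estimate_tokens(prompt))                         # ~11 tokens
print(required_token_throughput(10_000, 8_000, 20.0))  # 4,000,000 tokens/sec
```

Measured this way, capacity planning becomes a question of how many tokens per second the infrastructure can sustain, which is exactly the framing the rest of the keynote built on.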
Now that we understand the role of tokens, let’s explore how enterprises scale the infrastructure that powers them.
Reimagining data centers as AI factories
This concept of tokens led to an interesting term used frequently throughout the keynote: “AI factories.” It refers to AI data centers that generate tokens and accelerate AI workloads, converting tokens from the language of AI into the currency of AI.
The faster tokens can be processed, the quicker AI models can learn and respond, resulting in more opportunities to generate revenue. With this foundational shift, Jensen also discussed how the industry is racing ahead through the next phase of AI maturity.
From generative AI to agentic AI to physical AI
Jensen gave us a glimpse of where we are in the AI journey. He framed the last five years of generative AI and the past two years of agentic AI as already “in the past,” which underscored the pace at which AI innovation happens. It also highlighted the gap between the high-tech companies running enterprise AI and everyone else.
Most are still catching up with generative AI and agentic AI. In contrast, high-tech companies are moving fast from generative AI (past) to agentic AI (present) to physical AI (future). With this evolution, the need for more accurate, reasoning-enabled responses has become even more critical.
Why reasoning in AI requires more power and precision
Jensen then explained how these innovations directly impact the “currency of AI”: the computational requirement has increased by 100x compared to the previous year. He demonstrated why simple instant GenAI responses, which typically used ~400 tokens but were frequently incorrect, have been replaced with reasoning-enabled responses that use roughly 20x more tokens but are highly accurate.
In other words, the original GenAI, which hallucinated frequently, has given way to new-age GenAI that is more intelligent and reliable: it breaks a query down into multiple steps and reasons through each one to arrive at an accurate answer. Naturally, this growing need for accuracy and speed demands more substantial infrastructure, and that is where NVIDIA’s GPU roadmap comes in.
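Before getting to the hardware, here is the keynote’s token math as a quick sketch. The 400-token baseline and the 20x multiplier are the keynote’s figures; the per-token price is a made-up placeholder, not a real rate.

```python
# Illustrative only: compares an "instant" answer to a reasoning-style answer.
# The 400-token baseline and 20x multiplier come from the keynote; the price
# per million tokens is a hypothetical placeholder.
INSTANT_TOKENS = 400
REASONING_MULTIPLIER = 20
PRICE_PER_MILLION_TOKENS = 2.00  # hypothetical $ per 1M tokens

reasoning_tokens = INSTANT_TOKENS * REASONING_MULTIPLIER  # 8,000 tokens
for label, tokens in (("Instant", INSTANT_TOKENS), ("Reasoning", reasoning_tokens)):
    cost = tokens / 1e6 * PRICE_PER_MILLION_TOKENS
    print(f"{label:<9} answer: {tokens:>6} tokens, ${cost:.4f}")
```

Multiply that per-query budget across every user an enterprise serves, and the case for bigger, faster infrastructure writes itself.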
Scaling up with Blackwell, Vera Rubin, and Rubin Ultra
Returning to “the data centers,” my apologies, “the AI factories”: Jensen discussed why scale-up is the preferred method over scale-out and laid out his three-year GPU roadmap, so we can better plan our budgets for scaling up.
A 100-megawatt AI factory built on H100 NVL8 with 45k GPUs generates 300M tokens. A GB200 NVL72 factory with 85k GPUs, however, generates 12,000M tokens, i.e., nearly 40x more, which makes his point about why scale-up is necessary. As per Jensen, “The more you buy, the more you make” (more revenue by generating more tokens).
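Here is that comparison reduced to simple arithmetic; the token figures are from the keynote slide, while the revenue rate is a hypothetical placeholder to illustrate the “more you buy, more you make” framing.

```python
# The keynote's 100 MW AI-factory comparison as simple arithmetic. Token counts
# are from the keynote slide; the revenue rate is a hypothetical placeholder.
FACTORIES = {
    "H100 NVL8 (45k GPUs)":   300e6,     # tokens generated
    "GB200 NVL72 (85k GPUs)": 12_000e6,  # tokens generated
}
REVENUE_PER_MILLION_TOKENS = 10.0        # hypothetical $

baseline = FACTORIES["H100 NVL8 (45k GPUs)"]
for name, tokens in FACTORIES.items():
    revenue = tokens / 1e6 * REVENUE_PER_MILLION_TOKENS
    print(f"{name}: {tokens / baseline:.0f}x tokens, ${revenue:,.0f}")
```

On the roadmap Jensen laid out, that gap only widens with each generation: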
Second half of 2025 (currently in full production)
Grace Blackwell Ultra (GB300) NVL72 with 1.1 EF FP4 Inference (1.5x computational power) and 20 TB HBM | 40 TB Fast Memory (1.5x memory) as compared to GB200.
Second half of 2026
Vera Rubin NVL144 with 3.6 EF FP4 Inference (3.3x computational power) and 75 TB Fast Memory (1.6x memory) compared to GB300 NVL72.
Second half of 2027
Rubin Ultra NVL576 has 15 EF FP4 Inference (14x computational power) and 365 TB Fast Memory (8x memory) compared to GB300 NVL72.
But as the infrastructure expands, so does the complexity of managing it. NVIDIA addressed this challenge head-on.
Introducing NVIDIA Dynamo for large-scale AI infrastructure
Considering the exponential growth in AI infrastructure, better management is urgently needed. To address this, NVIDIA has launched Dynamo, its operating system for AI, positioned much like VMware is for traditional computing.
Dynamo enables seamless scaling of inference workloads across large GPU fleets with dynamic resource scheduling, intelligent request routing, optimized memory management, and accelerated data transfer.
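To give a feel for what “intelligent request routing” means in practice, here is a toy sketch of the general idea: send each inference request to the least-loaded GPU worker. This is not Dynamo’s actual API; the classes and numbers below are invented purely for illustration.

```python
# Toy illustration of least-loaded request routing across GPU workers.
# NOT Dynamo's API; all names and numbers here are hypothetical.
from dataclasses import dataclass, field
from typing import List

@dataclass
class GpuWorker:
    name: str
    queued_tokens: int = 0  # pending work on this worker, measured in tokens

@dataclass
class Router:
    workers: List[GpuWorker] = field(default_factory=list)

    def route(self, request_tokens: int) -> str:
        """Assign a request to the worker with the least pending work."""
        worker = min(self.workers, key=lambda w: w.queued_tokens)
        worker.queued_tokens += request_tokens
        return worker.name

router = Router([GpuWorker("gpu-0"), GpuWorker("gpu-1"), GpuWorker("gpu-2")])
for tokens in (8_000, 400, 2_000, 8_000):
    print(router.route(tokens))  # gpu-0, gpu-1, gpu-2, gpu-1
```

The real system layers dynamic scheduling, memory management, and fast data transfer on top of this kind of routing, which is why Jensen framed it as an operating system for the AI factory rather than a single library.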
While Dynamo ensures the AI factory runs efficiently, Omniverse and Cosmos help simulate it before it’s even built.
Designing AI factories with Omniverse and Cosmos
You need digital twins before you build an AI factory. Omniverse and Cosmos let you simulate and design your AI factory in a virtual environment before anything is physically built.
The same environment simulation also accelerates autonomous-vehicle development through methods such as model distillation, closed-loop training, and synthetic data generation, drastically reducing training and learning time while improving safety.
Jensen also announced that General Motors has selected NVIDIA to partner in building its future self-driving cars. Then, Jensen shifted gears and offered a glimpse into how AI could eliminate the need for software as we know it.
Imagining a future without software
Jensen also discussed how the world is shifting from general-purpose, retrieval-based computing with handwritten software to generative AI-based computing with machine-learning software. He sees a future where computers are the token generators, producing content directly rather than humans producing it through software.
I am curious to see how it will disrupt the way we work. I can already visualize myself talking to an AI to generate my presentations, perform analysis, create spreadsheets, and get things done for me without using any software or tools. Does it sound like a science fiction movie? It is fiction for now, but not impossible.
Wouldn’t it be interesting if we didn’t have to use any software and instead, we interact with AI to generate content? I am in awe of his vision, even though I am unsure if my interpretation is what he meant. I would love to hear your interpretation of Jensen’s “generative AI-based computing.”
To enable this future across all businesses, NVIDIA introduced solutions designed to democratize access to AI infrastructure.
Democratizing AI with DGX Spark and DGX Station
To commoditize AI and data science, NVIDIA announced two products:
A palm-sized DGX Spark to replace the existing 3U, 134 lb DGX-1 system. DGX Spark will be 30-40 times cheaper than DGX-1 while maintaining its specifications, with 20 CPU cores, 128 GB of GPU memory, and 1 PF of compute. It’s the perfect “Christmas gift,” as per Jensen.
And a DGX Station with a GB300 Superchip, 72 CPU cores, 784 GB System Memory, and 20 PF Computation.
Beyond infrastructure, NVIDIA also discussed how AI is expanding far beyond cloud environments.
Expanding AI beyond the cloud and into the edge
Jensen also touched on several other interesting areas that this blog does not cover in detail:
- CUDA-X libraries for every industry to accelerate AI computing.
- NVIDIA Halos Chip-to-Deployment AV Safety System.
- AI started in the cloud because the CSPs already have the AI infrastructure. However, it will not be limited to the cloud and its services, and it has already expanded to other areas, including the edge.
- USD 100 billion of the world’s capital investment is in radio networks; hence, Cisco, NVIDIA, T-Mobile, and others are working on a radio network that puts AI onto the edge. 6G is expected to deliver something like 1 Tb/sec speeds with latency under 1 ms, which in turn will drive massive growth in AI on the edge.
- Spectrum-X is supercharged Ethernet for scaling GPU networks.
- Silicon photonics co-packaged optics networking switches to scale AI factories to millions of GPUs.
While AI moves closer to the edge, it’s also becoming more physical, with real-world applications in robotics.
The rise of physical AI and enterprise robotics
The last topic was “Physical AI and Robotics” and how robots can help with the worker shortage. Honestly, I don’t fully understand this problem, since the place I come from does not necessarily face it.
However, I see a bright future for it in our daily lives. Physical AI will give rise to the age of robots, and NVIDIA can help train them with NVIDIA Omniverse, Cosmos, and Isaac Lab. NVIDIA also introduced Isaac GR00T N1, its generalist foundation model for humanoid robots, inspired by principles of human cognitive processing.
I loved the way Jensen ended the keynote by introducing the Star Wars-inspired Blue robot (minus the technical glitch) built using Newton, an open-source, extensible physics engine jointly developed by NVIDIA, DeepMind, and Disney Research to support advancements in robotic learning and simulation. It gave everyone a much-needed smile after a mind-boggling, highly technical, number-crunching keynote!
My interaction with the general-purpose Isaac GR00T N1 humanoid during #GTC25
Key takeaways from the GTC 2025 keynote
- Tokens are the language and currency of AI, and the more you generate and process, the more revenue opportunities you will have.
- The computational requirement for AI has increased by 100x compared to last year, and reasoning AI uses ~20x more tokens than instant responses to make GenAI highly accurate.
- Scale up AI factories with faster GPUs. Blackwell GPU is in full production, and Vera Rubin and Rubin Ultra are planned for 2026 and 2027, respectively.
- Better manage AI Infrastructure with NVIDIA Dynamo, the operating system of AI.
- DGX Spark and DGX Station to commoditize AI and data science.
- NVIDIA Omniverse and Cosmos help you with real-world simulation and, together with Isaac Lab, will help you build robots.
- GenAI and agentic AI are the present, and physical AI is the future!