Nvidia just dropped a heavyweight contender in the agentic AI race. The company's new Nemotron 3 Super - a 120-billion-parameter open model with 12 billion active parameters - delivers five times the throughput of previous-generation models for autonomous AI systems, marking a significant leap in efficiency for enterprises running complex agent workflows. The launch signals Nvidia's aggressive push into the exploding agentic AI market, where autonomous systems are rapidly replacing traditional scripted automation across industries.
Nvidia is making its biggest bet yet on agentic AI. The chipmaker launched Nemotron 3 Super today, an open large language model engineered specifically to power autonomous agent systems at enterprise scale. The numbers tell the story - five times higher throughput than previous-generation models, achieved through a mixture-of-experts architecture that activates just 12 billion of the model's 120 billion total parameters for any given token.
The timing couldn't be more strategic. As enterprises pivot from simple chatbots to complex multi-agent systems that can reason, plan, and execute tasks autonomously, infrastructure bottlenecks have become the limiting factor. Nvidia's approach tackles this head-on by optimizing for the specific workload patterns of agentic AI - long reasoning chains, tool use, and multi-step task execution that traditional models struggle to handle efficiently.
Perplexity, the AI-powered search startup, is the first partner out of the gate. The company is offering its users access to Nemotron 3 Super for complex research and information synthesis tasks where autonomous agents need to orchestrate multiple queries and synthesize results. It's a smart proving ground - search demands exactly the kind of multi-step reasoning and accuracy that agentic systems promise.
What makes Nemotron 3 Super different comes down to architecture. The mixture-of-experts design means only a fraction of the model's parameters activate for any given task, dramatically reducing computational overhead while maintaining the reasoning depth of a much larger model. For enterprises running hundreds or thousands of autonomous agents simultaneously, that efficiency translates directly to cost savings and faster response times.
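The efficiency math behind that claim is straightforward. A minimal sketch below illustrates the general mixture-of-experts idea - a gating function picks the top-scoring experts per token, so only their parameters run - using toy expert counts and sizes chosen purely for illustration; this is not Nvidia's implementation, and the routing function is a stand-in.

```python
# Illustrative mixture-of-experts routing sketch (toy numbers, not
# Nvidia's actual architecture): only the selected experts' parameters
# participate in each token's forward pass.
import random

NUM_EXPERTS = 10          # hypothetical number of experts
PARAMS_PER_EXPERT = 12    # toy size in "billions"; 10 x 12 = 120 total
TOP_K = 1                 # experts activated per token

def route(score_fn, num_experts=NUM_EXPERTS, k=TOP_K):
    """Return the k experts with the highest gating scores for a token."""
    scores = [score_fn(e) for e in range(num_experts)]
    ranked = sorted(range(num_experts), key=lambda e: scores[e], reverse=True)
    return ranked[:k]

# For any single token, only the chosen experts' parameters are active:
chosen = route(lambda e: random.random())
active = len(chosen) * PARAMS_PER_EXPERT
total = NUM_EXPERTS * PARAMS_PER_EXPERT
print(f"active {active}B of {total}B parameters ({active / total:.0%})")
```

With these toy numbers, each token touches 12B of 120B parameters, about 10% of the model - the same active-to-total ratio the Nemotron 3 Super figures imply, which is where the per-token compute savings come from.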