Breaking News
NVIDIA has introduced Nemotron 3 Super, a new open artificial intelligence model designed to power complex agent-based AI applications and large-scale enterprise workflows.
The model features 120 billion parameters with 12 billion active parameters during inference and is built to support advanced reasoning and task execution for autonomous AI agents. According to NVIDIA, the system is designed to help companies move beyond basic chatbots and deploy multi-agent AI applications capable of performing complex tasks.
The model includes a 1-million-token context window, allowing AI agents to retain large volumes of workflow information during long-running tasks and reducing the risk of “goal drift,” where AI systems lose track of their original objective.
Nemotron 3 Super is already being adopted by several AI companies and enterprise software providers. Search platform Perplexity is using the model as part of its search capabilities and multi-model orchestration system. AI coding platforms such as CodeRabbit, Factory and Greptile are integrating it into development agents to improve accuracy and reduce operational costs.
In the enterprise software sector, companies including Amdocs, Palantir, Cadence Design Systems, Dassault Systèmes and Siemens are deploying or customizing the model to automate workflows in sectors such as telecommunications, cybersecurity, semiconductor design and manufacturing.
Nemotron 3 Super uses a hybrid mixture-of-experts architecture that activates only a small portion of the model’s parameters during inference, improving efficiency while maintaining performance. NVIDIA said this design can deliver up to five times higher throughput and up to twice the accuracy compared with its previous Nemotron Super model.
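The sparse-activation idea behind a mixture-of-experts layer can be sketched in a few lines. This is an illustrative toy, not NVIDIA's architecture: a gating network scores a pool of experts and only the top-k run for a given input, so most parameters stay inactive on each forward pass.

```python
# Toy sketch of sparse mixture-of-experts routing (illustrative only;
# not NVIDIA's actual design). A gate scores every expert, but only the
# top-k experts compute, so most of the parameter pool sits idle.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # total experts (the full parameter pool)
TOP_K = 2         # experts activated per input (the "active" subset)
DIM = 4

# Each expert is a small weight matrix; the gate is a linear scorer.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
gate = rng.standard_normal((DIM, NUM_EXPERTS))

def moe_forward(x):
    scores = x @ gate                      # one score per expert
    top = np.argsort(scores)[-TOP_K:]      # indices of the top-k experts
    weights = np.exp(scores[top])
    weights /= weights.sum()               # softmax over the chosen experts
    # Only the selected experts compute; the rest are skipped entirely.
    out = sum(w * (x @ experts[i]) for w, i in zip(weights, top))
    return out, top

x = rng.standard_normal(DIM)
out, active = moe_forward(x)
print(f"active experts: {sorted(active.tolist())} of {NUM_EXPERTS}")
```

With 8 experts and k=2, only a quarter of the expert parameters participate in any one forward pass, which is the mechanism behind the 120-billion-total / 12-billion-active split described above.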
The system also incorporates technologies such as multi-token prediction, which allows the model to generate several words simultaneously to speed up inference.
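Where the speedup from multi-token prediction comes from is easy to see with arithmetic: if each forward pass emits k tokens instead of one, the number of sequential passes drops by roughly a factor of k. A minimal illustration (not NVIDIA's implementation):

```python
# Toy illustration of why multi-token prediction speeds up decoding:
# emitting k tokens per forward pass cuts the number of sequential
# passes needed to produce a sequence.
def passes_needed(total_tokens, tokens_per_pass):
    # Ceiling division: each pass yields up to `tokens_per_pass` tokens.
    return -(-total_tokens // tokens_per_pass)

seq_len = 256
print(passes_needed(seq_len, 1))  # one token per pass -> 256 passes
print(passes_needed(seq_len, 4))  # four tokens per pass -> 64 passes
```

In practice the gain is smaller than the ideal factor, since predicted tokens may need verification or correction, but the sequential-pass count is the bottleneck this technique targets.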
When deployed on NVIDIA’s Blackwell platform, the model operates in NVFP4 precision, reducing memory requirements and delivering inference speeds up to four times faster than previous architectures.

NVIDIA is releasing Nemotron 3 Super with open weights under a permissive license, enabling developers to deploy and customize the model across cloud platforms, data centers and local systems. The company also published the training methodology used to build the model, including more than 10 trillion tokens of training data and reinforcement learning environments.
The model can be deployed through several major cloud providers, including Google Cloud and Oracle, with support on Amazon Web Services and Microsoft Azure planned.
Nemotron 3 Super is available through NVIDIA’s developer platform and partner ecosystems including Hugging Face and OpenRouter, where developers and enterprises can integrate it into AI applications and multi-agent systems.
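Providers like OpenRouter typically expose models through an OpenAI-compatible chat-completions endpoint. A minimal sketch of what an integration request might look like follows; the model ID shown is an assumption for illustration, not a confirmed identifier, so check the provider's catalog for the real name. Only the request body is constructed here, to keep the sketch self-contained.

```python
# Hedged sketch of a chat-completions request body for an
# OpenAI-compatible endpoint such as OpenRouter's. The model ID below
# is hypothetical; look up the actual identifier in the provider catalog.
import json

payload = {
    "model": "nvidia/nemotron-3-super",  # hypothetical model ID
    "messages": [
        {"role": "system", "content": "You are a coding agent."},
        {"role": "user", "content": "Summarize this build log."},
    ],
}

# A real call would POST this body with an Authorization: Bearer header;
# shown here only as the serialized payload.
body = json.dumps(payload)
print(body[:40])
```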