NVIDIA Unveils Nemotron 3 Nano Omni, Claiming Up to 9x Higher Inference Throughput

NVIDIA has launched Nemotron 3 Nano Omni, an open-source multimodal model aimed at more efficient inference. Built on a 30B-A3B Mixture-of-Experts (MoE) architecture, the model supports a 256K context length and handles video, audio, image, and text inputs within a single model. NVIDIA reports up to 9x higher throughput than comparable models, which would lower inference costs and make large deployments easier to scale. Nemotron 3 Nano Omni is now available on platforms including Hugging Face, OpenRouter, and NVIDIA NIM. Companies such as Aible, Applied Scientific Intelligence, and H Company have already adopted the model, an early sign of industry interest.
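To make the efficiency claim concrete, the following is a rough back-of-the-envelope sketch, not NVIDIA's methodology. It assumes the common naming convention in which "30B-A3B" means about 30 billion total parameters with about 3 billion active per token, and it uses hypothetical GPU price and baseline throughput figures chosen only for illustration.

```python
# Back-of-the-envelope arithmetic only. The GPU price and baseline
# throughput below are hypothetical and do not come from the article.

TOTAL_PARAMS_B = 30.0   # "30B" in the model name (assumed: total parameters, billions)
ACTIVE_PARAMS_B = 3.0   # "A3B" in the model name (assumed: parameters active per token)
SPEEDUP = 9.0           # "up to 9x" throughput figure quoted in the article


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's parameters used for each token in an MoE model."""
    return active_b / total_b


def cost_per_million_tokens(gpu_cost_per_hour: float, tokens_per_second: float) -> float:
    """Serving cost per one million tokens at a given sustained throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000


gpu_cost_per_hour = 2.0   # hypothetical price in dollars per GPU hour
baseline_tps = 1000.0     # hypothetical baseline throughput in tokens per second

baseline_cost = cost_per_million_tokens(gpu_cost_per_hour, baseline_tps)
improved_cost = cost_per_million_tokens(gpu_cost_per_hour, baseline_tps * SPEEDUP)

print(f"Active parameter share: {active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B):.0%}")
print(f"Baseline cost per 1M tokens: ${baseline_cost:.4f}")
print(f"Cost per 1M tokens at 9x throughput: ${improved_cost:.4f}")
```

On the same hardware, cost per token falls in direct proportion to throughput, so a 9x throughput gain corresponds to roughly one ninth of the per-token cost at best. Since the article says "up to" 9x, real savings would depend on the workload.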


Disclaimer: The content provided on Phemex News is for informational purposes only. We do not guarantee the quality, accuracy, or completeness of the information sourced from third-party articles. The content on this page does not constitute financial or investment advice. We strongly encourage you to conduct your own research and consult with a qualified financial advisor before making any investment decisions.