NVIDIA has released Nemotron 3 Nano Omni, an open-source multimodal model built for efficient inference. Based on a 30B-A3B Mixture of Experts architecture, the model supports a 256K-token context length and handles video, audio, image, and text inputs in one unified model. It achieves up to 9x higher throughput than comparable models, which lowers inference cost and improves scalability. Nemotron 3 Nano Omni is available now on Hugging Face, OpenRouter, and NVIDIA NIM. Aible, Applied Scientific Intelligence, and H Company are among the companies that have already adopted it.
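
To make the efficiency figures concrete, here is a rough back-of-the-envelope sketch. It assumes the usual reading of the "30B-A3B" label (about 30 billion total parameters, about 3 billion active per token) and assumes the same hardware cost per hour for the model and its baseline. The function names are illustrative and not part of any NVIDIA API, and the 9x figure is the best case quoted above, not a guaranteed result.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of parameters used per token in a Mixture of Experts model."""
    return active_params_b / total_params_b


def relative_cost_per_token(throughput_multiplier: float) -> float:
    """Cost per token relative to a baseline, assuming equal hardware cost per hour."""
    return 1.0 / throughput_multiplier


# Assumption: "30B-A3B" means ~30B total parameters with ~3B active per token.
fraction = active_fraction(30.0, 3.0)

# Assumption: the best-case "up to 9x" throughput figure applies.
cost = relative_cost_per_token(9.0)

print(f"active share per token: {fraction:.0%}")                   # 10%
print(f"relative cost per token at 9x throughput: {cost:.1%}")     # 11.1%
```

Under these assumptions, only about a tenth of the weights take part in each token's computation, and a 9x throughput gain would bring the cost per token down to roughly one ninth of the baseline. Real-world savings depend on the workload, batch size, and hardware.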