A Chinese AI enthusiast, APFrisco, successfully ran Moonshot AI's Kimi K2.5, a trillion-parameter model, on a single Nvidia RTX 3060 GPU paired with 768 GB of Intel Optane Persistent Memory. Despite the mid-range GPU, the setup generated approximately four tokens per second, showing that unconventional hardware configurations can host a model of this scale.
Kimi K2.5 is a Mixture-of-Experts model that activates only 32 billion of its parameters per token, which keeps the per-token workload within reach of consumer-grade hardware. The model's full size is about 630 GB, with quantized versions at 381 GB, far more than a consumer GPU's onboard memory can hold. The weights therefore had to sit in system memory, and Optane was used because it is more cost-effective than traditional DRAM at that capacity. Because Kimi K2.5 is open-weight, the demonstration also shows that enthusiasts can experiment with large-scale AI models without enterprise infrastructure.
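The figures above can be checked with some quick arithmetic. The following is a rough sketch, not a description of how the enthusiast sized the setup: it assumes "trillion-parameter" means exactly 1e12 parameters and that 1 GB is 1e9 bytes, and the variable and function names are illustrative only.

```python
# Figures quoted in the article. Treating "trillion" as exactly 1e12
# and GB as 1e9 bytes are assumptions made for this estimate.
TOTAL_PARAMS = 1e12     # total parameters (assumed exact)
ACTIVE_PARAMS = 32e9    # parameters activated per token
FULL_SIZE_GB = 630      # full model size
QUANT_SIZE_GB = 381     # quantized model size
OPTANE_GB = 768         # Optane Persistent Memory capacity


def bits_per_param(size_gb, params=TOTAL_PARAMS):
    """Average storage per parameter in bits, for a model of size_gb."""
    return size_gb * 1e9 * 8 / params


# Share of the model that is active for any single token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Average bits stored per parameter at each size.
full_bits = bits_per_param(FULL_SIZE_GB)
quant_bits = bits_per_param(QUANT_SIZE_GB)

# Capacity left in Optane after loading the quantized weights.
headroom_gb = OPTANE_GB - QUANT_SIZE_GB

print(f"active share per token: {active_fraction:.1%}")
print(f"full model:      ~{full_bits:.2f} bits per parameter")
print(f"quantized model: ~{quant_bits:.2f} bits per parameter")
print(f"Optane headroom with quantized weights: {headroom_gb} GB")
```

Under these assumptions, only about 3% of the parameters are active per token, and the quantized weights occupy roughly half of the 768 GB of Optane memory.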
AI Enthusiast Runs 1 Trillion-Parameter Model on RTX 3060 with Optane Memory
