A Chinese AI enthusiast, APFrisco, ran Moonshot AI's Kimi K2.5, a trillion-parameter model, on a single Nvidia RTX 3060 GPU paired with 768 GB of Intel Optane Persistent Memory. Despite the mid-range GPU, the setup reached approximately four tokens per second. Kimi K2.5 is a Mixture-of-Experts model that activates only 32 billion parameters per token, so the work done for each token is a small fraction of what the full parameter count suggests, which is what makes a consumer-grade GPU workable here. The full model is about 630 GB and the quantized version is 381 GB, far more than the GPU's own memory can hold, so the weights have to sit in system memory. Optane was used for that role because it costs less per gigabyte than traditional DRAM at this capacity. Because Kimi K2.5 is open-weight, enthusiasts can experiment with a model of this scale without enterprise infrastructure, provided they are willing to use unconventional hardware configurations.
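The figures above can be checked with some quick arithmetic. The following is a rough sketch, not a measurement of APFrisco's setup. It assumes the parameter count is exactly 10^12, that 1 GB means 10^9 bytes, that quantization is uniform across all parameters, and that every active weight is read from memory once per token. The function name `estimate` and its result keys are made up for illustration.

```python
# Back-of-the-envelope estimates from the figures quoted in the article.
# Assumptions (not from the source): exactly 1e12 parameters, 1 GB = 1e9 bytes,
# uniform quantization, and all active weights read from memory once per token.

TOTAL_PARAMS = 1e12       # "trillion-parameter" model, rounded
ACTIVE_PARAMS = 32e9      # parameters activated per token
FULL_SIZE_GB = 630        # full model size
QUANT_SIZE_GB = 381       # quantized model size
OPTANE_GB = 768           # Optane Persistent Memory capacity
TOKENS_PER_SEC = 4        # reported generation speed (approximate)


def estimate():
    """Return rough derived quantities as a dict (hypothetical helper)."""
    active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
    # Average bits stored per parameter in the quantized checkpoint.
    bits_per_param = QUANT_SIZE_GB * 1e9 * 8 / TOTAL_PARAMS
    # Share of the quantized weights touched for one token.
    active_gb_per_token = QUANT_SIZE_GB * active_fraction
    # Upper-bound read traffic if nothing is cached on the GPU.
    read_gb_per_sec = active_gb_per_token * TOKENS_PER_SEC
    return {
        "active_fraction": active_fraction,
        "bits_per_param": bits_per_param,
        "active_gb_per_token": active_gb_per_token,
        "read_gb_per_sec": read_gb_per_sec,
        "quantized_fits": QUANT_SIZE_GB <= OPTANE_GB,
        "full_fits": FULL_SIZE_GB <= OPTANE_GB,
        "headroom_gb": OPTANE_GB - QUANT_SIZE_GB,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        if isinstance(value, bool):
            print(f"{name}: {value}")
        else:
            print(f"{name}: {value:.3f}")
```

Under these assumptions, about 3.2% of the parameters are active per token and the quantized checkpoint averages roughly 3 bits per parameter. Each token touches about 12 GB of weights, so four tokens per second implies read traffic on the order of 49 GB/s at most. The real figure is likely lower, since any layers kept resident on the GPU do not need to be fetched from Optane for every token.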