Cerebras Systems is serving Moonshot AI's Kimi K2.6 model at 981 output tokens per second, 6.7 times faster than the next-best GPU cloud provider. Artificial Analysis independently verified the figure. Kimi K2.6, a 1-trillion-parameter Mixture-of-Experts model released on April 20, 2026, features multimodal and agentic capabilities.
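To put the throughput figures in perspective, the short Python sketch below works out what a 6.7x gap implies. Only the 981 tokens per second and the 6.7x ratio come from the report; the baseline throughput is derived from them rather than reported, and the 2,000-token response length is a hypothetical chosen for illustration. The calculation covers token generation time only, not time to first token or network overhead.

```python
# Back-of-the-envelope arithmetic from the reported figures.
# Assumption: the 6.7x ratio applies directly to output tokens per second.

def implied_baseline_tps(fast_tps: float, speedup: float) -> float:
    """Throughput of the slower provider implied by a speedup ratio."""
    return fast_tps / speedup


def generation_seconds(num_tokens: int, tps: float) -> float:
    """Seconds needed to emit num_tokens at a steady output rate."""
    return num_tokens / tps


CEREBRAS_TPS = 981.0    # reported output tokens per second
SPEEDUP = 6.7           # reported ratio over the next-best GPU cloud provider
RESPONSE_TOKENS = 2000  # hypothetical response length, for illustration only

baseline_tps = implied_baseline_tps(CEREBRAS_TPS, SPEEDUP)

print(f"Implied next-best GPU throughput: {baseline_tps:.1f} tokens/s")
print(f"Cerebras: {generation_seconds(RESPONSE_TOKENS, CEREBRAS_TPS):.2f} s "
      f"for {RESPONSE_TOKENS} tokens")
print(f"Next-best GPU: {generation_seconds(RESPONSE_TOKENS, baseline_tps):.2f} s "
      f"for {RESPONSE_TOKENS} tokens")
```

Under these assumptions the next-best GPU provider would be producing roughly 146 tokens per second, so a 2,000-token answer takes about 2 seconds on Cerebras against roughly 14 seconds on the GPU endpoint.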
The Cerebras-hosted deployment also delivered a 29x improvement in end-to-end latency on a representative coding workload compared with the official Kimi endpoint. The result highlights the capabilities of Cerebras's Wafer-Scale Engine, which offers over 200 times the bandwidth of NVIDIA's NVLink. Cerebras, which went public in May 2026 at a $95 billion valuation, is demonstrating that its hardware can serve large AI models efficiently.
Cerebras Systems Outpaces GPU Cloud with 981 Tokens/Second on Kimi K2.6 Model
