DeepSeek V4 and Meituan LongCat 2.0 Break Trillion-Parameter Mark

DeepSeek V4 and Meituan LongCat 2.0, both released in late April, have each passed the trillion-parameter mark and support context windows of up to 1 million tokens. DeepSeek V4 has moved from NVIDIA's CUDA to Huawei's Ascend platform and uses a hybrid attention architecture to reduce inference costs. Meituan's LongCat 2.0 was trained entirely on domestic computing power, using 50,000 to 60,000 domestic chips. Both releases reflect a growing reliance on homegrown hardware and software among Chinese AI companies. They also point to the engineering challenges of building at this scale, such as optimizing memory usage and keeping large-scale GPU clusters stable. As China continues to build out its AI infrastructure, these efforts feed into the country's broader technological ambitions.
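
To give a rough sense of why memory is a central constraint at this scale, the sketch below estimates the storage needed for the weights alone. It is a back-of-the-envelope illustration, not a description of either model: the exact parameter count (taken here as 1e12), the dense storage layout, the precisions listed, and the 64 GB per-device memory are all assumptions, and the function names are hypothetical.

```python
# Back-of-the-envelope weight memory for a model at the trillion-parameter mark.
# Assumptions (not from the article): exactly 1e12 parameters, dense storage,
# and a hypothetical accelerator with 64 GB of memory per device.

BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}


def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Return the memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def min_devices(num_params: int, bytes_per_param: int, device_gb: float) -> int:
    """Smallest device count whose combined memory can hold the weights."""
    total_bytes = num_params * bytes_per_param
    device_bytes = int(device_gb * 1e9)
    return -(-total_bytes // device_bytes)  # ceiling division on integers


if __name__ == "__main__":
    num_params = 10**12      # assumed: one trillion parameters
    device_gb = 64           # assumed: hypothetical per-device memory
    for name, size in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(num_params, size)
        n = min_devices(num_params, size, device_gb)
        print(f"{name}: {gb:.0f} GB of weights, at least {n} devices")
```

Under these assumptions, the weights alone would occupy about 2 TB at 16-bit precision, before counting activations, optimizer state, or the key-value cache that a 1-million-token context requires. Real deployments need considerably more headroom than this lower bound suggests.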