GLM-5.1 Tops Open-Source Models in Coding Agent Benchmark

GLM-5.1 is the top-ranked open-source model in the Artificial Analysis Coding Agent Benchmark, according to Artificial Analysis. The benchmark scores models on three tests, SWE-Bench-Pro-Hard-AA, Terminal-Bench v2, and SWE-Atlas-QnA, which simulate real-world programming and technical tasks. The proprietary Opus 4.7 model took first place overall, while GLM-5.1, running in Claude Code, ranked highest among open-source models in these coding agent scenarios.


