DeepMind's AI Co-Mathematician Surpasses GPT-5.5 Pro in Solving Complex Math Problems

Google DeepMind has unveiled AI Co-Mathematician, an interactive research workspace that has set a new benchmark in solving complex mathematical problems. The system achieved 47.9% accuracy on the challenging FrontierMath Tier 4 benchmark, solving 23 of 48 problems and surpassing the previous record of 39.6% held by GPT-5.5 Pro. Unlike its predecessors, AI Co-Mathematician is built not on a next-generation foundation model but on a multi-agent framework, using Gemini 3.1 Pro to coordinate tasks among specialized agents.

The system's architecture includes a "Project Coordinator" that delegates tasks to agents focused on literature retrieval, code generation, and reasoning, with all proofs vetted by a panel of "reviewer agents." This collaborative approach has enabled the system to solve problems that existing models could not. Notably, it assisted mathematician Marc Lackenby in resolving a long-standing conjecture from the Kourovka Notebook. AI Co-Mathematician is currently in limited internal testing, available only to a select group of mathematicians.
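The coordinator-plus-reviewers design described above resembles a common multi-agent orchestration pattern: a top-level coordinator fans a problem out to specialized worker agents, and a reviewer panel must approve each result before it is accepted. The following is a minimal, purely illustrative Python sketch of that pattern; all class and method names (`ProjectCoordinator`, `Agent`, `Reviewer`) are hypothetical and are not DeepMind's implementation.

```python
from dataclasses import dataclass

# Hypothetical sketch of a coordinator / worker / reviewer pipeline.
# None of these names come from DeepMind; this only illustrates the
# general orchestration pattern described in the article.

@dataclass
class Result:
    task: str
    output: str

class Agent:
    """A specialized worker (literature retrieval, code generation, reasoning)."""
    def __init__(self, role: str):
        self.role = role

    def run(self, task: str) -> Result:
        # A real agent would call an underlying model here; we return a stub.
        return Result(task, f"[{self.role}] handled: {task}")

class Reviewer:
    """A reviewer agent that vets a result before it is accepted."""
    def approve(self, result: Result) -> bool:
        # A real reviewer would check the proof; this stub accepts non-empty output.
        return bool(result.output)

class ProjectCoordinator:
    """Delegates a problem to specialized agents, then requires
    unanimous approval from the reviewer panel."""
    def __init__(self):
        self.agents = {
            "literature": Agent("literature-retrieval"),
            "code": Agent("code-generation"),
            "reasoning": Agent("reasoning"),
        }
        self.reviewers = [Reviewer(), Reviewer()]

    def solve(self, problem: str) -> list[Result]:
        results = [agent.run(problem) for agent in self.agents.values()]
        return [r for r in results
                if all(rev.approve(r) for rev in self.reviewers)]

if __name__ == "__main__":
    coordinator = ProjectCoordinator()
    for result in coordinator.solve("verify a conjectured bound"):
        print(result.output)
```

In practice, each `Agent.run` call would invoke a language model with a role-specific prompt, and the reviewer step would gate which intermediate results the coordinator is allowed to build on.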
