Lun Wang, a former researcher at Google DeepMind, has sparked debate in the AI community by asserting that the industry's primary bottleneck is not computational power, data, or energy, but rather the evaluation system itself. In a detailed blog post published on May 17, 2026, Wang argues that current evaluation methods fail to predict when AI models will develop new capabilities, citing historical examples of emergent capabilities and grokking as evidence.
Wang's critique targets the assumption that each new AI model is merely an enhanced version of its predecessor, an assumption he says undermines the industry's ability to foresee significant shifts in AI capabilities. He warns that without accurate evaluation metrics, the AI industry risks training models to solve the wrong problems, potentially leading to unforeseen failure modes. Wang's argument challenges the industry's current focus on scaling and calls for a more robust evaluation framework to guide future AI development.
Former DeepMind Researcher Highlights Evaluation as AI's Core Bottleneck
