Lun Wang, a former researcher at Google DeepMind, has sparked debate in the AI community by asserting that the industry's primary bottleneck is not computational power, data, or energy, but rather the evaluation system itself. In a detailed blog post published on May 17, 2026, Wang argues that current evaluation methods fail to predict when AI models will develop new capabilities, citing historical examples of emergent capabilities and grokking as evidence.
Wang's critique targets the assumption that each new AI model is merely an enhanced version of its predecessor, an assumption he claims undermines the industry's ability to foresee significant shifts in capability. He warns that without accurate evaluation metrics, the industry risks training models to solve the wrong problems, potentially leading to unforeseen failure modes. His argument challenges the industry's current focus on scaling and calls for a more robust evaluation framework to guide future AI development.
Former DeepMind Researcher Highlights Evaluation as AI's Core Bottleneck
