BridgeMind AI's Claims of Claude Opus 4.6 Downgrade Face Criticism

BridgeMind AI's viral claim that Anthropic's Claude Opus 4.6 was secretly downgraded has sparked controversy. The post alleged a significant drop in the model's performance on the BridgeBench hallucination benchmark, with accuracy falling from 83.3% to 68.3%, a difference of 15 percentage points. However, critics, including computer scientist Paul Calcraft, have dismissed the claim as flawed, noting that the retest used a different set of tasks and that performance on the overlapping tasks showed only minor variance. The debate highlights broader frustration over a perceived decline in AI model quality. Since its launch, Claude Opus 4.6 has faced complaints about reduced reasoning depth and shorter responses, partly due to Anthropic's adaptive thinking controls. These controls prioritize efficiency over depth, which affects developers who rely on consistent performance. Anthropic had not commented on the specific claims as of April 13.


Disclaimer: The content provided on Phemex News is for informational purposes only. We do not guarantee the quality, accuracy, or completeness of the information sourced from third-party articles. The content on this page does not constitute financial or investment advice. We strongly encourage you to conduct your own research and consult with a qualified financial advisor before making any investment decisions.