Claude Mythos Preview Excels in AI Cybersecurity Simulation

The UK AI Security Institute (AISI) has announced that Claude Mythos Preview achieved a 73% success rate on expert-level Capture The Flag (CTF) cybersecurity tasks, a feat no AI model had accomplished before April 2025. Mythos Preview also became the first AI model to fully complete "The Last Ones" (TLO), a 32-step simulated enterprise network attack scenario, doing so in 3 of 10 attempts. Across all attempts it completed an average of 22 steps, compared with an average of 16 for Claude Opus 4.6. AISI ran the tests under controlled conditions and emphasized that the environment had no active defenders or defensive tools and did not penalize actions that triggered security alerts. Because this setup differs from real-world networks, Mythos Preview's ability to breach well-protected systems remains unverified. AISI said that security evaluation methodologies need to improve and that it plans future tests in environments with active defense and real-time response.


Disclaimer: The content provided on Phemex News is for informational purposes only. We do not guarantee the quality, accuracy, or completeness of the information sourced from third-party articles. The content on this page does not constitute financial or investment advice. We strongly encourage you to conduct your own research and consult with a qualified financial advisor before making any investment decisions.