Leading AI models, including GPT-5.5, struggled on Vals AI's new Finance Agent v2 benchmark, which simulates the workflow of a junior financial analyst. Across the benchmark's 927 expert-reviewed questions, GPT-5.5 posted the top accuracy of just 51.76%, slightly ahead of Claude Opus 4.7 and Claude Sonnet 4.6. The benchmark requires models to autonomously locate the relevant information in lengthy financial reports and then perform complex calculations on it. Under strict scoring standards the results were weaker still: every leading model scored below 40%, and scores in the most challenging categories fell as low as 23%. Although models have improved at basic retrieval tasks, the results indicate that AI remains far from replacing human analysts in high-precision financial work.