Grok 4.20 Beta scored 97% on the τ²-Bench evaluation, placing second. τ²-Bench extends Sierra's original τ-bench framework and is regarded as a demanding test. It evaluates how well an AI agent handles multi-turn conversations with a simulated user while calling tools and following domain policies, in customer-service settings such as airline, retail, and telecom. Grok 4.20 Beta's result indicates strong performance on these agentic tasks.