Thinking Machines Model Matches GPT-Realtime-2 in Audio Benchmark
Thinking Machines' TML-Interaction-Small model has tied with OpenAI's GPT-Realtime-2 (xHigh) for the top spot on Scale Labs' Audio MC S2S leaderboard, with an APR score of 43.36 (about 43.4%). GPT-Realtime-2 (xHigh) posted a higher absolute score of 48.45, but the difference falls within statistical error margins, so both models are ranked co-first.
The leaderboard's second tier features the standard GPT-Realtime-2 with a score of 37.61, followed by Gemini 3.1 Flash Live with thinking mode enabled at 36.06, and the older GPT-Realtime-1.5. Among full-duplex models, Scale Labs highlighted TML-Interaction-Small for its rare long-context awareness and fast conversational response times.
Disclaimer: The content provided on Phemex News is for informational purposes only. We do not guarantee the quality, accuracy, or completeness of the information sourced from third-party articles. The content on this page does not constitute financial or investment advice. We strongly encourage you to conduct your own research and consult with a qualified financial advisor before making any investment decisions.
