Thinking Machines' TML-Interaction-Small model has tied with OpenAI's GPT-Realtime-2 (xHigh) for the top spot on Scale Labs' Audio MC S2S leaderboard. TML-Interaction-Small achieved an APR score of 43.4%. Although GPT-Realtime-2 (xHigh) has the higher absolute score, 48.45 against TML-Interaction-Small's 43.36, the difference falls within the statistical margin of error, so the two models are ranked joint first. The leaderboard's second tier features the standard GPT-Realtime-2 at 37.61, followed by Gemini 3.1 Flash Live with thinking mode enabled at 36.06 and the older GPT-Realtime-1.5. Scale Labs singled out TML-Interaction-Small among full-duplex models for its rare long-context awareness and fast conversational response times.