Alibaba's PAI team has released AgenticQwen, a lightweight agentic language model designed for industrial-grade tool invocation, and has open-sourced it in 8B and 30B-A3B versions. Trained with a novel "dual data flywheel" reinforcement learning framework, the model reaches capabilities close to those of much larger models at lower inference cost. The two flywheels improve performance by generating complex decision-making scenarios and by learning from the model's own errors.
AgenticQwen-8B scores an average of 47.4 on benchmarks such as TAU-2 and BFCL-V4, outperforming the base Qwen3-8B and approaching Qwen3-235B. The 30B-A3B version, which activates only 3B parameters, scores 50.2. Despite these results, the model's 40K context length remains a limitation in deep search tasks. The model is already in use in Alibaba's internal systems, where it offers improved performance with shorter inference times.
Alibaba's PAI Team Open-Sources AgenticQwen Model with Dual Data Flywheel
