The newly unveiled STEP3-VL-10B model from StepFun AI is redefining what sub-10-billion-parameter models can do. Despite its relatively small size, the model delivers performance on par with much larger counterparts such as GLM-4.6V-106B and Qwen3-VL-235B. It achieved impressive benchmark scores, including 94.43% on AIME2025 for mathematical reasoning and 80.11% on MMMU for expert-level multimodal understanding.
STEP3-VL-10B incorporates a novel technique known as Parallel Coordinated Reasoning (PaCoRe) to scale test-time computation. The model's development involved a rigorous post-training process with over 1,000 iterations of reinforcement learning. It was trained on 1.2 trillion multimodal tokens, with emphasis on areas such as K-12 education, OCR, GUI understanding, and mathematical reasoning.
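StepFun has not published PaCoRe's internals in this article, but the general idea of scaling test-time computation by running several reasoning paths in parallel and coordinating their results can be sketched in the style of self-consistency sampling. The sketch below is purely illustrative: `solve_once` is a hypothetical stub standing in for a sampled model trace, and the majority-vote aggregation is an assumption, not StepFun's actual coordination mechanism.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def solve_once(prompt: str, seed: int) -> str:
    # Hypothetical stub: in a real system this would sample one
    # reasoning trace from the model at temperature > 0 (assumption).
    canned = {0: "42", 1: "42", 2: "41", 3: "42"}
    return canned[seed % 4]

def parallel_reasoning(prompt: str, n_paths: int = 8):
    # Launch n_paths independent reasoning traces in parallel.
    with ThreadPoolExecutor(max_workers=n_paths) as pool:
        traces = list(pool.map(lambda s: solve_once(prompt, s), range(n_paths)))
    # "Coordinate" the traces by majority vote over final answers,
    # returning the winning answer and its agreement rate.
    answer, votes = Counter(traces).most_common(1)[0]
    return answer, votes / n_paths

answer, agreement = parallel_reasoning("What is 6 * 7?")
print(answer, agreement)  # → 42 0.75
```

Spending more compute at inference this way trades latency and cost for accuracy, which is one common route for a 10B-parameter model to close the gap with much larger ones.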
STEP3-VL-10B Model Challenges Sub-10B Efficiency Limits
Disclaimer: The content provided on Phemex News is for informational purposes only. We do not guarantee the quality, accuracy, or completeness of the information sourced from third-party articles. The content on this page does not constitute financial or investment advice. We strongly encourage you to conduct your own research and consult with a qualified financial advisor before making any investment decisions.
