Alibaba's Qwen team has introduced Qwen-Image-Bench, an open-source benchmark designed to evaluate the text-to-image capabilities of large models. Accompanying this release is Q-Judger, a visual judge model trained on Qwen3.6-27B, which assesses models across five dimensions: image quality, aesthetics, text-image alignment, real-world fidelity, and creative generation. The benchmark includes 1,000 bilingual prompts and evaluates models on 56 detailed metrics.
Initial evaluations show GPT Image 2 leading with a composite score of 64.69 and excelling in all five categories. Other top performers include Nano Banana 2.0 and GPT Image 1.5. Alibaba's Qwen Image 2.0 Pro ranks fifth. The evaluation also highlights common challenges in AI image generation, such as rendering human hand anatomy and representing physical laws accurately.
Alibaba Launches Qwen-Image-Bench for Evaluating Text-to-Image Models
