Microsoft has open-sourced its Lens series, a 3.8-billion-parameter text-to-image foundation model with strong training efficiency and performance. Lens requires only 19.3% of the computational resources used by Alibaba's Z-Image, thanks to optimizations in both data and architecture. The training dataset, Lens-800M, includes 800 million image-text pairs generated by GPT-4.1, with an average prompt length of 109 words.
The Lens series features three weight variants for different deployment needs, with the Lens-Turbo variant achieving ultra-fast inference, generating 1024x1024 images in just 0.84 seconds. The model supports resolutions up to 1440x1440 and various aspect ratios. Microsoft has made the model weights available on Hugging Face under the MIT license, with inference code hosted on GitHub, facilitating access for developers and researchers.
Microsoft Open-Sources Lens, a 3.8B Parameter Text-to-Image Model
