Microsoft has open-sourced Lens, a series of 3.8-billion-parameter text-to-image foundation models noted for its training efficiency and performance. Lens reportedly requires only 19.3% of the computational resources used by Alibaba's Z-Image, a saving attributed to optimizations in both data and architecture. The training dataset, Lens-800M, contains 800 million image-text pairs generated by GPT-4.1, with an average prompt length of 109 words.
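To put those figures in perspective, the short Python sketch below works through the arithmetic. It uses only the numbers quoted above; the function names are illustrative, and the results are back-of-envelope estimates rather than figures published by Microsoft.

```python
# Back-of-envelope figures derived only from the numbers quoted in the article.
COMPUTE_FRACTION = 0.193   # Lens compute as a fraction of Z-Image's (as reported)
NUM_PAIRS = 800_000_000    # image-text pairs in Lens-800M (as reported)
AVG_PROMPT_WORDS = 109     # average prompt length in words (as reported)


def compute_reduction_factor(fraction: float) -> float:
    """Return how many times less compute the given fraction corresponds to."""
    return 1.0 / fraction


def total_prompt_words(num_pairs: int, avg_words: int) -> int:
    """Approximate total number of prompt words across the dataset."""
    return num_pairs * avg_words


if __name__ == "__main__":
    factor = compute_reduction_factor(COMPUTE_FRACTION)
    words = total_prompt_words(NUM_PAIRS, AVG_PROMPT_WORDS)
    print(f"Roughly {factor:.2f}x less compute than the baseline")
    print(f"About {words:,} words of prompt text in total")
```

In other words, 19.3% corresponds to roughly a fivefold reduction in compute, and the dataset's prompts add up to about 87 billion words.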
The Lens series comes in three weight variants for different deployment needs. The fastest, Lens-Turbo, generates a 1024x1024 image in 0.84 seconds. The model supports resolutions up to 1440x1440 and a range of aspect ratios. Microsoft has published the model weights on Hugging Face under the MIT license and hosts the inference code on GitHub, making both available to developers and researchers.
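As a rough illustration of what the speed and resolution figures imply, the sketch below converts the quoted latency into throughput and compares pixel counts. It assumes images are generated one after another with no batching, and the article does not state the hardware used, so treat the output as an estimate. The function names are illustrative only.

```python
# Rough throughput and resolution arithmetic from the figures quoted above.
SECONDS_PER_IMAGE = 0.84   # reported Lens-Turbo time for one 1024x1024 image
BASE_SIDE = 1024           # side length of the benchmarked resolution
MAX_SIDE = 1440            # reported maximum supported side length


def images_per_minute(seconds_per_image: float) -> float:
    """Sequential throughput, assuming no batching and constant latency."""
    return 60.0 / seconds_per_image


def pixel_ratio(side_a: int, side_b: int) -> float:
    """Ratio of pixel counts between two square resolutions."""
    return (side_a * side_a) / (side_b * side_b)


if __name__ == "__main__":
    print(f"About {images_per_minute(SECONDS_PER_IMAGE):.1f} images per minute")
    print(f"1440x1440 has {pixel_ratio(MAX_SIDE, BASE_SIDE):.2f}x the pixels of 1024x1024")
```

At the quoted speed, that is about 71 images per minute. A 1440x1440 image contains nearly twice as many pixels as a 1024x1024 one, and the article gives no latency figure for that larger size.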
Microsoft Open-Sources Lens, a 3.8B Parameter Text-to-Image Model
