Microsoft Research and Zhejiang University have introduced World-R1, a method that improves the 3D geometric consistency of text-to-video models through reinforcement learning. The approach requires neither changes to the model architecture nor dedicated 3D training data.

World-R1 works by reconstructing 3D Gaussians from each generated video with the Depth Anything 3 model, rendering the reconstructed scene from novel viewpoints, and comparing those renderings against the original frames. The Flow-GRPO reinforcement learning algorithm then fine-tunes the video model using a reward built from the reconstruction error, trajectory deviation, and semantic plausibility.

The method builds on the open-source Wan 2.1 model and is released in two versions, World-R1-Small and World-R1-Large, both of which show clear gains on 3D consistency metrics: the Large model improves PSNR by 7.91 dB, while the Small version sees a 10.23 dB increase. In blind comparisons, World-R1 achieved a 92% win rate for geometric consistency. The project is open-sourced on GitHub under the CC BY-NC-SA 4.0 license.
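The reward described above combines three signals: reconstruction error between the re-rendered and original frames, trajectory deviation, and semantic plausibility. As a rough illustration of how such a reward could be composed (this is a sketch, not the authors' implementation; the PSNR-based term, the weights, and the normalization are all assumptions), a per-sample reward might look like this:

```python
import numpy as np

def psnr(original: np.ndarray, rendered: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio between two images (higher is better)."""
    mse = np.mean((original - rendered) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

def consistency_reward(original: np.ndarray,
                       rerendered: np.ndarray,
                       traj_deviation: float,
                       semantic_score: float,
                       w_recon: float = 1.0,
                       w_traj: float = 0.5,
                       w_sem: float = 0.5) -> float:
    """Hypothetical scalar reward for an RL step on one video sample.

    - `rerendered` stands in for a frame rendered from the reconstructed
      3D Gaussians at a novel viewpoint (assumed pipeline).
    - `traj_deviation` penalizes drift from the intended camera path.
    - `semantic_score` rewards plausibility (e.g. from a scoring model).
    All weights and the /50 normalization are illustrative guesses.
    """
    recon_term = psnr(original, rerendered) / 50.0  # map PSNR roughly into [0, 1]
    return w_recon * recon_term - w_traj * traj_deviation + w_sem * semantic_score
```

A policy-gradient method like GRPO would then compare these rewards across a group of sampled videos for the same prompt and push the model toward the higher-scoring samples; the sketch only shows how the three signals could be folded into one scalar.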