OpenAI has introduced LifeSciBench, a benchmark for evaluating how well AI systems perform in real-world scientific research scenarios. It comprises 750 tasks spanning seven research workflow categories and seven biology domains, written by 173 researchers with doctoral degrees in biotech or pharmaceuticals. LifeSciBench targets complex scientific capabilities such as evidence integration, experimental design, and scientific reasoning: over 79% of tasks require multi-step reasoning, with an average of four reasoning steps per question. The benchmark also includes 1,062 real research-related data attachments, which add to its practical relevance.