OpenAI shipped a major update to GPT-Rosalind, its model series purpose-built for life-sciences research, on June 3. The release fuses GPT-5.5's agentic coding and tool use with sharper intelligence in the core drug-discovery domains of medicinal chemistry and genomics, and extends performance across broader analysis, design, and experimental workflows. The framing is deliberately enterprise-oriented: not a chat model that happens to know biology, but a research partner that plans analyses, runs tools, and preserves provenance across a long workflow.
The headline numbers are as notable for their modest absolute levels as for the deltas: scores this low are an honest signal that these benchmarks are hard and far from saturated. On MedChemBench, a new suite covering structure-activity relationships, potency, toxicity, ADME (absorption, distribution, metabolism, excretion) prediction, multi-parameter lead optimization, and retrosynthesis, GPT-Rosalind scores 27.5 percent versus 25.1 percent for GPT-5.5 while using 7.2 percent fewer tokens. On GeneBench, an agentic, long-horizon genomics and quantitative-biology evaluation, it reaches 21.6 percent versus 20.4 percent while using 31 percent fewer tokens. On LabWorkBench, which links perturbations to outcomes in real and deliberately uncontaminated wet-lab protocols, it scores 63.2 percent versus 55.8 percent with 5.3 percent fewer tokens. OpenAI also introduced LifeSciBench, an externally expert-judged benchmark that takes an end-to-end view across six workflow areas rather than scoring a single capability in isolation.
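To put the deltas in context, here is a minimal Python sketch that turns the three reported score pairs into absolute gains (percentage points) and relative gains (percent of the GPT-5.5 score). The class and function names are hypothetical, and the "score per token" ratio is an illustrative metric of our own, not one OpenAI reports: it assumes score and token count can be meaningfully combined by simple division.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """One benchmark row, using only the figures quoted in the text."""
    name: str
    rosalind_pct: float         # GPT-Rosalind score, in percent
    baseline_pct: float         # GPT-5.5 score, in percent
    token_reduction_pct: float  # percent fewer tokens used by GPT-Rosalind

    def absolute_gain(self) -> float:
        """Gain in percentage points."""
        return self.rosalind_pct - self.baseline_pct

    def relative_gain_pct(self) -> float:
        """Gain as a percent of the baseline score."""
        return 100.0 * self.absolute_gain() / self.baseline_pct

    def score_per_token_ratio(self) -> float:
        """Illustrative only (our assumption, not a reported metric):
        ratio of scores divided by ratio of tokens used."""
        score_ratio = self.rosalind_pct / self.baseline_pct
        token_ratio = 1.0 - self.token_reduction_pct / 100.0
        return score_ratio / token_ratio


RESULTS = [
    BenchmarkResult("MedChemBench", 27.5, 25.1, 7.2),
    BenchmarkResult("GeneBench", 21.6, 20.4, 31.0),
    BenchmarkResult("LabWorkBench", 63.2, 55.8, 5.3),
]


def summarize(results: list[BenchmarkResult]) -> list[str]:
    """Format one summary line per benchmark."""
    return [
        f"{r.name}: +{r.absolute_gain():.1f} pts, "
        f"+{r.relative_gain_pct():.1f}% relative, "
        f"{r.score_per_token_ratio():.2f}x score per token"
        for r in results
    ]


if __name__ == "__main__":
    for line in summarize(RESULTS):
        print(line)
```

Under these assumptions, LabWorkBench shows the largest accuracy gain, while GeneBench shows the smallest accuracy gain but the largest token saving. The published numbers say nothing about variance or confidence intervals, so small differences should be read cautiously.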
On the product side, two plugins, Life Sciences Research and Life Sciences NGS Analysis, are now available to all users through Codex. They bring sourced evidence retrieval and bioinformatics execution into the same workspace, alongside interactive viewers for sequence, alignment, and structure file types. A walkthrough follows a scientist analyzing a liquid-tumor ctDNA biopsy, narrowing to a KRAS G12C alteration, then pulling target and resistance context and inspecting the inhibitor-bound pocket inline. The model is available in research preview to eligible organizations through a trusted-access deployment structure that requires legitimate research, governance, and enterprise-grade security. OpenAI named Novo Nordisk as an early partner scaling its medical research on the model.
Why it matters: a frontier lab is now putting a domain-specialized, tool-using model directly into regulated drug-discovery pipelines, with the token-efficiency gains that make long-horizon agentic runs economical. The biodefense framing, tied to OpenAI's Rosalind Biodefense effort, also signals that capability gating and trusted access are being treated as first-class deployment concerns rather than afterthoughts.