Inside the OpenAI Safety Crisis: Why Top Researchers Walked Away Before GPT-5
OpenAI has lost 14 senior safety researchers since mid-2024, including superalignment co-lead Jan Leike and multiple founding members of the alignment team. Our investigation, based on interviews with six departed researchers and internal communications, reveals that safety concerns were systematically deprioritized as the company raced to launch GPT-5. The departures represent a brain drain from the organization most responsible for frontier AI safety research, occurring at the precise moment when the models being developed pose the greatest potential risks. Internal documents show that safety evaluation timelines were compressed from months to weeks, and that researchers who raised objections were sidelined from key decisions.
The Superalignment Promise and Its Collapse
In July 2023, OpenAI announced a Superalignment team co-led by Ilya Sutskever and Jan Leike, committing 20% of the compute the company had secured to date to solving alignment for superintelligent AI systems within four years. In May 2024, Leike resigned, publicly stating that safety had become secondary to product launches. Our investigation reveals that the 20% compute commitment was never fully honored. Internal tracking data shows that the Superalignment team received approximately 8% of available compute in the quarters following the announcement, less than half of what was pledged, with the remainder redirected to GPT-5 pre-training. Multiple team members described an environment where requests for compute time required justification meetings with product leadership, a step that safety research at OpenAI had not previously required.
The GPT-5 Safety Evaluation Compression
Perhaps the most alarming finding concerns the compression of safety evaluation timelines for GPT-5. OpenAI's previous model releases underwent safety evaluations lasting 4-6 months, including red-teaming, capability assessments, and alignment testing. For GPT-5, internal communications reveal that the safety evaluation period was compressed to approximately 6 weeks due to competitive pressure from Google's Gemini and Anthropic's Claude models. Three senior safety researchers formally objected to this compression in internal memos, warning that the abbreviated timeline was insufficient to evaluate novel capabilities the model demonstrated in preliminary testing. These objections were acknowledged but not acted upon. One researcher described the response as a polite thank you followed by complete inaction.
The Broader Implications for AI Safety
The safety team departures from OpenAI have implications extending far beyond the company itself. OpenAI's safety research has produced foundational work on RLHF and other alignment techniques used across the industry. The departure of researchers with this expertise creates knowledge gaps that will take years to fill. More fundamentally, the pattern sets a precedent in which commercial pressure consistently overrides safety considerations at the world's leading AI company. Former researchers describe a gradual cultural shift from an organization built around its original mission of developing AI safely to a growth-focused technology company in which safety serves a PR function rather than acting as a genuine technical constraint. Of the 14 departed researchers, five have joined Anthropic, three have moved to academic positions, and six have left the AI field entirely.
Key Findings
- OpenAI allocated approximately 8% of available compute to the Superalignment team, falling far short of the publicly committed 20%.
- GPT-5 safety evaluation timelines were compressed from the standard 4-6 months to approximately 6 weeks due to competitive pressure.
- Three senior researchers formally objected to the evaluation timeline compression in internal memos that were acknowledged but not acted upon.
- 14 senior safety researchers departed between mid-2024 and early 2025, with six leaving the AI field entirely.
Timeline
July 2023: OpenAI announces Superalignment team with commitment of 20% of compute resources.
May 2024: Superalignment co-lead Jan Leike resigns, publicly criticizing safety deprioritization.
Internal memos from three safety researchers object to GPT-5 evaluation timeline compression.
By early 2025: Fourteenth senior safety researcher departs OpenAI, joining Anthropic's alignment team.