The rapid advancement of AI capabilities has far outpaced safety research, creating a dangerous imbalance: by some estimates, there are roughly 300 capabilities researchers for every safety researcher. The field urgently needs more experts who understand modern AI systems to work on safety, yet skilled machine learning researchers face significant barriers when moving between these domains, from unclear pathways to concerns about career stability and compensation.
One way to address this gap could involve creating specialized pathways for experienced machine learning researchers—particularly those working with frontier models—to move into AI safety roles. This would differ from general career resources by assuming deep technical expertise and focusing on transition challenges specific to established researchers.
Key components might include:
The program would primarily benefit ML researchers at major labs who want to work on safety but lack transition options, while simultaneously helping safety organizations gain much-needed technical talent familiar with cutting-edge AI systems.
A phased approach could start with understanding researcher needs through interviews, followed by pilot fellowships and partnerships with safety organizations. Critical to the process would be establishing:
Unlike existing career guidance programs that cater to beginners, this approach would specifically target the transition challenges faced by established researchers. By focusing narrowly on this bottleneck, the movement of already-skilled researchers from capabilities to safety work, it could create disproportionate safety benefits relative to the resources invested.
The project would require validating key assumptions about researcher willingness to transition and safety organizations' ability to absorb new talent, potentially through initial surveys and interviews. Funding might come from longtermist donors concerned about AI risk mitigation.
Project Type: Service