[Expression of Interest] Research Manager, Interpretability
San Francisco, CA · $350,000-500,000/yr
About Anthropic
Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Note: We don't have open Research Manager positions on the Interpretability team at this time. However, we're actively growing our team of Research Engineers and Research Scientists. If you're excited about interpretability research and open to an individual contributor role, we encourage you to apply.
About the Interpretability Team
When you see what modern language models are capable of, do you wonder, "How do these things work? How can we trust them?" The Interpretability team's mission is to reverse engineer how trained models work, and interpretability research is one of Anthropic's core research bets on AI safety. We believe that a mechanistic understanding is the most robust way to make advanced systems safe. We're focused on mechanistic interpretability, which aims to discover how neural network parameters map to meaningful algorithms. We're trying to do "biology" or "neuroscience" on neural networks, or to treat them as binary computer programs we're "reverse engineering."
About the Role
As a manager on the Interpretability team, you'll support a team of expert researchers and engineers who are trying to understand, at a deep mechanistic level, how modern large language models work internally. Your work as a manager will be critical in making sure that our fast-growing team can meet its ambitious safety research goals over the coming years. In this role, you will partner closely with an individual contributor research lead to drive the team's success, translating cutting-edge research ideas into tangible goals and overseeing their execution. You will manage team execution, careers, and performance; facilitate relationships within and across teams; and drive the hiring pipeline.
Responsibilities
• Partner with a research lead on direction, project planning and execution, hiring, and people development
• Set and maintain a high bar for execution speed and quality, including identifying improvements to processes that help the team operate effectively
• Coach and support team members to have more impact and develop in their careers
• Drive the team's recruiting efforts, including hiring planning, process improvements, and sourcing and closing
• Help identify and support opportunities for collaboration with other teams across Anthropic
• Communicate team updates and results to other teams and leadership
• Maintain a deep understanding of the team's technical work and its implications for AI safety
You May Be a Good Fit If You
• Are an experienced manager (minimum 2-5 years) with a track record of effectively leading highly technical research and/or engineering teams
• Have a background in machine learning, AI, or a related technical field
• Actively enjoy people management and are experienced with coaching and mentorship, performance evaluation, career development, and hiring for technical roles
• Have strong project management skills, including prioritization and cross-functional coordination and collaboration
• Have managed technical teams through periods of ambiguity and change
• Are a quick learner, capable of understanding and contributing to discussions on complex technical topics, and are motivated to learn about our research
• Are a strong communicator both in speaking and in writing
• Believe that advanced AI systems could have a transformative effect on the world, and are passionate about helping make sure that transformation goes well
Strong Candidates May Also Have
• Experience scaling engineering infrastructure
• Experience working on open-ended, exploratory research agendas aimed at foundational insights
• Some familiarity with our work and mechanistic interpretability
Role Specific Location Policy
This role