Recent research
We analyze what AI evaluations can and cannot do for preventing catastrophic risks. While they can help determine lower bounds on AI capabilities, evaluations face fundamental limitations and cannot be solely relied upon to ensure safety.
Mechanisms to Verify International Agreements About AI Development
In this research report, we provide an in-depth overview of the mechanisms that could be used to verify adherence to international agreements about AI development.
Response to BIS AI Reporting Requirements RFC
We respond to the BIS Request for Comment on the Proposed Rule for Establishment of Reporting Requirements for the Development of Advanced Artificial Intelligence Models and Computing Clusters.
Technical Governance Team Mission
AI systems are rapidly becoming more capable, and fundamental safety problems remain unsolved. Our goal is to increase the probability that humanity can safely navigate the transition to a world with smarter-than-human AI, focusing on technical research in service of the following governance goals:
Coordination
Strengthen international coordination to allow for effective international agreements and reduce dangerous race dynamics.
Security
Establish robust security standards and practices for frontier AI development, to reduce harms from model misalignment, misuse, and proliferation.
Development Safeguards
Ensure that dangerous AI development could be shut down if there were broad international agreement on the need for this.
Regulation
Improve and develop domestic and international regulation of frontier AI to prepare for coming risks and to identify when current safety plans are likely to fail.
Frequently asked questions
Common questions on AI governance, collaboration, and future risks
We focus on technical research to improve concrete proposals in AI governance. This includes research into verification mechanisms for national regulation and international AI agreements, concrete proposals for regulation, technical criteria and levels for stopping dangerous AI development, threat modeling (especially on autonomy), forecasting, and technical problems with safety plans and AI evaluations. We focus less on other AI governance areas such as institution design, law, and economics.
Our team is focused on producing technical reports and engaging with policymakers to inform governance decisions and concrete policy proposals. We do not focus on informing the general public about AI risk, advocacy, or research on aligning AI systems.
- We are focused on preventing catastrophic and extinction-level risks from advanced AI. This includes risks where there is a significant chance that all humans die or are permanently disempowered. A smarter-than-human AI system misaligned with human interests may have many paths to causing this outcome.
- We are also concerned with catastrophic risks from human misuse of powerful AI, although this is not our primary focus.
- We have significant uncertainty about when smarter-than-human AI will be developed, and there are differing views within the team. However, we believe there is a significant chance that smarter-than-human AI will be developed before 2030. Focusing on shorter development trajectories is also higher leverage, as those worlds are likely to be more dangerous and chaotic, with less time to do the necessary safety work.
- For example, we often consider when AI systems will be able to effectively replace human remote workers, and especially whether they could substitute for AI researchers, as this could lead to a research feedback loop.