The Machine Intelligence Research Institute (MIRI) has submitted a response to the U.S. Artificial Intelligence Safety Institute (AISI) and the National Institute of Standards and Technology (NIST) regarding their draft guidance on "Managing Misuse Risk for Dual-Use Foundation Models." We commend NIST and AISI for this important step toward addressing the potential risks posed by increasingly capable AI systems.

We acknowledge the guidance's focus on misuse risks from malicious actors, while noting the need for future guidance on accident and misalignment risks. We appreciate the effort to ensure that AI developers consider key challenges in mapping and measuring misuse risks. However, we emphasize that the current state of the art in understanding AI systems and in threat modeling leaves significant uncertainty about whether these recommendations can be implemented effectively, and we caution that AI evaluations may therefore create a false sense of security.

We suggest several improvements to the guidance, including being more explicit about the uncertainty in assessments of AI capabilities and using more precise language that distinguishes between the current capabilities of AI risk management and its future goals. We stress that uncertainty often calls for greater caution, especially when dealing with powerful AI models.
Our recommendations include enhancing the documentation process by specifying the intended audience for each document and making documentation public by default, with appropriate risk assessments governing what information is shared. We also suggest that risk thresholds be subject to third-party review and include specific, quantitative measures.

Finally, we propose several modifications to the objectives and practices outlined in the guidance. These include adding recommendations for concrete "red lines" in AI development, developing measures to test the adequacy of safeguards, maintaining a track record of capability predictions, and establishing protocols for the swift de-deployment of models found to be unacceptably dangerous.