The Singapore Consensus on Global AI Safety Research Priorities

Rapidly improving AI capabilities and autonomy are driving a vigorous debate on how to keep AI safe, secure and beneficial. While regulatory approaches remain under active deliberation, the global research community demonstrates substantial consensus around specific high-value technical AI safety research domains. This consensus creates tremendous opportunities to secure the benefits of AI, improve our understanding of risk management, and foster international research collaboration.

Introduction

Building a Trusted Ecosystem

Rapidly improving AI capabilities and autonomy hold significant promise of transformation, but are also driving vigorous debate on how to ensure that AI is safe, i.e., trustworthy, reliable, and secure. Building a trusted ecosystem is therefore essential – it helps people embrace AI with confidence and gives maximal space for innovation while avoiding backlash. This requires policymakers, industry, researchers and the broader public to collectively work toward securing positive outcomes from AI’s development. AI safety research is a key dimension. Given that the state of science today for building trustworthy AI does not fully cover all risks, accelerated investment in research is required to keep pace with commercially driven growth in system capabilities.

Goals

The 2025 Singapore Conference on AI (SCAI): International Scientific Exchange on AI Safety aims to support research in this important space by bringing together AI scientists across geographies to identify and synthesise research priorities in AI safety. The result, The Singapore Consensus on Global AI Safety Research Priorities, builds on the International AI Safety Report (IAISR), chaired by Yoshua Bengio and backed by 33 governments. By adopting a defence-in-depth model, this document organises AI safety research domains into three types: challenges with creating trustworthy AI systems (Development), challenges with evaluating their risks (Assessment), and challenges with monitoring and intervening after deployment (Control).

Through the Singapore Consensus, we hope to facilitate meaningful global conversations between AI scientists and AI policymakers for maximally beneficial outcomes. Our goal is to enable more impactful R&D efforts to rapidly develop safety and evaluation mechanisms and foster a trusted ecosystem where AI is harnessed for the public good.

Areas of mutual interest: While companies and nations often compete on AI research and development, there are also incentives to find alignment and common interests. This synthesis covers areas where different parties may be competitive, but also highlights examples from the broader landscape of areas of mutual interest – research products and information that developers would be interested to share widely (Bucknall-B). Certain safety advances offer minimal competitive edge while serving a common interest – similar to how competing aircraft manufacturers (e.g., Boeing and Airbus) collaborate on aviation safety information and standards. In AI, particular areas for potentially mutually-beneficial cooperation span sections 1-3 of this report and include certain verification mechanisms, risk-management standards, and risk evaluations (Bucknall-B). The motivation is clear: no organisation or country benefits when AI incidents occur or malicious actors are enabled, as the resulting harm would damage everyone collectively.

Process

This document represents a comprehensive synthesis of research proposals drawn from the International AI Safety Report and complementary recent research prioritisation frameworks, including UK AISI, Anthropic-F, Anwar, Bengio-A, GDM, Hendrycks-A, Ji, Li-A, OpenAI-B, NIST, Reuel, Slattery, and Weidinger-A. Initially designed as a consultation draft by the Expert Planning Committee (Dawn Song, Lan Xue, Luke Ong, Max Tegmark, Stuart Russell, Tegan Maharaj, Ya-Qin Zhang, and Yoshua Bengio), it was distributed to all conference participants to solicit comprehensive feedback. Following several rounds of updates based on further participant feedback in writing and in person, this document has been designed to synthesise points of broad consensus among diverse researchers. The full list of conference participants who contributed to this Singapore Consensus process is presented at the beginning of this document, and includes researchers from leading academic institutions and AI developers, as well as representatives from governments and civil society.

Key event: 26th April 2025 – SCAI: International Scientific Exchange on AI Safety.
Contributors: More than 100 participants in attendance for discussion and feedback.
Representation: Participants from 11 countries were present.

We have attempted to be inclusive of both terminology and research topic suggestions from researchers in academia, industry, and civil society. This synthesis presented unique challenges because different authors have used a variety of non-equivalent definitions and classification schemes. This report therefore takes a humble approach: the definitions of key terms in Table 1 below simply specify how we use various terms in this report, to avoid confusion, and we make no claims whatsoever to these being better than other alternative definitions.

Scope

We limit our discussion to technical AI safety research, focused on making AI more trustworthy rather than merely more powerful, and excluding AI policy research. We focus primarily on general-purpose AI: Following the International AI Safety Report, the term ‘AI systems’ in this document should be understood to refer to general-purpose AI (GPAI) systems – systems that can perform or can be adapted to perform a wide range of tasks (IAISR). This includes language models that produce text (e.g. chat systems) as well as ‘multimodal’ models which can work with multiple types of data, often including text, images, video, audio, and robotic actions. Importantly, it includes general-purpose agents – systems that autonomously act and plan to accomplish complex tasks, for example by controlling computers. Developing more powerful agents is a core focus of AI developers, and their growing deployment presents major new risks and opportunities.

We emphasise that technical solutions relating to general-purpose AI systems are necessary but not sufficient for the overall safety of AI. Our collective ability to responsibly manage AI risks and opportunities will ultimately depend on our choices to build a healthy AI ecosystem, study risks, implement mitigations, and integrate solutions into effective risk management frameworks.

Structure

Inspired by the 2025 International AI Safety Report (IAISR), this document adopts a defence-in-depth model and groups technical AI safety research topics into three broad areas: risk assessment that informs subsequent development and deployment decisions, technical methods in the system development phase, and tools for control after a system has been deployed. The three areas overlap in interesting ways, as illustrated in Figure 1:

  1. Risk Assessment: The primary goal of risk assessment is to understand the severity and likelihood of a potential harm. Risk assessments are used to prioritise risks and determine if they cross thresholds that demand specific action. Consequential development and deployment decisions are predicated on these assessments. The research areas in this category involve developing methods to measure the impact of AI systems for both current and future AI, enhancing metrology to ensure that these measurements are precise and repeatable, and building enablers for third-party audits to support independent validation of these risk assessments.
  2. Development: AI systems that are trustworthy, reliable and secure by design give people the confidence to embrace and adopt AI innovation. Following the classic safety engineering framework, the research areas in this category involve specifying the desired behaviour, designing an AI system that meets the specification, and verifying that the system meets its specification.
  3. Control: In engineering, “control” usually refers to the process of managing a system’s behaviour to achieve a desired outcome, even when faced with disturbances or uncertainties, and often in a feedback loop. The research areas in this category involve developing monitoring and intervention mechanisms for AI systems, extending monitoring mechanisms to the broader AI ecosystem to which the AI system belongs, and societal resilience research to strengthen societal infrastructure (e.g. economic, security) to adapt to AI-related societal changes.
