AI Safety Index – Summer 2025
Source: Future of Life Institute
AI systems are growing increasingly powerful as tech companies drive toward artificial general intelligence (AGI) and beyond. Just as functioning brakes give drivers the confidence to accelerate, effective AI safety measures give society the confidence to innovate and adopt AI. Competitive pressures, however, can incentivize a race to the bottom that prioritizes profits over safety. To improve these incentives, the Future of Life Institute periodically convenes an independent panel of leading AI experts to conduct a comprehensive safety review of prominent tech companies.
The Future of Life Institute’s AI Safety Index provides an independent assessment of seven leading AI companies’ efforts to manage both immediate harms and catastrophic risks from advanced AI systems. The Index aims to strengthen incentives for responsible AI development and to close the gap between safety commitments and real-world actions. The Summer 2025 edition evaluates these companies against an improved set of 33 indicators of responsible AI development and deployment practices, spanning six critical domains.
Data Collection: Evidence was gathered between March 24 and June 24, 2025 through systematic desk research and a targeted company survey. We prioritized official materials released by the companies about their AI systems and risk management practices, while also incorporating external safety benchmarks, credible media reports, and independent research. To address transparency gaps in the industry, we distributed a 34-question survey on May 28, with responses due June 17, focusing on areas where public disclosure remains limited—particularly whistleblowing policies, third-party model evaluations, and internal AI deployment practices.
Expert Evaluation: An independent panel of distinguished AI scientists and governance experts evaluated the collected evidence between June 24 and July 9, 2025. Panel members were selected for their domain expertise and absence of conflicts of interest.
Each expert assigned letter grades (A+ to F) to each company in each domain, based on a set of fixed performance standards. Experts also provided a brief written justification for each grade and gave each company specific recommendations for improvement. Reviewers had full flexibility to weight the various indicators according to their judgment. Not every reviewer graded every domain; instead, experts were invited to score the domains relevant to their area of expertise. Final scores were calculated by averaging all expert grades within each domain, with overall company grades representing the unweighted average across all six domains. Individual reviewer grades remain confidential to ensure candid assessment.
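The two-stage averaging described above can be sketched in a few lines of Python. This is an illustrative sketch only: the report does not specify its letter-grade-to-number conversion, so the mapping below assumes a conventional US-style 4.3-point scale, and the grades in the example are made up.

```python
# Sketch of the Index's aggregation: average expert grades within each
# domain, then take the unweighted mean across domains.
# ASSUMPTION: the letter-to-number mapping is a conventional 4.3-point
# scale; the actual conversion used by the report is not specified here.
GRADE_POINTS = {
    "A+": 4.3, "A": 4.0, "A-": 3.7,
    "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7,
    "D+": 1.3, "D": 1.0, "D-": 0.7,
    "F": 0.0,
}

def domain_score(expert_grades):
    """Average the numeric values of all expert grades for one domain."""
    points = [GRADE_POINTS[g] for g in expert_grades]
    return sum(points) / len(points)

def company_score(grades_by_domain):
    """Unweighted mean of the per-domain averages."""
    means = [domain_score(gs) for gs in grades_by_domain.values()]
    return sum(means) / len(means)

# Hypothetical grades for one company across two (of six) domains:
example = {
    "Domain A": ["B", "B-", "C+"],   # mean 2.67
    "Domain B": ["A-", "B+"],        # mean 3.50
}
print(round(company_score(example), 2))  # → 3.08
```

Note that because each domain is averaged first, a domain scored by two experts carries the same weight in the overall grade as one scored by five.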
Please click here to read the full report.
Image credit: Image by rawpixel.com on Freepik