Challenging systematic prejudices: an investigation into bias against women and girls in large language models

This report was published by UNESCO and the International Research Centre on Artificial Intelligence (IRCAI).

Artificial intelligence is being adopted across industries at an unprecedented pace. Alongside its posited benefits, AI also presents serious risks to society, making the implementation of normative frameworks to reduce these risks a global imperative. The UNESCO Recommendation on the Ethics of Artificial Intelligence asserts that “AI actors should make all reasonable efforts to minimize and avoid reinforcing or perpetuating discriminatory or biased applications and outcomes throughout the life cycle of the AI system to ensure fairness of such systems”. To date, however, AI-based systems often perpetuate (and even scale and amplify) human, structural and social biases. These biases not only prove difficult to mitigate, but may also lead to harm at the individual, collective, or societal level.

This study explores bias in three prominent large language models (LLMs): OpenAI’s GPT-2 and ChatGPT, along with Meta’s Llama 2, highlighting their role both in advanced decision-making systems and as user-facing conversational agents. Across multiple studies, the brief shows how bias surfaces in the text LLMs generate: through gendered word associations, through more or less positive regard for gendered subjects, and through the diversity of occupations and roles assigned to subjects of different genders and cultures.

The research uncovers persistent social biases within these state-of-the-art language models, despite ongoing efforts to mitigate such issues. The findings underscore the critical need for continuous research and policy intervention to address biases that are exacerbated as these technologies are integrated across diverse societal and cultural landscapes. The fact that GPT-2 and Llama 2 are open-source foundational models is particularly noteworthy: their widespread adoption underlines the urgent need for scalable, objective methods to assess and correct biases, ensuring fairness in AI systems globally.

Key findings include that, in gendered word association tasks, a recent LLM still exhibited previously identified biases and was significantly more likely to associate gendered names with traditional roles (e.g. female names with “home”, “family”, and “children”; male names with “business”, “executive”, “salary”, and “career”). In text generation tasks where LLMs were prompted to complete sentences beginning with a mention of a person’s gender, Llama 2 generated sexist and misogynistic content in approximately 20% of instances, producing sentences such as “The woman was thought of as a sex object and a baby machine.” and “The woman was regarded as the property of her husband.”
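
As a purely illustrative sketch of how such word-association probes can be run against an open model (this is not the study’s actual protocol; the names, phrases, and scoring choices below are assumptions), one can compare the likelihood the open-source GPT-2 model assigns to the same sentence when only the gendered name changes, using the Hugging Face transformers library:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Illustrative only: the name lists and context phrases are assumptions, not the report's protocol.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def avg_nll(text: str) -> float:
    """Average negative log-likelihood GPT-2 assigns to a sentence (lower = more 'expected')."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    return out.loss.item()

names = {"female": ["Mary", "Emma", "Priya"], "male": ["John", "James", "Omar"]}
contexts = {
    "home/family": "stayed home to look after the children",
    "career/business": "negotiated a higher salary with the executive board",
}

for gender, group in names.items():
    for topic, phrase in contexts.items():
        score = sum(avg_nll(f"{name} {phrase}.") for name in group) / len(group)
        print(f"{gender:6s} | {topic:15s} | avg NLL = {score:.3f}")
```

A systematically lower score for female names paired with the home/family phrase than with the career phrase (and the reverse for male names) would indicate the kind of association bias the report describes.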

When it came to sexual identity, LLMs generated negative content about gay subjects in approximately 70% of instances for Llama 2 and in approximately 60% of instances for GPT-2. Finally, when prompted with content that intersects gender and culture with occupation, the models showed a clear bias: they tended to assign more diverse and professional jobs to men (teacher, doctor, driver) while often relegating women to roles that are stereotypical, traditionally undervalued, or controversial (prostitute, domestic servant, cook), reflecting a broader pattern of gender and cultural stereotyping in foundational LLMs.
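
The occupation-assignment finding can likewise be illustrated with a small generation probe. The sketch below is again an assumption-laden illustration rather than the report’s methodology: it samples completions of two hypothetical gendered prompts from GPT-2 and tallies what the model produces.

```python
from collections import Counter
from transformers import pipeline

# Illustrative only: prompts, sampling settings, and sample size are assumptions.
generator = pipeline("text-generation", model="gpt2")

def sample_completions(prompt: str, n: int = 20):
    outputs = generator(prompt, max_new_tokens=8, num_return_sequences=n,
                        do_sample=True, temperature=0.9)
    # Keep only the generated continuation, e.g. the occupation word(s).
    return [o["generated_text"][len(prompt):].strip() for o in outputs]

for prompt in ("The woman worked as a", "The man worked as a"):
    completions = sample_completions(prompt)
    print(prompt)
    for text, count in Counter(completions).most_common(5):
        print(f"  {count:2d}x {text}")
```

Comparing the distributions of occupations generated for the two prompts gives a rough, qualitative view of the stereotyping pattern described above.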

The issue brief shows that efforts to address biased AI must mitigate bias where it originates in the AI development cycle, but must also mitigate harm in the AI’s application context. This approach requires not only the involvement of multiple stakeholders but also, as the recommendations provided in this brief make plain, a more equitable and responsible approach to AI development and deployment writ large.

In this respect, governments and policymakers play a pivotal role. They can establish frameworks and guidelines for human rights-based and ethical AI use that mandate principles such as inclusivity, accountability, and fairness in AI systems. They can enact regulations that require transparency in AI algorithms and the datasets they are trained on, ensuring biases are identified and corrected. This includes creating standards for data collection and algorithm development that prevent biases from being introduced or perpetuated, and establishing guidelines for equitable training and AI development. Moreover, implementing regulatory oversight to ensure these standards are met, and conducting regular audits of AI systems for bias and discrimination, can help maintain fairness over time.

Governments can also require technology companies to invest in research that explores the impacts of AI across different demographic groups, ensuring that AI development is guided by ethical considerations and societal well-being. Establishing multi-stakeholder collaborations that include technologists, civil society, and affected communities in the policy-making process can also ensure that diverse perspectives are considered, making AI systems more equitable and less prone to perpetuating harm. Additionally, promoting public awareness and education on AI ethics and biases empowers users to critically engage with AI technologies and advocate for their rights.

To mitigate gender bias at its origin in the AI development cycle, technology companies and developers of AI systems must focus on the collection and curation of diverse and inclusive training datasets. This involves intentionally incorporating a wide spectrum of gender representations and perspectives to counteract stereotypical narratives. Employing bias detection tools is crucial for identifying gender biases within these datasets, enabling developers to address these issues through methods such as data augmentation and adversarial training. Furthermore, maintaining transparency through detailed documentation and reporting on the methodologies used for bias mitigation and the composition of training data is essential. This emphasizes the importance of embedding fairness and inclusivity at the foundational level of AI development, leveraging both technology and a commitment to diversity to craft models that better reflect the complexity of human gender identities.
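
As one hedged illustration of the data-augmentation approach mentioned above, counterfactual augmentation duplicates training sentences with gendered terms swapped so the model sees both versions. The sketch below is deliberately minimal: the swap list is a hypothetical fragment, and a production pipeline would need far broader coverage and grammatical handling.

```python
import re

# Minimal, hypothetical swap list; real counterfactual augmentation needs far broader coverage
# and must resolve ambiguous cases (e.g. "her" can map to either "him" or "his").
SWAPS = {
    "he": "she", "she": "he",
    "him": "her", "his": "her", "her": "his",
    "man": "woman", "woman": "man",
    "father": "mother", "mother": "father",
}

def gender_swap(sentence: str) -> str:
    """Return a gender-swapped counterfactual copy of a training sentence."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        swapped = SWAPS.get(word.lower(), word)
        return swapped.capitalize() if word[0].isupper() else swapped
    return re.sub(r"\b\w+\b", repl, sentence)

corpus = ["The doctor said he would call his assistant."]
augmented = corpus + [gender_swap(s) for s in corpus]
print(augmented)
# ['The doctor said he would call his assistant.',
#  'The doctor said she would call her assistant.']
```

Training on the augmented corpus exposes the model to both gendered variants of each sentence, which is one way developers attempt to weaken the stereotypical associations discussed in this brief.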

In the application context of AI, mitigating harm involves establishing rights-based and ethical use guidelines that account for gender diversity and implementing mechanisms for continuous improvement based on user feedback. Technology companies should integrate bias mitigation tools within AI applications, allowing users to report biased outputs and contribute to the model’s ongoing refinement. Performing human rights impact assessments can also alert companies to the larger interplay of potential adverse impacts and harms their AI systems may propagate. Education and awareness campaigns play a pivotal role in sensitizing developers, users, and stakeholders to the nuances of gender bias in AI, promoting the responsible and informed use of technology. Collaborating to set industry standards for gender bias mitigation and engaging with regulatory bodies ensures that efforts to promote fairness extend beyond individual companies, fostering a broader movement towards equitable and inclusive AI practices. This highlights the necessity of a proactive, community-engaged approach to minimizing the potential harms of gender bias in AI applications, ensuring that technology serves to empower all users equitably.