This file contains the DEMO prompt library of the AILuminate 1.0 prompt dataset, created by the MLCommons AI Risk & Reliability working group. It contains 1,200 human-generated prompts that cover 12 different hazard categories, in a mix of adversarial and non-adversarial prompts. This dataset is a 10% subset of the AILuminate Public Practice prompt dataset.
This dataset was created to elicit hazardous responses. It contains language that may be considered offensive, and content that may be considered unsafe, discomforting, or disturbing. Consider carefully whether you need to view the prompts and responses, limit exposure to what's necessary, take regular breaks, and stop if you feel uncomfortable. For more information on the risks, see this literature review on vicarious trauma.
The MLCommons AILuminate v1.0 benchmark provides safety testing for general purpose chat systems across 12 hazard categories and includes:
- A safety assessment standard including a hazard taxonomy and response evaluation criteria.
- A dataset of 24,000 human-generated test prompts designed to replicate distinct hazardous scenarios:
  - 12,000 Public Practice prompts
  - 12,000 Private prompts for benchmarking
  - 1,200 Demo prompts (this dataset), a 10% sample of the Public Practice prompt dataset
  - A mix of adversarial and non-adversarial prompts
  - Regular updates to the dataset
- An evaluation system using a tuned ensemble of safety evaluation models.
- Modelbench, a benchmarking framework for testing interactive language models with prompt datasets such as this one.
- The dataset is currently available only in American English; French, Simplified Chinese, and Hindi will be added in early 2025, and support for further languages is planned.
The 1,200 human-generated prompts in this Demo dataset cover the 12 hazard categories listed below.
See the AILuminate 1.0 Assessment Standards for a detailed description of the evaluation methodology, including the guidelines used to define each hazard category.
| Physical Hazards | hazard_uid |
|---|---|
| Violent Crimes | vcr |
| Sex-Related Crimes | src |
| Child Sexual Exploitation | cse |
| Suicide & Self-Harm | ssh |
| Indiscriminate Weapons (CBRNE) | iwp |

| Non-Physical Hazards | hazard_uid |
|---|---|
| Intellectual Property Violations | ipv |
| Defamation | dfm |
| Non-Violent Crimes | ncr |
| Hate | hte |
| Privacy Violations | prv |

| Contextual Hazards | hazard_uid |
|---|---|
| Specialized Advice; elections | spc_ele |
| Specialized Advice; financial | spc_fin |
| Specialized Advice; health | spc_hlt |
| Sexual Content; pornographic | sxc_prn |
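
For programmatic filtering, the taxonomy above can be transcribed into a simple lookup table. The sketch below is illustrative only, not an official artifact of the dataset:

```python
# Mapping of hazard_uid values to hazard category names,
# transcribed from the taxonomy tables above.
HAZARD_CATEGORIES = {
    # Physical hazards
    "vcr": "Violent Crimes",
    "src": "Sex-Related Crimes",
    "cse": "Child Sexual Exploitation",
    "ssh": "Suicide & Self-Harm",
    "iwp": "Indiscriminate Weapons (CBRNE)",
    # Non-physical hazards
    "ipv": "Intellectual Property Violations",
    "dfm": "Defamation",
    "ncr": "Non-Violent Crimes",
    "hte": "Hate",
    "prv": "Privacy Violations",
    # Contextual hazards
    "spc_ele": "Specialized Advice; elections",
    "spc_fin": "Specialized Advice; financial",
    "spc_hlt": "Specialized Advice; health",
    "sxc_prn": "Sexual Content; pornographic",
}
```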
| Field name | Type | Description |
|---|---|---|
| release_prompt_id | String | An identifier for each prompt of the form: [authoring organization]_[prompt class]_[major revision number]_[minor revision number]_[unique ID] |
| prompt_text | String | The text of the prompt for submission to the system under test (SUT) |
| hazard | String | The content hazard category (abbreviated as in the tables above) |
| persona | String | The persona for the prompt, describing the sophistication of the represented audience: [normal, unskilled, skilled] |
| locale | String | An abbreviation of the language and locale: [en_US (English, United States), fr_FR (French, France), zh_CN (Simplified Chinese, China), hi_IN (Hindi, India)] |
| prompt_hash | String | MLCommons unique identifier for each prompt |
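
As a quick start, the sketch below loads the demo prompts with pandas and selects a subset by hazard, persona, and locale. The file name is an assumption and may differ in the actual release; the column names follow the field table above:

```python
import pandas as pd

# Assumed file name for the demo release; adjust to match your download.
DEMO_CSV = "ailuminate_1.0_demo_prompt_set.csv"

df = pd.read_csv(DEMO_CSV)

# Expected columns, per the field table above:
# release_prompt_id, prompt_text, hazard, persona, locale, prompt_hash

# Example: all skilled-persona privacy-violation prompts in US English.
subset = df[
    (df["hazard"] == "prv")
    & (df["persona"] == "skilled")
    & (df["locale"] == "en_US")
]
print(f"Selected {len(subset)} prompts")

# release_prompt_id encodes provenance. A naive split on "_" recovers the
# documented parts, assuming no underscores within individual components:
# [authoring organization]_[prompt class]_[major rev]_[minor rev]_[unique ID]
id_parts = subset["release_prompt_id"].str.split("_", expand=True)
```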
- If you would like access to the full 12,000-prompt Public Practice dataset, or to test a system against the 12,000-prompt Private dataset using MLCommons' state-of-the-art ensemble evaluator model, please complete this form.
- More information is available on the AILuminate website.
- Participate in MLCommons’ AI Risk & Reliability working group.
MLCommons licenses this data under a Creative Commons Attribution 4.0 International License. Users may modify and repost it, and we encourage them to analyze and publish research based on the data. The dataset is provided "AS IS" without any warranty, express or implied. MLCommons disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
- Vidgen, Bertie, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, et al. “Introducing v0.5 of the AI Safety Benchmark from MLCommons.” arXiv, May 13, 2024. https://doi.org/10.48550/arXiv.2404.12241.
- 1.0 paper (Release: January 2025)