# Evaluation

## Content Safety Evaluation
This notebook demonstrates how to evaluate content safety using Azure AI's evaluation tools. It includes steps to:
- Simulate content safety and grounded scenarios.
- Evaluate content for safety metrics such as violence, sexual content, hate/unfairness, and self-harm (a minimal evaluator sketch follows this list).
- Generate evaluation reports in JSON format.
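For orientation, here is a minimal sketch of how those safety metrics can be scored with the `azure-ai-evaluation` package. It assumes an Azure AI project and Azure credentials; the placeholder project values, the example query/response pair, and the direct use of `ContentSafetyEvaluator` are illustrative and may differ from the cells in the workshop notebook.

```python
# Minimal sketch: scoring a single query/response pair for violence, sexual
# content, hate/unfairness, and self-harm with azure-ai-evaluation.
# The placeholder project values below are assumptions; replace them with your own.
from azure.identity import DefaultAzureCredential
from azure.ai.evaluation import ContentSafetyEvaluator

azure_ai_project = {
    "subscription_id": "<your-subscription-id>",
    "resource_group_name": "<your-resource-group>",
    "project_name": "<your-ai-project-name>",
}

content_safety = ContentSafetyEvaluator(
    credential=DefaultAzureCredential(),
    azure_ai_project=azure_ai_project,
)

# Returns a dict with per-metric labels, scores, and reasons
# (e.g. "violence", "violence_score", "violence_reason", ...).
result = content_safety(
    query="What is the capital of France?",
    response="Paris is the capital of France.",
)
print(result)
```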
## Prerequisites
- Azure AI project credentials.
- Python 3.9+
- Python environment with the required libraries installed (`azure-ai-evaluation`, `pandas`, etc.).
- Access to the Azure API endpoint.
If you did not create a virtual environment during the deployment, please follow the steps below:
1. Navigate to the `workshop/docs/workshop` folder of your local repository in the terminal.
2. Run the following command in the terminal to install the requirements:

    ```bash
    pip install -r requirements.txt
    ```
- Open the `.env` file in the `workshop/docs/workshop` folder and validate that the variables were updated with the details of your solution.
- Open the Content_safety_evaluation notebook.
- Run the first cell to create a folder for the output file of the evaluations.
- Run cells 2-4 to initialize your Azure AI project, the call streaming function, and the callback function (a callback sketch follows this list).
- Cell 5 runs the Adversarial Scenario to generate questions, runs them against your AI solution, and writes the results to a local file; cell 6 formats the output of those results (see the simulator sketch after this list).
- The Adversarial Scenario runs content safety evaluation tests on your AI solution.
- Cells 7 and 8 initialize the model configuration and the Groundedness Evaluator. The groundedness measure assesses how well the claims in an AI-generated answer correspond to the source context, making sure those claims are substantiated by the context (see the groundedness sketch after this list).
- Learn more about the groundedness evaluator here.
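The following is a hedged sketch of what cells 2-4 typically set up: the Azure AI project reference and an async callback that forwards each simulated question to your deployed solution. The environment variable names and the `call_endpoint` helper are assumptions used only for illustration; the workshop notebook's own streaming call will differ.

```python
# Sketch of the project config and callback wired up in cells 2-4.
# The call_endpoint helper and the environment variable names are hypothetical.
import os

from azure.identity import DefaultAzureCredential

azure_ai_project = {
    "subscription_id": os.environ["AZURE_SUBSCRIPTION_ID"],
    "resource_group_name": os.environ["AZURE_RESOURCE_GROUP"],
    "project_name": os.environ["AZURE_AI_PROJECT_NAME"],
}
credential = DefaultAzureCredential()


async def call_endpoint(query: str) -> str:
    """Hypothetical wrapper around your solution's API endpoint."""
    # Replace with the workshop's streaming call to your deployed solution.
    return f"Echo: {query}"


async def callback(messages, stream=False, session_state=None, context=None):
    # The simulator passes the conversation so far; answer the latest message.
    query = messages["messages"][-1]["content"]
    answer = await call_endpoint(query)
    messages["messages"].append({"role": "assistant", "content": answer})
    return {
        "messages": messages["messages"],
        "stream": stream,
        "session_state": session_state,
        "context": context,
    }
```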
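Cells 5 and 6 run the adversarial simulator and persist its output. A sketch, assuming the `azure_ai_project` and `callback` from the previous sketch; the scenario choice, result count, and output file path are illustrative, not the notebook's exact values.

```python
# Sketch of the adversarial simulation step (cells 5-6).
# Scenario choice, result count, and file path are assumptions.
import asyncio
import os

from azure.identity import DefaultAzureCredential
from azure.ai.evaluation.simulator import AdversarialScenario, AdversarialSimulator


async def run_simulation():
    simulator = AdversarialSimulator(
        azure_ai_project=azure_ai_project, credential=DefaultAzureCredential()
    )
    # Generates adversarial questions and sends each one to the callback.
    return await simulator(
        scenario=AdversarialScenario.ADVERSARIAL_QA,
        target=callback,
        max_simulation_results=4,
    )


outputs = asyncio.run(run_simulation())  # in a notebook cell, use `await` instead

# Write query/response pairs as JSON Lines for the content safety evaluation.
os.makedirs("evaluation_results", exist_ok=True)
with open("evaluation_results/adversarial_output.jsonl", "w") as f:
    f.write(outputs.to_eval_qr_json_lines())
```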
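Cells 7 and 8 configure the judge model and run the groundedness evaluator. A minimal sketch, assuming an Azure OpenAI deployment referenced through environment variables; the variable names and the example response/context texts are placeholders.

```python
# Sketch of the groundedness evaluation step (cells 7-8).
# Environment variable names and example texts are placeholders.
import os

from azure.ai.evaluation import GroundednessEvaluator

model_config = {
    "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
    "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "azure_deployment": os.environ["AZURE_OPENAI_DEPLOYMENT"],
}

groundedness = GroundednessEvaluator(model_config)

# Scores how well the claims in the response are supported by the context.
score = groundedness(
    response="Purchases can be returned within 30 days with a receipt.",
    context="Our return policy allows returns within 30 days of purchase "
            "when accompanied by the original receipt.",
)
print(score)  # e.g. {"groundedness": 5.0, "groundedness_reason": "..."}
```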