# Evaluation

## Content Safety Evaluation
This notebook demonstrates how to evaluate content safety using Azure AI's evaluation tools. It includes steps to:
- Simulate content safety and grounded scenarios.
- Evaluate content for safety metrics such as violence, sexual content, hate/unfairness, and self-harm (a minimal evaluator sketch follows this list).
- Generate evaluation reports in JSON format.
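For orientation, here is a minimal sketch of how those safety metrics can be scored with the `azure-ai-evaluation` package. It assumes an Azure AI project and Azure credentials; the placeholder project values, the example query/response pair, and the direct use of `ContentSafetyEvaluator` are illustrative and may differ from the cells in the workshop notebook.

```python
# Minimal sketch: scoring a single query/response pair for violence, sexual
# content, hate/unfairness, and self-harm with azure-ai-evaluation.
# The placeholder project values below are assumptions; replace them with your own.
from azure.identity import DefaultAzureCredential
from azure.ai.evaluation import ContentSafetyEvaluator

azure_ai_project = {
    "subscription_id": "<your-subscription-id>",
    "resource_group_name": "<your-resource-group>",
    "project_name": "<your-ai-project-name>",
}

content_safety = ContentSafetyEvaluator(
    credential=DefaultAzureCredential(),
    azure_ai_project=azure_ai_project,
)

# Returns a dict with per-metric labels, scores, and reasons
# (e.g. "violence", "violence_score", "violence_reason", ...).
result = content_safety(
    query="What is the capital of France?",
    response="Paris is the capital of France.",
)
print(result)
```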
## Prerequisites
- Azure AI project credentials.
- Python 3.9+
- Python environment with the required libraries installed (`azure-ai-evaluation`, `pandas`, etc.).
- Access to the Azure API endpoint.
If you did not create a virtual environment during the deployment, please follow the steps below:
1. Navigate to the `workshop/docs/workshop` folder of your local repository in the terminal.
2. Run the following command in the terminal to install the requirements:

    ```bash
    pip install -r requirements.txt
    ```
- Open the `.env` file in the `workshop/docs/workshop` folder and validate that the variables were updated with the details of your solution.
- Open the Content_safety_evaluation notebook.
- Run the first cell to create a folder for the output file of the evaluations.
- Run cells 2-4 to initialize your Azure AI project, the call streaming function, and the callback function (a callback sketch follows this list).
- Cell 5 runs the Adversarial Scenario to generate questions, runs them against your AI solution, and writes the results to a local file; cell 6 formats the output of those results (see the simulator sketch after this list).
- The Adversarial Scenario runs content safety evaluation tests on your AI solution.
- Cells 7 and 8 initialize the model configuration and the Groundedness Evaluator. The groundedness measure assesses how well the claims in an AI-generated answer correspond to the source context, making sure those claims are substantiated by the context (see the groundedness sketch after this list).
- Learn more about the groundedness evaluator here.
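The following is a hedged sketch of what cells 2-4 typically set up: the Azure AI project reference and an async callback that forwards each simulated question to your deployed solution. The environment variable names and the `call_endpoint` helper are assumptions used only for illustration; the workshop notebook's own streaming call will differ.

```python
# Sketch of the project config and callback wired up in cells 2-4.
# The call_endpoint helper and the environment variable names are hypothetical.
import os

from azure.identity import DefaultAzureCredential

azure_ai_project = {
    "subscription_id": os.environ["AZURE_SUBSCRIPTION_ID"],
    "resource_group_name": os.environ["AZURE_RESOURCE_GROUP"],
    "project_name": os.environ["AZURE_AI_PROJECT_NAME"],
}
credential = DefaultAzureCredential()


async def call_endpoint(query: str) -> str:
    """Hypothetical wrapper around your solution's API endpoint."""
    # Replace with the workshop's streaming call to your deployed solution.
    return f"Echo: {query}"


async def callback(messages, stream=False, session_state=None, context=None):
    # The simulator passes the conversation so far; answer the latest message.
    query = messages["messages"][-1]["content"]
    answer = await call_endpoint(query)
    messages["messages"].append({"role": "assistant", "content": answer})
    return {
        "messages": messages["messages"],
        "stream": stream,
        "session_state": session_state,
        "context": context,
    }
```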
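Cells 5 and 6 run the adversarial simulator and persist its output. A sketch, assuming the `azure_ai_project` and `callback` from the previous sketch; the scenario choice, result count, and output file path are illustrative, not the notebook's exact values.

```python
# Sketch of the adversarial simulation step (cells 5-6).
# Scenario choice, result count, and file path are assumptions.
import asyncio
import os

from azure.identity import DefaultAzureCredential
from azure.ai.evaluation.simulator import AdversarialScenario, AdversarialSimulator


async def run_simulation():
    simulator = AdversarialSimulator(
        azure_ai_project=azure_ai_project, credential=DefaultAzureCredential()
    )
    # Generates adversarial questions and sends each one to the callback.
    return await simulator(
        scenario=AdversarialScenario.ADVERSARIAL_QA,
        target=callback,
        max_simulation_results=4,
    )


outputs = asyncio.run(run_simulation())  # in a notebook cell, use `await` instead

# Write query/response pairs as JSON Lines for the content safety evaluation.
os.makedirs("evaluation_results", exist_ok=True)
with open("evaluation_results/adversarial_output.jsonl", "w") as f:
    f.write(outputs.to_eval_qr_json_lines())
```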
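Cells 7 and 8 configure the judge model and run the groundedness evaluator. A minimal sketch, assuming an Azure OpenAI deployment referenced through environment variables; the variable names and the example response/context texts are placeholders.

```python
# Sketch of the groundedness evaluation step (cells 7-8).
# Environment variable names and example texts are placeholders.
import os

from azure.ai.evaluation import GroundednessEvaluator

model_config = {
    "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
    "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "azure_deployment": os.environ["AZURE_OPENAI_DEPLOYMENT"],
}

groundedness = GroundednessEvaluator(model_config)

# Scores how well the claims in the response are supported by the context.
score = groundedness(
    response="Purchases can be returned within 30 days with a receipt.",
    context="Our return policy allows returns within 30 days of purchase "
            "when accompanied by the original receipt.",
)
print(score)  # e.g. {"groundedness": 5.0, "groundedness_reason": "..."}
```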