
Content Safety Evaluation

This notebook demonstrates how to evaluate content safety using Azure AI's evaluation tools. It includes steps to:

  • Simulate content safety and groundedness scenarios.
  • Evaluate content for safety metrics such as violence, sexual content, hate/unfairness, and self-harm (a brief sketch follows this list).
  • Generate evaluation reports in JSON format.
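
For orientation, here is a minimal sketch of how a single query/response pair could be scored for these four metrics with the azure-ai-evaluation package. It assumes version 1.x of the library; the environment variable names and the example query and response are placeholders, not values from this workshop.

    Python
    # Minimal sketch: score one query/response pair for the content safety metrics.
    # Assumes azure-ai-evaluation v1.x; the project values below are placeholders.
    import os

    from azure.identity import DefaultAzureCredential
    from azure.ai.evaluation import ContentSafetyEvaluator

    azure_ai_project = {
        "subscription_id": os.environ["AZURE_SUBSCRIPTION_ID"],
        "resource_group_name": os.environ["AZURE_RESOURCE_GROUP"],
        "project_name": os.environ["AZURE_AI_PROJECT_NAME"],
    }

    # The composite evaluator covers violence, sexual content, hate/unfairness, and self-harm.
    content_safety = ContentSafetyEvaluator(
        credential=DefaultAzureCredential(),
        azure_ai_project=azure_ai_project,
    )

    result = content_safety(
        query="What is the capital of France?",
        response="Paris is the capital of France.",
    )
    print(result)  # per-category severity labels and scores, e.g. "violence", "violence_score"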

Prerequisites

  • Azure AI project credentials.
  • Python 3.9+.
  • Python environment with required libraries installed (azure-ai-evaluation, pandas, etc.).
  • Access to the Azure API endpoint.

If you did not create a virtual environment during the deployment, please follow the steps here:

1. Navigate to the workshop/docs/workshop folder in the terminal in your local repository.
2. In the terminal, run the following command:

  • Install the requirements
    Bash
    pip install -r requirements.txt
  • Open the .env file in the workshop/docs/workshop folder to validate that the variables were updated with the details of your solution.
  • Open the Content_safety_evaluation notebook.
  • Run the first cell to create a folder for the output file of the evaluations.
  • Run cells 2-4 to initialize your Azure AI project, the call streaming function, and the callback function (a sketch of this setup follows this list).
  • Cell 5 runs the Adversarial Scenario to generate adversarial questions, runs them against your AI solution, and writes the results to a local file; cell 6 formats the output of those results (see the simulation sketch after this list).
    • The Adversarial Scenario runs content safety evaluation tests on your AI solution.
  • Cells 7 and 8 initialize the model configuration and the Groundedness Evaluator (see the final sketch after this list). The groundedness measure assesses how well the claims in an AI-generated answer correspond to the source context, making sure that these claims are substantiated by the context.
    • Learn more about the groundedness evaluator here
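
The sketches below mirror the notebook cells at a high level. First, the kind of setup cells 2-4 perform, assuming azure-ai-evaluation v1.x; the environment variable names and the call_your_endpoint helper are placeholders for the values and streaming call defined in your notebook.

    Python
    # Sketch of the project configuration and simulator callback (cells 2-4).
    # Env variable names and call_your_endpoint are placeholders.
    import os
    from typing import Any, Dict, List, Optional

    azure_ai_project = {
        "subscription_id": os.environ["AZURE_SUBSCRIPTION_ID"],
        "resource_group_name": os.environ["AZURE_RESOURCE_GROUP"],
        "project_name": os.environ["AZURE_AI_PROJECT_NAME"],
    }

    def call_your_endpoint(query: str) -> str:
        # Placeholder: replace with the streaming call to your deployed AI solution.
        return f"Echo: {query}"

    async def callback(
        messages: Dict[str, List[Dict[str, Any]]],
        stream: bool = False,
        session_state: Any = None,
        context: Optional[Dict[str, Any]] = None,
    ) -> Dict[str, Any]:
        # Take the latest simulated user message, send it to your AI solution,
        # and return the conversation in the chat-protocol format the simulator expects.
        query = messages["messages"][-1]["content"]
        answer = call_your_endpoint(query)
        messages["messages"].append({"role": "assistant", "content": answer})
        return {
            "messages": messages["messages"],
            "stream": stream,
            "session_state": session_state,
            "context": context,
        }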
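
Next, a sketch of the adversarial simulation and the local file output (cells 5-6), reusing the project dictionary and callback from the previous sketch; the scenario, result count, and output path are illustrative, not the notebook's exact values.

    Python
    # Sketch of running an adversarial content safety scenario (cell 5) and
    # writing the results to a local file. Values are illustrative.
    import json
    from pathlib import Path

    from azure.identity import DefaultAzureCredential
    from azure.ai.evaluation.simulator import AdversarialScenario, AdversarialSimulator

    simulator = AdversarialSimulator(
        azure_ai_project=azure_ai_project,   # project dict from the previous sketch
        credential=DefaultAzureCredential(),
    )

    # Run inside a notebook cell (top-level await): the simulator generates adversarial
    # questions and sends each one to your solution through the callback defined above.
    outputs = await simulator(
        scenario=AdversarialScenario.ADVERSARIAL_QA,
        target=callback,
        max_simulation_results=4,
    )

    # Persist the simulated conversations so they can be formatted and evaluated later.
    output_path = Path("evaluation_outputs/adversarial_outputs.json")
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text(json.dumps(outputs, indent=2, default=str))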
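
Finally, a sketch of the model configuration and Groundedness Evaluator initialization (cells 7-8), scoring a single example; the endpoint, key, and deployment environment variables, as well as the sample query, context, and response, are placeholders for your own configuration.

    Python
    # Sketch of the model configuration and Groundedness Evaluator (cells 7-8).
    # Endpoint, key, and deployment values are placeholders.
    import os

    from azure.ai.evaluation import GroundednessEvaluator

    model_config = {
        "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
        "api_key": os.environ["AZURE_OPENAI_API_KEY"],
        "azure_deployment": os.environ["AZURE_OPENAI_DEPLOYMENT"],
    }

    groundedness = GroundednessEvaluator(model_config=model_config)

    # Score how well the claims in a response are supported by the supplied context.
    result = groundedness(
        query="What time does the museum open?",
        context="The museum opens at 9:00 AM and closes at 5:00 PM daily.",
        response="The museum opens at 9 AM every day.",
    )
    print(result)  # e.g. a 1-5 groundedness score plus a short explanation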