Azure AI Content Safety for Python: Setup, Usage & Best Practices
Azure AI Content Safety for Python provides automated harmful content detection across text and images with multi-category severity scoring. This skill enables Python applications to moderate user-generated and AI-generated content for hate speech, violence, sexual content, and self-harm without building custom ML models.
What This Skill Does
This SDK analyzes text strings and image files, returning per-category severity scores across four harm categories (0, 2, 4, 6 at the default granularity; 0-7 with 8-level text scoring). The service uses ML models trained on diverse datasets to detect patterns associated with each category: identity-based attacks (hate), physical harm references (violence), sexual content, and self-harm mentions. Results include a severity level for each category, enabling you to implement graduated moderation policies based on risk tolerance.
The SDK supports two primary workflows: content analysis via ContentSafetyClient and blocklist management via BlocklistClient. Content analysis is synchronous—send text or images, receive results immediately. Blocklists provide exact-match term filtering to supplement ML detection with domain-specific keywords, profanity, or brand protection terms. When blocklists match, the service returns the matched term and its ID, useful for audit logging and policy refinement.
For text, you can choose between 4-level (0, 2, 4, 6) or 8-level (0-7) severity granularity. Images only support 4-level scoring. The halt-on-blocklist-hit option lets you stop analysis immediately upon finding a blocked term, reducing latency when exact-match filtering is your primary concern.
Getting Started
Install the SDK via pip:
pip install azure-ai-contentsafety
Configure environment variables:
export CONTENT_SAFETY_ENDPOINT=https://<resource>.cognitiveservices.azure.com
export CONTENT_SAFETY_KEY=<your-api-key>
Create a client with API key authentication:
from azure.ai.contentsafety import ContentSafetyClient
from azure.core.credentials import AzureKeyCredential
import os
client = ContentSafetyClient(
    endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["CONTENT_SAFETY_KEY"])
)
For production, use DefaultAzureCredential for managed identity support.
Key Features
Four Harm Categories: Analyze content across hate, self-harm, sexual, and violence categories. Each returns independent severity scores, enabling category-specific moderation policies.
Multi-Level Severity Scoring: Text supports 4-level (default) or 8-level granularity. Higher granularity enables nuanced moderation—block high severity, flag medium for review, allow low.
Custom Blocklists: Create term lists for exact-match filtering. Useful for domain-specific profanity, brand names, or community-specific slang not covered by general ML models.
Image and Text Analysis: Analyze text strings or images (from files or URLs). Images support formats like JPEG, PNG, GIF via base64 encoding or public URLs.
Blocklist Match Details: When terms match, receive the blocklist name, item ID, and matched text for detailed audit trails.
Entra ID Support: Authenticate via Azure Active Directory using DefaultAzureCredential, enabling passwordless auth with managed identities.
Usage Examples
Basic Text Analysis: Detect harmful content with default settings:
from azure.ai.contentsafety.models import AnalyzeTextOptions, TextCategory
request = AnalyzeTextOptions(text="Text to analyze for harmful content")
response = client.analyze_text(request)
for category in [TextCategory.HATE, TextCategory.SELF_HARM,
                 TextCategory.SEXUAL, TextCategory.VIOLENCE]:
    result = next((r for r in response.categories_analysis
                   if r.category == category), None)
    if result:
        print(f"{category}: severity {result.severity}")
Image Analysis from File:
from azure.ai.contentsafety.models import AnalyzeImageOptions, ImageData
with open("image.jpg", "rb") as f:
    image_bytes = f.read()  # pass raw bytes; the SDK base64-encodes them during serialization
request = AnalyzeImageOptions(image=ImageData(content=image_bytes))
response = client.analyze_image(request)
for result in response.categories_analysis:
    if result.severity >= 4:
        print(f"WARNING: {result.category} severity {result.severity}")
Image Analysis from URL:
request = AnalyzeImageOptions(
    image=ImageData(blob_url="https://example.com/image.jpg"))
response = client.analyze_image(request)
Create Blocklist and Add Terms:
from azure.ai.contentsafety import BlocklistClient
from azure.ai.contentsafety.models import TextBlocklist, AddOrUpdateTextBlocklistItemsOptions, TextBlocklistItem
blocklist_client = BlocklistClient(
    endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["CONTENT_SAFETY_KEY"])
)
# Create blocklist
blocklist = TextBlocklist(
    blocklist_name="profanity-filter",
    description="Custom profanity terms"
)
blocklist_client.create_or_update_text_blocklist(
    blocklist_name="profanity-filter", options=blocklist)
# Add terms
items = AddOrUpdateTextBlocklistItemsOptions(blocklist_items=[
    TextBlocklistItem(text="badword1"),
    TextBlocklistItem(text="badword2")
])
blocklist_client.add_or_update_blocklist_items(
    blocklist_name="profanity-filter", options=items)
Analyze with Blocklist:
request = AnalyzeTextOptions(
    text="Text containing badword1",
    blocklist_names=["profanity-filter"],
    halt_on_blocklist_hit=True
)
response = client.analyze_text(request)
if response.blocklists_match:
    for match in response.blocklists_match:
        print(f"Blocked: {match.blocklist_item_text}")
Best Practices
Set Category-Specific Thresholds: Different content types tolerate different risk levels. A news platform might allow violence descriptions up to severity 4 while blocking hate speech at severity 2 and above; a gaming platform might accept higher violence severities but apply stricter limits to other categories.
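One way to encode per-category thresholds is a small decision function over the analysis results (the category names match the service's, but the cutoffs below are illustrative, not recommendations):

```python
# Per-category severity cutoffs: content at or above the cutoff is blocked.
# These numbers are illustrative; tune them to your platform's risk tolerance.
THRESHOLDS = {"Hate": 2, "Violence": 4, "Sexual": 4, "SelfHarm": 2}

def decide(categories_analysis):
    """Return ('block', offending_categories) or ('allow', []).

    `categories_analysis` is a list of dicts with `category` and `severity`,
    mirroring the shape of the analyze_text response.
    """
    flagged = [
        c["category"] for c in categories_analysis
        if c["severity"] >= THRESHOLDS.get(c["category"], 4)
    ]
    return ("block", flagged) if flagged else ("allow", [])

decision, flagged = decide([
    {"category": "Hate", "severity": 2},
    {"category": "Violence", "severity": 0},
])
# -> ("block", ["Hate"])
```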
Use 8-Level Scoring for Nuance: When implementing review queues, 8-level scoring provides gradations for human review (severity 3-5) versus automatic blocking (>=6) versus auto-approval (0-2).
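The triage bands above can be sketched as a small routing function (band boundaries are the ones from the tip, on the 8-level 0-7 scale):

```python
def triage(severity: int) -> str:
    """Route an 8-level severity score (0-7) to a moderation action."""
    if severity >= 6:
        return "block"    # high severity: automatic removal
    if severity >= 3:
        return "review"   # medium severity: human review queue
    return "approve"      # low severity: auto-approval

assert [triage(s) for s in range(8)] == [
    "approve", "approve", "approve",
    "review", "review", "review",
    "block", "block",
]
```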
Wait for Blocklist Propagation: Changes take ~5 minutes to become active. Don't test immediately after creating blocklists.
Cache Analysis Results: If moderating similar content repeatedly (templates, canned responses), cache results to reduce API calls and costs.
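A minimal in-process cache keyed on a hash of the normalized text (stdlib only; in production you would more likely use Redis or similar, and expire entries when models or blocklists change):

```python
import hashlib

_cache: dict = {}

def cache_key(text: str) -> str:
    # Normalize whitespace and case so trivially different copies share a key.
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def analyze_cached(text: str, analyze):
    """Call `analyze(text)` (e.g. a wrapper around analyze_text) at most
    once per distinct normalized text; return the cached result otherwise."""
    key = cache_key(text)
    if key not in _cache:
        _cache[key] = analyze(text)
    return _cache[key]
```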
Log All Moderation Decisions: Maintain audit trails with timestamps, user IDs, content snippets (hashed for privacy), categories flagged, and actions taken. Critical for compliance and appeals.
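A sketch of such an audit record with the content snippet stored only as a hash (the field names are illustrative; adapt them to your logging pipeline):

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(user_id: str, text: str, flagged: list, action: str) -> str:
    """Build one JSON audit line; the content itself is never stored in clear."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "content_sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "categories_flagged": flagged,
        "action": action,
    }
    return json.dumps(record, sort_keys=True)
```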
Implement Human Review Workflows: Don't auto-ban users on single moderate-severity flags. Queue for review when severity is borderline (4-5).
Test with Real User Content: ML models trained on general datasets may not capture your community's norms. Test with real examples and refine blocklists accordingly.
When to Use / When NOT to Use
Use this skill when:
- You're building Python applications requiring content moderation
- You need multi-category harm detection without custom ML models
- You're moderating user-generated content (comments, posts, images)
- You need graduated moderation policies based on severity
- Compliance requires automated harmful content filtering
- You want to supplement ML with custom blocklists
- You're building chatbots, social platforms, or UGC systems
Avoid this skill when:
- You're working in Java or .NET (use language-specific SDKs)
- You need sub-100ms real-time latency (API calls add overhead)
- Your content language isn't well-supported by Azure's models
- You require on-premise deployment
- You need context-aware moderation beyond pattern matching
- You're moderating non-text, non-image content (audio, video transcripts need preprocessing)
Related Skills
- azure-ai-contentsafety-java: Java version of this SDK
- azure-ai-textanalytics-py: Broader text analytics including sentiment analysis
- azure-ai-vision-imageanalysis-py: General image analysis and computer vision
Source
Maintained by Microsoft. View on GitHub