# Azure AI Content Understanding SDK for Python

Multimodal AI service that extracts semantic content from documents, video, audio, and image files for RAG and automated workflows.

## Installation

```bash
pip install azure-ai-contentunderstanding
```

## Environment Variables

```bash
CONTENTUNDERSTANDING_ENDPOINT=https://<resource>.cognitiveservices.azure.com/
```

## Authentication

```python
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.identity import DefaultAzureCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
credential = DefaultAzureCredential()
client = ContentUnderstandingClient(endpoint=endpoint, credential=credential)
```

## Core Workflow

Content Understanding operations are asynchronous long-running operations:

1. **Begin Analysis**: Start the analysis operation with `begin_analyze()` (returns a poller)
2. **Poll for Results**: Poll until analysis completes (the SDK handles this with `.result()`)
3. **Process Results**: Extract structured results from `AnalyzeResult.contents`

## Prebuilt Analyzers

| Analyzer | Content Type | Purpose |
|----------|--------------|---------|
| `prebuilt-documentSearch` | Documents | Extract markdown for RAG applications |
| `prebuilt-imageSearch` | Images | Extract content from images |
| `prebuilt-audioSearch` | Audio | Transcribe audio with timing |
| `prebuilt-videoSearch` | Video | Extract frames, transcripts, summaries |
| `prebuilt-invoice` | Documents | Extract invoice fields |

## Analyze Document

```python
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity import DefaultAzureCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
client = ContentUnderstandingClient(
    endpoint=endpoint,
    credential=DefaultAzureCredential(),
)
```

```python
# Analyze document from URL
poller = client.begin_analyze(
    analyzer_id="prebuilt-documentSearch",
    inputs=[AnalyzeInput(url="https://example.com/document.pdf")],
)
result = poller.result()
```
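Because `begin_analyze()` returns a poller and some analyses can run for minutes, it can help to bound the wait instead of blocking indefinitely on `.result()`. The following is a minimal sketch, not part of the SDK: `wait_for_result` and `FakePoller` are hypothetical names, and the sketch assumes the poller exposes `done()` and `result()` the way `azure.core.polling.LROPoller` does. The stand-in class exists only so the example runs without a service call.

```python
import time


def wait_for_result(poller, timeout_s=600.0, interval_s=5.0):
    """Return poller.result(), or raise TimeoutError if it is not done in time."""
    deadline = time.monotonic() + timeout_s
    while not poller.done():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"analysis not finished after {timeout_s} seconds")
        time.sleep(interval_s)
    return poller.result()


class FakePoller:
    """Hypothetical stand-in for the SDK poller; reports done after a few checks."""

    def __init__(self, checks_until_done, value):
        self._remaining = checks_until_done
        self._value = value

    def done(self):
        self._remaining -= 1
        return self._remaining < 0

    def result(self):
        return self._value


outcome = wait_for_result(
    FakePoller(2, "analysis finished"), timeout_s=5.0, interval_s=0.01
)
print(outcome)
```

With a real poller, the same helper would be called as `wait_for_result(poller, timeout_s=300)`, assuming the poller behaves as described above.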

```python
# Access markdown content (contents is a list)
content = result.contents[0]
print(content.markdown)
```

## Access Document Content Details

```python
from azure.ai.contentunderstanding.models import MediaContentKind, DocumentContent

content = result.contents[0]
if content.kind == MediaContentKind.DOCUMENT:
    document_content: DocumentContent = content  # type: ignore
    print(document_content.start_page_number)
```

## Analyze Image

```python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-imageSearch",
    inputs=[AnalyzeInput(url="https://example.com/image.jpg")],
)
result = poller.result()
content = result.contents[0]
print(content.markdown)
```

## Analyze Video

```python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-videoSearch",
    inputs=[AnalyzeInput(url="https://example.com/video.mp4")],
)
result = poller.result()
```

```python
# Access video content (AudioVisualContent)
content = result.contents[0]

# Get transcript phrases with timing
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time} - {phrase.end_time}]: {phrase.text}")
```
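Transcript phrases carry timing, so they can be turned into a readable, time-ordered transcript. The sketch below is illustrative only: `format_ms` and `transcript_lines` are hypothetical helpers, the `SimpleNamespace` objects stand in for SDK phrase objects, and the code assumes `start_time` and `end_time` are integer milliseconds, which this document does not state.

```python
from types import SimpleNamespace


def format_ms(ms):
    """Render a millisecond offset as MM:SS.mmm (assumes milliseconds)."""
    minutes, remainder = divmod(int(ms), 60_000)
    seconds, millis = divmod(remainder, 1000)
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"


def transcript_lines(phrases):
    """Return one '[start - end] text' line per phrase, ordered by start time."""
    ordered = sorted(phrases, key=lambda p: p.start_time)
    return [
        f"[{format_ms(p.start_time)} - {format_ms(p.end_time)}] {p.text}"
        for p in ordered
    ]


# Stand-in phrases; real ones would come from content.transcript_phrases.
sample = [
    SimpleNamespace(start_time=61_500, end_time=63_000, text="Second phrase."),
    SimpleNamespace(start_time=0, end_time=1_250, text="First phrase."),
]
lines = transcript_lines(sample)
print("\n".join(lines))
```

If the timing values turn out to use a different unit or type, only `format_ms` needs to change.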

```python
# Get key frames (for video)
for frame in content.key_frames:
    print(f"Frame at {frame.time}: {frame.description}")
```

## Analyze Audio

```python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-audioSearch",
    inputs=[AnalyzeInput(url="https://example.com/audio.mp3")],
)
result = poller.result()
```

```python
# Access audio transcript
content = result.contents[0]
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time}] {phrase.text}")
```

## Custom Analyzers

Create custom analyzers with field schemas for specialized extraction:

```python
# Create custom analyzer
analyzer = client.create_analyzer(
    analyzer_id="my-invoice-analyzer",
    analyzer={
        "description": "Custom invoice analyzer",
        "base_analyzer_id": "prebuilt-documentSearch",
        "field_schema": {
            "fields": {
                "vendor_name": {"type": "string"},
                "invoice_total": {"type": "number"},
                "line_items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "description": {"type": "string"},
                            "amount": {"type": "number"},
                        },
                    },
                },
            }
        },
    },
)
```
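A nested field schema like the one above is easy to get wrong by hand, so it can be useful to list the fields it declares before creating the analyzer. The following sketch walks the plain dictionary only and makes no service calls. `field_paths` is a hypothetical helper, and the handling of `"array"` and `"object"` assumes the schema layout shown in this document.

```python
def field_paths(fields, prefix=""):
    """Yield a path for every leaf field in a schema's "fields" mapping."""
    for name, spec in fields.items():
        path = f"{prefix}{name}"
        kind = spec.get("type")
        if kind == "array":
            item = spec.get("items", {})
            if item.get("type") == "object":
                # Descend into the properties of each array element.
                yield from field_paths(item.get("properties", {}), f"{path}[].")
            else:
                yield f"{path}[]"
        elif kind == "object":
            yield from field_paths(spec.get("properties", {}), f"{path}.")
        else:
            yield path


# Same schema shape as the custom invoice analyzer in this document.
field_schema = {
    "fields": {
        "vendor_name": {"type": "string"},
        "invoice_total": {"type": "number"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount": {"type": "number"},
                },
            },
        },
    }
}

paths = list(field_paths(field_schema["fields"]))
print(paths)
```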

```python
# Use custom analyzer
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="my-invoice-analyzer",
    inputs=[AnalyzeInput(url="https://example.com/invoice.pdf")],
)
result = poller.result()

# Access extracted fields (results are returned in the contents list)
fields = result.contents[0].fields
print(fields["vendor_name"])
print(fields["invoice_total"])
```

## Analyzer Management

```python
# List all analyzers
analyzers = client.list_analyzers()
for analyzer in analyzers:
    print(f"{analyzer.analyzer_id}: {analyzer.description}")
```
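When listing analyzers, it is often useful to separate prebuilt analyzers from custom ones, for example before deleting anything. The sketch below works on plain ID strings. `split_analyzer_ids` is a hypothetical helper, and it assumes that prebuilt analyzers can be recognized by the `prebuilt-` prefix seen in the Prebuilt Analyzers table, which the service may not guarantee.

```python
def split_analyzer_ids(analyzer_ids):
    """Split analyzer IDs into (prebuilt, custom) lists, each sorted by name."""
    prebuilt, custom = [], []
    for analyzer_id in analyzer_ids:
        # Assumption: prebuilt analyzers always use the "prebuilt-" prefix.
        target = prebuilt if analyzer_id.startswith("prebuilt-") else custom
        target.append(analyzer_id)
    return sorted(prebuilt), sorted(custom)


# Stand-in IDs; real ones would come from the analyzer_id of each listed analyzer.
ids = ["my-invoice-analyzer", "prebuilt-videoSearch", "prebuilt-documentSearch"]
prebuilt, custom = split_analyzer_ids(ids)
print("prebuilt:", prebuilt)
print("custom:", custom)
```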

```python
# Get specific analyzer
analyzer = client.get_analyzer("prebuilt-documentSearch")

# Delete custom analyzer
client.delete_analyzer("my-custom-analyzer")
```

## Async Client

```python
import asyncio
import os
from azure.ai.contentunderstanding.aio import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity.aio import DefaultAzureCredential


async def analyze_document():
    endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
    credential = DefaultAzureCredential()

    async with ContentUnderstandingClient(
        endpoint=endpoint,
        credential=credential,
    ) as client:
        poller = await client.begin_analyze(
            analyzer_id="prebuilt-documentSearch",
            inputs=[AnalyzeInput(url="https://example.com/doc.pdf")],
        )
        result = await poller.result()
        content = result.contents[0]
        return content.markdown


asyncio.run(analyze_document())
```

## Content Types

| Class | For | Provides |
|-------|-----|----------|
| `DocumentContent` | PDF, images, Office docs | Pages, tables, figures, paragraphs |
| `AudioVisualContent` | Audio, video files | Transcript phrases, timing, key frames |

Both derive from `MediaContent`, which provides basic info and a markdown representation.

## Model Imports

```python
from azure.ai.contentunderstanding.models import (
    AnalyzeInput,
    AnalyzeResult,
    MediaContentKind,
    DocumentContent,
    AudioVisualContent,
)
```

## Client Types

| Client | Purpose |
|--------|---------|
| `ContentUnderstandingClient` | Sync client for all operations |
| `ContentUnderstandingClient` (aio) | Async client for all operations |

## Best Practices

1. **Use `begin_analyze` with `AnalyzeInput`**: this is the correct method signature
2. **Access results via `result.contents[0]`**: results are returned as a list
3. **Use prebuilt analyzers** for common scenarios (document/image/audio/video search)
4. **Create custom analyzers** only for domain-specific field extraction
5. **Use the async client** for high-throughput scenarios, with `azure.identity.aio` credentials
6. **Handle long-running operations**: video and audio analysis can take minutes
7. **Use URL sources** when possible to avoid upload overhead

## When to Use

Use this skill to carry out the workflow or actions described in the overview.
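The high-throughput advice under Best Practices usually means running several analyses at once while capping how many are in flight. The following is a rough sketch of that pattern using only `asyncio`. `analyze_all` and `fake_analyze` are hypothetical names, and `fake_analyze` is a stand-in for an async function that would call `begin_analyze` and await the result on the async client.

```python
import asyncio


async def analyze_all(urls, analyze_one, max_concurrency=4):
    """Run analyze_one(url) for every URL with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(url):
        async with semaphore:
            return await analyze_one(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(url) for url in urls))


async def fake_analyze(url):
    """Hypothetical stand-in for an analysis call on the async client."""
    await asyncio.sleep(0.01)
    return f"markdown for {url}"


urls = [f"https://example.com/doc{i}.pdf" for i in range(5)]
results = asyncio.run(analyze_all(urls, fake_analyze, max_concurrency=2))
print(results[0])
```

A suitable `max_concurrency` depends on the resource's rate limits, which this document does not specify.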
