Name: alicloud-ai-audio-asr
Availability: InStock

Category: provider Model Studio Qwen ASR (Non-Realtime) Validation mkdir -p output/alicloud-ai-audio-asr python -m py_compile skills/ai/audio/alicloud-ai-audio-asr/scripts/transcribe_audio.py && echo "py_compile_ok"

output/alicloud-ai-audio-asr/validate.txt Pass criteria: command exits 0 and output/alicloud-ai-audio-asr/validate.txt is generated. Output And Evidence Store transcripts and API responses under output/alicloud-ai-audio-asr/ . Keep one command log or sample response per run. Use Qwen ASR for recorded audio transcription (non-realtime), including short audio sync calls and long audio async jobs. Critical model names Use one of these exact model strings: qwen3-asr-flash qwen-audio-asr qwen3-asr-flash-filetrans Selection guidance: Use qwen3-asr-flash or qwen-audio-asr for short/normal recordings (sync). Use qwen3-asr-flash-filetrans for long-file transcription (async task workflow). Prerequisites Install SDK dependencies (script uses Python stdlib only): python3 -m venv .venv . .venv/bin/activate Set DASHSCOPE_API_KEY in environment, or add dashscope_api_key to ~/.alibabacloud/credentials . Normalized interface (asr.transcribe) Request audio (string, required): public URL or local file path. model (string, optional): default qwen3-asr-flash . language_hints (array, optional): e.g. zh , en . sample_rate (number, optional) vocabulary_id (string, optional) disfluency_removal_enabled (bool, optional) timestamp_granularities (array, optional): e.g. sentence . async (bool, optional): default false for sync models, true for qwen3-asr-flash-filetrans . Response text (string): normalized transcript text. task_id (string, optional): present for async submission. status (string): SUCCEEDED or submission status. raw (object): original API response. Quick start (official HTTP API) Sync transcription (OpenAI-compatible protocol): curl -sS --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \ --header "Authorization: Bearer $DASHSCOPE_API_KEY " \ --header 'Content-Type: application/json' \ --data '{ "model": "qwen3-asr-flash", "messages": [ { "role": "user", "content": [ { "type": "input_audio", "input_audio": { "data": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" } } ] } ], "stream": false, "asr_options": { "enable_itn": false } }' Async long-file transcription (DashScope protocol): curl -sS --location 'https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription' \ --header "Authorization: Bearer $DASHSCOPE_API_KEY " \ --header 'X-DashScope-Async: enable' \ --header 'Content-Type: application/json' \ --data '{ "model": "qwen3-asr-flash-filetrans", "input": { "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" } }' Poll task result: curl -sS --location "https://dashscope.aliyuncs.com/api/v1/tasks/" \ --header "Authorization: Bearer $DASHSCOPE_API_KEY " Local helper script Use the bundled script for URL/local-file input and optional async polling: python skills/ai/audio/alicloud-ai-audio-asr/scripts/transcribe_audio.py \ --audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \ --model qwen3-asr-flash \ --language-hints zh,en \ --print-response Long-file mode: python skills/ai/audio/alicloud-ai-audio-asr/scripts/transcribe_audio.py \ --audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \ --model qwen3-asr-flash-filetrans \ --async \ --wait Operational guidance For local files, use input_audio.data (data URI) when direct URL is unavailable. Keep language_hints minimal to reduce recognition ambiguity. For async tasks, use 5-20s polling interval with max retry guard. Save normalized outputs under output/alicloud-ai-audio-asr/transcripts/ . Output location Default output: output/alicloud-ai-audio-asr/transcripts/ Override base dir with OUTPUT_DIR . Workflow Confirm user intent, region, identifiers, and whether the operation is read-only or mutating. Run one minimal read-only query first to verify connectivity and permissions. Execute the target operation with explicit parameters and bounded scope. Verify results and save output/evidence files. References references/api_reference.md references/sources.md Realtime synthesis is provided by skills/ai/audio/alicloud-ai-audio-tts-realtime/ .

安装