← 返回
未分类 Key 中文

Aliyun Qwen Asr

Use when transcribing non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`)....
Use when transcribing non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`)....
cinience cinience 来源
未分类 clawhub v1.0.0 1 版本 100000 Key: 需要
★ 0
Stars
📥 308
下载
💾 0
安装
1
版本
#latest

概述

Category: provider

Model Studio Qwen ASR (Non-Realtime)

Validation

mkdir -p output/aliyun-qwen-asr
python -m py_compile skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py && echo "py_compile_ok" > output/aliyun-qwen-asr/validate.txt

Pass criteria: command exits 0 and output/aliyun-qwen-asr/validate.txt is generated.

Output And Evidence

  • Store transcripts and API responses under output/aliyun-qwen-asr/.
  • Keep one command log or sample response per run.

Use Qwen ASR for recorded audio transcription (non-realtime), including short audio sync calls and long audio async jobs.

Critical model names

Use one of these exact model strings:

  • qwen3-asr-flash
  • qwen3-asr-flash-2026-02-10
  • qwen-audio-asr
  • qwen3-asr-flash-filetrans
  • qwen3-asr-flash-filetrans-2025-11-17

Selection guidance:

  • Use qwen3-asr-flash, qwen3-asr-flash-2026-02-10, or qwen-audio-asr for short/normal recordings (sync).
  • Use qwen3-asr-flash-filetrans or qwen3-asr-flash-filetrans-2025-11-17 for long-file transcription (async task workflow).

Prerequisites

  • Install SDK dependencies (script uses Python stdlib only):
python3 -m venv .venv
. .venv/bin/activate
  • Set DASHSCOPE_API_KEY in environment, or add dashscope_api_key to ~/.alibabacloud/credentials.

Normalized interface (asr.transcribe)

Request

  • audio (string, required): public URL or local file path.
  • model (string, optional): default qwen3-asr-flash.
  • language_hints (array, optional): e.g. zh, en.
  • sample_rate (number, optional)
  • vocabulary_id (string, optional)
  • disfluency_removal_enabled (bool, optional)
  • timestamp_granularities (array, optional): e.g. sentence.
  • async (bool, optional): default false for sync models, true for qwen3-asr-flash-filetrans.

Response

  • text (string): normalized transcript text.
  • task_id (string, optional): present for async submission.
  • status (string): SUCCEEDED or submission status.
  • raw (object): original API response.

Quick start (official HTTP API)

Sync transcription (OpenAI-compatible protocol):

curl -sS --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen3-asr-flash",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_audio",
            "input_audio": {
              "data": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
            }
          }
        ]
      }
    ],
    "stream": false,
    "asr_options": {
      "enable_itn": false
    }
  }'

Async long-file transcription (DashScope protocol):

curl -sS --location 'https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'X-DashScope-Async: enable' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen3-asr-flash-filetrans",
    "input": {
      "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
    }
  }'

Poll task result:

curl -sS --location "https://dashscope.aliyuncs.com/api/v1/tasks/<task_id>" \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY"

Local helper script

Use the bundled script for URL/local-file input and optional async polling:

python skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py \
  --audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
  --model qwen3-asr-flash \
  --language-hints zh,en \
  --print-response

Long-file mode:

python skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py \
  --audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
  --model qwen3-asr-flash-filetrans \
  --async \
  --wait

Operational guidance

  • For local files, use input_audio.data (data URI) when direct URL is unavailable.
  • Keep language_hints minimal to reduce recognition ambiguity.
  • For async tasks, use 5-20s polling interval with max retry guard.
  • Save normalized outputs under output/aliyun-qwen-asr/transcripts/.

Output location

  • Default output: output/aliyun-qwen-asr/transcripts/
  • Override base dir with OUTPUT_DIR.

Workflow

1) Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.

2) Run one minimal read-only query first to verify connectivity and permissions.

3) Execute the target operation with explicit parameters and bounded scope.

4) Verify results and save output/evidence files.

References

  • references/api_reference.md
  • references/sources.md
  • Realtime synthesis is provided by skills/ai/audio/aliyun-qwen-tts-realtime/.

版本历史

共 1 个版本

  • v1.0.0 当前
    2026-05-07 15:10 安全 安全

安全检测

腾讯云安全 (Keen)

安全,无风险
查看报告

腾讯云安全 (Sanbu)

安全,无风险
查看报告

🔗 相关推荐

it-ops-security

Alicloud Ai Content Aimiaobi

cinience
使用OpenAPI/SDK管理阿里云全秒(AIMiaoBi),在用户请求阿里云秒币内容操作(如列出资源)时使用。
★ 0 📥 1,942
design-media

UI/UX Pro Max

xobi667
提供 UI/UX 设计智能与实现指导,帮助打造精美界面。适用于 UI 设计、UX 流程、信息架构、视觉风格、设计系统/标记、组件规格、文案/微文案、无障碍及前端 UI(HTML/CSS/JS、React、Next.js、Vue、Svelte
★ 216 📥 47,338
design-media

Nano Banana Pro

steipete
使用 Nano Banana Pro (Gemini 3 Pro Image) 生成或编辑图像。支持文生图、图生图及 1K/2K/4K 分辨率,适用于图像创建、修改及编辑请求,使用 --input-image 指定输入图像。
★ 429 📥 116,784