Transcribe audio files using the UniSound (云知声) ASR service. Supports multiple audio formats and suits finance, customer service, and similar scenarios.
Use UniSound ASR for:
Do NOT use for:
Use when: The user needs to transcribe recorded audio files, or asks for UniSound/云知声 audio file transcription.
Do not use for: real-time speech recognition, text-to-speech (TTS), or live captioning.
Install Python dependencies before using this skill. From the skill directory (skills/asr-file-transfer-tools):
pip install -r scripts/requirements.txt
Requires Python 3.8+.
⛔ MANDATORY RESTRICTIONS - DO NOT VIOLATE ⛔
python3 scripts/transcribe.py
If the script execution fails (API not configured, network error, etc.):
```bash
python3 scripts/transcribe.py /path/to/audio.wav
```
Command options:
- --format FORMAT - Audio format (wav, mp3, m4a, flac, ogg)
- --domain DOMAIN - Recognition domain (finance, customer_service, other)
- --out FILE - Save output to file instead of stdout
- --json - Output JSON format with full result
- --userid ID - Custom user ID

- --out: Transcript saved to specified file
- --json: Full JSON result with metadata

Text Format:
JSON Format:
Example 1: Quick Transcription
python3 scripts/transcribe.py recording.wav
Output: Transcript text printed to console
Example 2: Save to File
python3 scripts/transcribe.py interview.mp3 --format mp3 --out transcript.txt
Output: Transcript saved to transcript.txt
Example 3: JSON Output with Metadata
python3 scripts/transcribe.py audio.m4a --json --out result.json
Output: Complete JSON result with timestamps and confidence scores
Example 4: Domain-Specific Transcription
python3 scripts/transcribe.py financial_call.wav --domain finance
Output: Transcript optimized for financial terminology
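The four examples above can also be driven from Python. The sketch below assumes only the flags listed under Command options; `build_command` and `transcribe` are hypothetical helper names, not part of the skill, and the script path is assumed to be relative to the skill directory.

```python
import subprocess
import sys
from typing import List, Optional


def build_command(audio_path: str,
                  fmt: Optional[str] = None,
                  domain: Optional[str] = None,
                  out: Optional[str] = None,
                  as_json: bool = False,
                  script: str = "scripts/transcribe.py") -> List[str]:
    """Assemble the argv list for transcribe.py from the documented flags."""
    cmd = [sys.executable, script, audio_path]
    if fmt:
        cmd += ["--format", fmt]
    if domain:
        cmd += ["--domain", domain]
    if out:
        cmd += ["--out", out]
    if as_json:
        cmd.append("--json")
    return cmd


def transcribe(audio_path: str, **options) -> str:
    """Run the script and return whatever it printed to stdout."""
    result = subprocess.run(build_command(audio_path, **options),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.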
The script uses the UniSound ASR API with the following workflow:
> Privacy: Audio files are uploaded directly to UniSound servers. No data is sent to third-party services.
Supported file types:
Limits:
Use the --format flag to specify the format if auto-detection fails:
python3 scripts/transcribe.py audio.mp3 --format mp3
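When auto-detection fails, the value for --format can usually be derived from the file extension. A minimal sketch, assuming the five formats listed under Command options; `guess_format` is a hypothetical helper, not something the script provides.

```python
from pathlib import Path

# Formats listed for the --format flag in this document.
SUPPORTED_FORMATS = {"wav", "mp3", "m4a", "flac", "ogg"}


def guess_format(audio_path: str) -> str:
    """Return the --format value implied by the file extension."""
    ext = Path(audio_path).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported audio format: {ext or '(none)'}")
    return ext
```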
When API is not configured:
The error will show:
CONFIG_ERROR: UNISOUND_APPKEY or UNISOUND_SECRET not configured.
To use this skill, you need API credentials from UniSound (云知声):
Contact UniSound to obtain your API credentials
You will receive:
For testing and evaluation only:
AppKey: 681e01d78d8a40e8928bc8268020639b
Secret: d7b2980cb61843d69fdab5e99deafcdf
UserId: unisound-python-demo
Base URL: http://af-asr.uat.hivoice.cn
> ⚠️ Important Security Notice
>
> - Test environment only: these credentials are for UAT testing only
> - No sensitive data: never use with production or sensitive audio files
> - Get your own credentials: for production use, contact UniSound
> - Data privacy: audio files are uploaded to UniSound servers
Guide the user to configure securely:
.env file
Required environment variables:
| Variable | Required | Description | Default |
|---|---|---|---|
| UNISOUND_APPKEY | Yes | Application key | - |
| UNISOUND_SECRET | Yes | Secret key | - |
| UNISOUND_USERID | No | User identifier | unisound-python-demo |
| UNISOUND_BASE_URL | No | API base URL | http://af-asr.uat.hivoice.cn |
| UNISOUND_DOMAIN | No | Recognition domain | other |
| UNISOUND_AUDIOTYPE | No | Default audio format | wav |
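The variables above can be read as a small configuration loader. The following sketch assumes the settings are resolved from the environment with the listed defaults; `load_config` is a hypothetical name, and the real script may resolve them differently.

```python
import os
from typing import Dict, Mapping, Optional

# Optional variables and the defaults listed in this document.
DEFAULTS = {
    "UNISOUND_USERID": "unisound-python-demo",
    "UNISOUND_BASE_URL": "http://af-asr.uat.hivoice.cn",
    "UNISOUND_DOMAIN": "other",
    "UNISOUND_AUDIOTYPE": "wav",
}
REQUIRED = ("UNISOUND_APPKEY", "UNISOUND_SECRET")


def load_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the effective settings, failing early if a required one is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("CONFIG_ERROR: " + " or ".join(missing) + " not configured.")
    config = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        config[name] = env.get(name) or default  # empty values fall back to the default
    return config
```

Accepting a mapping argument keeps the function easy to test without touching the real environment.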
Configuration examples:
Linux/macOS:
export UNISOUND_APPKEY="681e01d78d8a40e8928bc8268020639b"
export UNISOUND_SECRET="d7b2980cb61843d69fdab5e99deafcdf"
export UNISOUND_USERID="unisound-python-demo"
Windows (PowerShell):
$env:UNISOUND_APPKEY="681e01d78d8a40e8928bc8268020639b"
$env:UNISOUND_SECRET="d7b2980cb61843d69fdab5e99deafcdf"
$env:UNISOUND_USERID="unisound-python-demo"
Using .env file (Recommended):
UNISOUND_APPKEY=681e01d78d8a40e8928bc8268020639b
UNISOUND_SECRET=d7b2980cb61843d69fdab5e99deafcdf
UNISOUND_USERID=unisound-python-demo
> Security Note: Never commit .env files or actual credentials to version control.
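For readers who want to see what loading a .env file involves, here is a stdlib-only sketch. It assumes simple KEY=VALUE lines like those in the .env example; `load_dotenv_file` is a hypothetical helper, and this document does not state whether the script reads .env this way.

```python
import os
from pathlib import Path


def load_dotenv_file(path: str = ".env") -> dict:
    """Parse KEY=VALUE lines and export them without overriding existing variables."""
    loaded = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip('"').strip("'")
        loaded[key] = value
        os.environ.setdefault(key, value)  # a variable already set in the shell wins
    return loaded
```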
Authentication failed:
API returned error: 401
→ AppKey or Secret is invalid; reconfigure with the correct credentials
Network error:
Connection timeout
→ Check network connectivity to the UniSound API
Audio file not found:
错误: 音频文件不存在 (Error: audio file does not exist)
→ Check the file path; use an absolute path if needed
Transcription timeout:
转写超时 (Transcription timed out)
→ Transcription is taking longer than expected (the server may be busy)
→ Try again later
→ Check whether the audio file is too large
Unsupported audio format:
Unsupported audio format
→ The audio format is not supported by the API
→ Convert to a supported format (WAV recommended)
→ Use the --format flag to specify the format explicitly
# Convert using ffmpeg
ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav
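The same ffmpeg conversion can be scripted when many files need it. A sketch, assuming ffmpeg is installed and on PATH; `ffmpeg_command` and `convert_to_wav` are hypothetical helpers that reuse the -ar and -ac values from the command above.

```python
import subprocess
from pathlib import Path
from typing import List


def ffmpeg_command(src: str, sample_rate: int = 16000, channels: int = 1) -> List[str]:
    """Build the ffmpeg argv that writes <src stem>.wav next to the input."""
    dst = str(Path(src).with_suffix(".wav"))
    return ["ffmpeg", "-i", src, "-ar", str(sample_rate), "-ac", str(channels), dst]


def convert_to_wav(src: str) -> str:
    """Run the conversion and return the path of the new WAV file."""
    cmd = ffmpeg_command(src)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if ffmpeg fails
    return cmd[-1]
```

Note that an input already named .wav would map to itself, so this sketch is only meant for non-WAV sources.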
API quota exceeded:
API returned error: 429
→ Too many requests; wait before retrying
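Waiting before a retry is commonly done with exponential backoff. A sketch under assumptions: `RateLimitError` and `call_with_backoff` are hypothetical names, and how the script surfaces a 429 to callers is not specified in this document.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical exception standing in for an 'API returned error: 429' failure."""


def call_with_backoff(func: Callable[[], T],
                      max_attempts: int = 4,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, doubling the wait after each rate-limit failure."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the delays can be replaced in tests; in normal use the default `time.sleep` applies.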
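- Requires a network connection to the UniSound ASR API
- Cloud processing: audio files are uploaded to UniSound servers
- File limits: maximum duration 2 hours, maximum size 100MB
- Domain optimization: use an appropriate --domain for better accuracy
- Test credentials: the UAT environment credentials are for testing only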
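The size and duration limits can be checked locally before uploading. A minimal sketch, assuming the 100MB and 2 hour limits stated above; `check_limits` is a hypothetical helper, and the duration check covers only WAV files because it relies on the standard-library wave module.

```python
import os
import wave

MAX_BYTES = 100 * 1024 * 1024   # 100MB, assuming binary megabytes
MAX_SECONDS = 2 * 60 * 60       # 2 hours


def check_limits(audio_path: str,
                 max_bytes: int = MAX_BYTES,
                 max_seconds: float = MAX_SECONDS) -> None:
    """Raise ValueError if the file exceeds the documented limits."""
    size = os.path.getsize(audio_path)
    if size > max_bytes:
        raise ValueError(f"File is {size} bytes, over the {max_bytes}-byte limit")
    if audio_path.lower().endswith(".wav"):
        with wave.open(audio_path, "rb") as wav:
            seconds = wav.getnframes() / wav.getframerate()
        if seconds > max_seconds:
            raise ValueError(f"Audio is {seconds:.0f}s long, over the {max_seconds}s limit")
```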
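For production deployment:
- Obtain your own credentials from UniSound
- Use environment variables: never embed production credentials in scripts or configuration files
- Review the privacy policy: audio files are uploaded to UniSound servers, so check its privacy policy
- Test with non-sensitive data first: always start with non-sensitive audio files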
Issue: Script fails with import error
→ Ensure dependencies are installed: pip install -r scripts/requirements.txt
→ Ensure you are using Python 3.8 or later
Issue: Cannot connect to API server
→ Check network connectivity
→ Verify that the API endpoint URL is correct
→ Try a different network
Issue: Poor transcription quality
→ Check audio quality (background noise, clarity)
→ Try an appropriate --domain parameter
→ Ensure the audio format is correct
If you encounter issues not covered here:
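- Check the UniSound ASR documentation for the latest API changes
- Verify network connectivity to the API server
- Check the error message details for specific error codes
- Ensure you are using Python 3.8 or later
# Check Python version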
python3 --version
> Note: API capabilities and supported formats are determined by your UniSound ASR API service configuration.