Text-to-speech conversion using UniSound's TTS WebSocket API, generating high-quality Mandarin Chinese audio from text.
Use UniSound TTS for:
Do NOT use for:
Use when: The user needs text-to-speech conversion, asks for "语音合成" (speech synthesis), or mentions UniSound/云知声 TTS.
Install Python dependencies before using this skill. From the skill directory (skills/tts-tools):
pip install websocket-client
Requires Python 3.6+.
⛔ MANDATORY RESTRICTIONS - DO NOT VIOLATE ⛔
python scripts/tts.py
If the script execution fails (API not configured, network error, etc.):
```bash
export UNISOUND_APPKEY='ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3'
export UNISOUND_SECRET='5c12231cd279b35873a3ccecf9439118'
```
```bash
python scripts/tts.py --text '今天天气怎么样'
```
Command options:
- --text TEXT - Text to convert to speech (default: '今天天气怎么样?')
- --voice VOICE - Voice name (default: xiaofeng-base)
- --format FORMAT - Output format: mp3, wav, pcm (default: mp3)
- --sample RATE - Sample rate: 8k, 16k, 24k (default: 24k)
- --speed SPEED - Speech speed 0-100 (default: 50)
- --volume VOLUME - Volume level 0-100 (default: 50)
- --pitch PITCH - Pitch level 0-100 (default: 50)
- --bright BRIGHT - Brightness/tone 0-100 (default: 50)
- --appkey APPKEY - Override appkey (default: UNISOUND_APPKEY env var)
- --secret SECRET - Override secret (default: UNISOUND_SECRET env var)
Output is saved to the results/ directory, for example results/1234567890.mp3
Audio Format Options:
Sample Rates:
Example 1: Quick Start with Test Credentials
# Set test credentials
export UNISOUND_APPKEY='ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3'
export UNISOUND_SECRET='5c12231cd279b35873a3ccecf9439118'
# Convert text to speech
python scripts/tts.py --text '你好世界'
Output: results/1234567890.mp3
Example 2: Custom Voice and Format
python scripts/tts.py --text '今天天气怎么样' --voice xiaofeng-base --format wav
Output: High-quality WAV file with male voice
Example 3: Adjusted Speech Parameters
python scripts/tts.py --text '快速朗读' --speed 70 --volume 60 --pitch 50
Output: Faster speech with increased volume
Example 4: High-Quality Audio Production
python scripts/tts.py --text '高质量音频' --format wav --sample 24k --volume 60
Output: Production-quality WAV file at 24kHz
Example 5: Command-line Credential Override
python scripts/tts.py \
--text '测试' \
--appkey 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3' \
--secret '5c12231cd279b35873a3ccecf9439118'
The script uses the UniSound TTS WebSocket API with the following workflow:
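1. Authenticate with a SHA256 signature
2. Open a WebSocket connection to the UniSound TTS service at wss://ws-stts.hivoice.cn/v1/tts
3. Send a TTS request containing the text and voice parameters
4. Receive the streaming audio data as binary chunks
5. Save the audio file to the results directory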
| Voice | Type | Description |
|---|---|---|
| xiaofeng-base | Male | Standard male voice, clear and natural |
| xiaoyan | Female | Female voice options |
| xiaomei | Female | Alternative female voice |
| Custom voices | Various | Contact UniSound for more options |
| Parameter | Range | Default | Description |
|---|---|---|---|
| speed | 0-100 | 50 | Speech speed (50 = normal, higher = faster) |
| volume | 0-100 | 50 | Volume level (50 = normal, higher = louder) |
| pitch | 0-100 | 50 | Pitch level (50 = normal, higher = higher) |
| bright | 0-100 | 50 | Brightness/tone (50 = normal) |
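
The ranges in the table above can be checked before a request is sent. The following is a minimal sketch, not the script's actual validation code: the function name `validate_speech_params` is hypothetical, and the error text is modeled on the "speed must be between 0 and 100" message quoted elsewhere in this document.

```python
def validate_speech_params(speed=50, volume=50, pitch=50, bright=50):
    """Return the parameters as a dict; raise ValueError if any is outside 0-100."""
    params = {"speed": speed, "volume": volume, "pitch": pitch, "bright": bright}
    for name, value in params.items():
        # Every speech parameter shares the same documented range of 0-100.
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    return params
```

Calling `validate_speech_params(speed=150)` raises a `ValueError`, while omitted arguments fall back to the documented default of 50.
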
Recommended settings:
When credentials are not configured:
The script will show:
Error: AppKey and Secret are required!
Set them via --appkey/--secret arguments or UNISOUND_APPKEY/UNISOUND_SECRET environment variables.
For testing and evaluation, use these credentials:
export UNISOUND_APPKEY='ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3'
export UNISOUND_SECRET='5c12231cd279b35873a3ccecf9439118'
> ⚠️ Important Security Notice
>
> - Test credentials only: these are for testing and evaluation purposes
> - No sensitive data: never use them with production or sensitive content
> - Get your own credentials: for production use, contact UniSound
> - Data privacy: text is sent to UniSound servers for processing
For production use, obtain API credentials from UniSound (云知声):
Contact UniSound to obtain your API credentials
Visit: https://www.unisound.com/
You will receive:
Method 1: Environment Variables (Recommended)
Linux/macOS:
export UNISOUND_APPKEY='ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3'
export UNISOUND_SECRET='5c12231cd279b35873a3ccecf9439118'
python scripts/tts.py --text '你好'
Windows (PowerShell):
$env:UNISOUND_APPKEY='ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3'
$env:UNISOUND_SECRET='5c12231cd279b35873a3ccecf9439118'
python scripts/tts.py --text '你好'
Windows (CMD):
set UNISOUND_APPKEY=ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3
set UNISOUND_SECRET=5c12231cd279b35873a3ccecf9439118
python scripts/tts.py --text '你好'
Method 2: .env File (Recommended for Development)
Create a .env file in the project root:
UNISOUND_APPKEY=ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3
UNISOUND_SECRET=5c12231cd279b35873a3ccecf9439118
Then use with python-dotenv or load in your shell.
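
If python-dotenv is not installed, a small standard-library loader can fill the same role. This is a sketch under simple assumptions (one `KEY=VALUE` pair per line, `#` comments, optional surrounding quotes); the names `parse_env_lines` and `load_env_file` are hypothetical and are not part of scripts/tts.py.

```python
import os


def parse_env_lines(lines):
    """Parse KEY=VALUE lines into a dict, skipping blanks and # comments."""
    values = {}
    for raw_line in lines:
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of optional quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_env_file(path=".env", environ=None):
    """Copy values from the file into environ without overriding existing keys."""
    environ = os.environ if environ is None else environ
    with open(path, encoding="utf-8") as handle:
        values = parse_env_lines(handle)
    for key, value in values.items():
        environ.setdefault(key, value)
    return values
```

Calling `load_env_file()` before importing or running the TTS code makes the two variables visible to `os.getenv`. Variables already set in the shell take precedence over the file.
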
> Security Note: Never commit .env files or production credentials to version control.
Method 3: Command-Line Arguments
python scripts/tts.py \
--text '你好世界' \
--appkey 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3' \
--secret '5c12231cd279b35873a3ccecf9439118'
| Variable | Required | Description |
|---|---|---|
| UNISOUND_APPKEY | Yes | Application key |
| UNISOUND_SECRET | Yes | Secret key |
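
The precedence described in this document (command-line values override the environment variables, and missing values produce an error) can be sketched as follows. The function `resolve_credentials` is a hypothetical helper, not the script's own code; only the variable names and the error text come from this document.

```python
import os


def resolve_credentials(appkey=None, secret=None, environ=None):
    """Prefer explicit arguments, then fall back to the environment variables."""
    environ = os.environ if environ is None else environ
    appkey = appkey or environ.get("UNISOUND_APPKEY")
    secret = secret or environ.get("UNISOUND_SECRET")
    if not appkey or not secret:
        raise ValueError(
            "AppKey and Secret are required! Set them via --appkey/--secret "
            "arguments or UNISOUND_APPKEY/UNISOUND_SECRET environment variables."
        )
    return appkey, secret
```

Unlike the examples that pass the test credentials as `os.getenv` defaults, this sketch fails loudly when nothing is configured, which may be preferable in production code.
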
Basic Python API:
import os
from scripts.tts import Ws_parms, do_ws, write_results
# Get credentials from environment variables
appkey = os.getenv('UNISOUND_APPKEY', 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3')
secret = os.getenv('UNISOUND_SECRET', '5c12231cd279b35873a3ccecf9439118')
# Configure TTS parameters
ws_parms = Ws_parms(
    url='wss://ws-stts.hivoice.cn/v1/tts',
    appkey=appkey,
    secret=secret,
    pid=1,
    vcn='xiaofeng-base',
    text='你好,欢迎使用云知声语音合成服务!',
    tts_format='mp3',
    tts_sample='24k',
    user_id='my-app',
)
# Execute TTS conversion
do_ws(ws_parms)
# Save result to file
write_results(ws_parms)
print('Audio saved to results/ directory!')
Authentication failed:
Error: AppKey and Secret are required!
→ Credentials were not provided
→ Set the UNISOUND_APPKEY and UNISOUND_SECRET environment variables
WebSocket connection error:
WebSocket error: ...
→ Check network connectivity to the UniSound API
→ Verify that the API endpoint URL is correct
→ Check whether a firewall is blocking WebSocket connections
No audio data received:
Error: No audio data received
→ The text may be empty or contain invalid characters
→ Check that the text parameter is not empty
→ Verify that the text is UTF-8 encoded
→ The credentials may be invalid
Invalid speech parameter:
Error: speed must be between 0 and 100, got 150
→ Speech parameters must be between 0 and 100
WebSocket connection timeout:
WebSocket error: timeout
→ There may be a network connection issue
→ The API service may be temporarily unavailable
→ Check your internet connection
import os
from scripts.tts import Ws_parms, do_ws, write_results
appkey = os.getenv('UNISOUND_APPKEY', 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3')
secret = os.getenv('UNISOUND_SECRET', '5c12231cd279b35873a3ccecf9439118')
ws_parms = Ws_parms(
    url='wss://ws-stts.hivoice.cn/v1/tts',
    appkey=appkey,
    secret=secret,
    pid=1,
    vcn='xiaofeng-base',
    text='这是自定义参数的语音合成示例',
    tts_format='wav',
    tts_sample='24k',
    user_id='demo',
)
# Customize speech parameters
ws_parms.tts_speed = 60 # Faster speech (0-100)
ws_parms.tts_volume = 70 # Louder volume (0-100)
ws_parms.tts_pitch = 40 # Lower pitch (0-100)
ws_parms.tts_bright = 60 # Brighter tone (0-100)
do_ws(ws_parms)
write_results(ws_parms)
import os
from scripts.tts import Ws_parms, do_ws, write_results
def batch_tts(text_list):
    """Convert multiple texts to audio files"""
    appkey = os.getenv('UNISOUND_APPKEY', 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3')
    secret = os.getenv('UNISOUND_SECRET', '5c12231cd279b35873a3ccecf9439118')
    for i, text in enumerate(text_list):
        ws_parms = Ws_parms(
            url='wss://ws-stts.hivoice.cn/v1/tts',
            appkey=appkey,
            secret=secret,
            pid=i,
            vcn='xiaofeng-base',
            text=text,
            tts_format='mp3',
            tts_sample='24k',
            user_id=f'batch-{i}',
        )
        do_ws(ws_parms)
        write_results(ws_parms)
        print(f"Generated: {text[:30]}...")

# Usage
texts = [
    "第一段文字",
    "第二段文字",
    "第三段文字",
]
batch_tts(texts)
import os
from scripts.tts import Ws_parms, do_ws, write_results
def convert_chapter(chapter_text, chapter_num, voice='xiaofeng-base'):
    """Convert a book chapter to audio file"""
    # Add chapter announcement
    intro = f"第{chapter_num}章。"
    full_text = intro + chapter_text
    appkey = os.getenv('UNISOUND_APPKEY', 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3')
    secret = os.getenv('UNISOUND_SECRET', '5c12231cd279b35873a3ccecf9439118')
    ws_parms = Ws_parms(
        url='wss://ws-stts.hivoice.cn/v1/tts',
        appkey=appkey,
        secret=secret,
        pid=chapter_num,
        vcn=voice,
        text=full_text,
        tts_format='mp3',
        tts_sample='24k',
        user_id=f'audiobook-ch{chapter_num}',
    )
    # Slower, clearer reading for books
    ws_parms.tts_speed = 45
    ws_parms.tts_pitch = 50
    do_ws(ws_parms)
    write_results(ws_parms)
    print(f"Chapter {chapter_num} converted")

# Usage
chapter = """这是第一章的内容。在一个阳光明媚的早晨,
主人公开始了他的冒险之旅。"""
convert_chapter(chapter, 1)
import os
from scripts.tts import Ws_parms, do_ws, write_results
def accessibility_reader(text, speed='normal', voice='xiaofeng-base'):
    """
    Text-to-speech for accessibility (visually impaired users)
    with customizable reading speed
    """
    speed_map = {
        'slow': 35,
        'normal': 50,
        'fast': 65,
    }
    appkey = os.getenv('UNISOUND_APPKEY', 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3')
    secret = os.getenv('UNISOUND_SECRET', '5c12231cd279b35873a3ccecf9439118')
    ws_parms = Ws_parms(
        url='wss://ws-stts.hivoice.cn/v1/tts',
        appkey=appkey,
        secret=secret,
        pid=1,
        vcn=voice,
        text=text,
        tts_format='mp3',
        tts_sample='24k',
        user_id='accessibility',
    )
    # Unknown speed names fall back to the normal speed (50)
    ws_parms.tts_speed = speed_map.get(speed, 50)
    ws_parms.tts_volume = 70  # Higher volume for accessibility
    do_ws(ws_parms)
    write_results(ws_parms)
    return ws_parms.tts_stream

# Usage
article = "这是一篇重要的新闻文章。"
accessibility_reader(article, speed='slow')
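- Optimized for Chinese: works best with Simplified Chinese text
- Requires a stable network connection for WebSocket streaming
- Audio files are saved locally: output goes to the results/ directory
- Text encoding: make sure the text is UTF-8 encoded
- The default sample rate is 24k, which is higher quality than the standard 16k
- Test credentials: the provided credentials are for testing and evaluation only
- Testing: use the provided test credentials
- Production: always obtain your own credentials from UniSound
- Use environment variables: store credentials securely in environment variables
- Never hardcode credentials: do not embed production credentials in code
- Use .env files: for local development (add them to .gitignore)
- Rotate credentials regularly in production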
Issue: Script fails with import error
→ Ensure the dependencies are installed: pip install websocket-client
→ Ensure you are using Python 3.6 or later
Issue: "AppKey and Secret are required!" error
→ Set the UNISOUND_APPKEY and UNISOUND_SECRET environment variables
→ Or use the --appkey and --secret command-line arguments
Issue: Poor audio quality
→ Try the WAV format with the 24k sample rate
→ Adjust the speech parameters for your use case
Issue: WebSocket connection timeout
→ Check network connectivity
→ Verify that the firewall allows WebSocket connections
→ Check whether the API service is operational
Issue: Generated audio sounds unnatural
→ Adjust the speed parameter (try the 45-55 range)
→ Check the text for proper punctuation
→ Consider breaking long sentences into shorter ones
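
One way to act on the advice about long sentences is to split the text at sentence-ending punctuation before synthesis. The sketch below is illustrative only: `split_text` and its `max_chars` default of 100 are assumptions, not a documented limit of the UniSound API.

```python
import re

# A sentence is a run of non-terminator characters plus any trailing terminators.
# Both full-width (。!?;) and ASCII (!?;) terminators are recognized.
_SENTENCE = re.compile(r"[^。!?;!?;]+[。!?;!?;]*")


def split_text(text, max_chars=100):
    """Split text at sentence boundaries into chunks of at most max_chars.

    A single sentence longer than max_chars is kept whole rather than cut mid-sentence.
    """
    chunks, current = [], ""
    for sentence in _SENTENCE.findall(text):
        sentence = sentence.strip()
        if not sentence:
            continue
        if current and len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be passed as the `text` of its own request, for example through the batch function shown earlier.
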
Issue: Test credentials stopped working
→ Test credentials may expire or be rate limited
→ Contact UniSound to obtain your own credentials
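- Audiobooks: use speed 45 and add chapter announcements
- Accessibility apps: use speed 35-40 and a higher volume (70)
- News broadcasts: use speed 55 and a brighter tone (60)
- Batch processing: add a delay between requests
- Production: add error handling and retry logic
- Best quality: use the 24k sample rate with the WAV format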
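
The advice on delays and retry logic can be sketched with a small generic wrapper. The helper `run_with_retry` is hypothetical, and it only helps if the wrapped call raises an exception on failure; this document does not confirm whether `do_ws` raises or merely prints an error.

```python
import time


def run_with_retry(task, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call task() until it succeeds, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception as exc:  # broad on purpose: the script's exception types are not documented here
            last_error = exc
            if attempt < attempts:
                # Exponential backoff: delay, 2*delay, 4*delay, ...
                sleep(delay_seconds * 2 ** (attempt - 1))
    raise last_error
```

A batch loop could wrap each conversion as `run_with_retry(lambda: do_ws(ws_parms))`, and a plain `time.sleep` between items would add the suggested delay between requests.
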
Load these reference documents when:
The UniSound TTS API uses SHA256 signature-based authentication:
# Signature format (automatically generated by Ws_parms class)
# SHA256(appkey + timestamp + secret).upper()
# Manual signature example (if needed):
import hashlib
import time
def generate_signature(appkey, secret):
    # Millisecond timestamp, as a string so it can be concatenated
    timestamp = str(int(time.time() * 1000))
    hs = hashlib.sha256()
    hs.update((appkey + timestamp + secret).encode('utf-8'))
    signature = hs.hexdigest().upper()
    return timestamp, signature
WebSocket URL format:
wss://ws-stts.hivoice.cn/v1/tts?time={timestamp}&appkey={appkey}&sign={signature}
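
Putting the signature and the URL format together, the full connection URL could be assembled as below. This is a sketch based only on the formula and URL template given in this section; `build_signed_url` is a hypothetical helper, and the optional `timestamp` argument exists only to make the result reproducible.

```python
import hashlib
import time
from urllib.parse import urlencode


def build_signed_url(appkey, secret,
                     base_url="wss://ws-stts.hivoice.cn/v1/tts",
                     timestamp=None):
    """Return base_url with time, appkey and sign query parameters appended."""
    if timestamp is None:
        timestamp = str(int(time.time() * 1000))  # milliseconds since the epoch
    # sign = SHA256(appkey + timestamp + secret), upper-case hex
    payload = (appkey + timestamp + secret).encode("utf-8")
    sign = hashlib.sha256(payload).hexdigest().upper()
    # A list of pairs keeps the parameter order fixed on every Python version.
    query = urlencode([("time", timestamp), ("appkey", appkey), ("sign", sign)])
    return f"{base_url}?{query}"
```

The secret itself never appears in the URL; only the digest derived from it does.
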
> Note: API capabilities, available voices, and rate limits are determined by your UniSound TTS API service configuration and subscription plan.