Create interactive language-learning audio with official SenseAudio TTS endpoints and parameters.
- Set the API key in the SENSEAUDIO_API_KEY environment variable.
- Authenticate each request with the header Authorization: Bearer <API key>.
- Install python3, requests, and pydub.
- pydub may also require a local audio backend such as ffmpeg; if unavailable, prefer writing individual audio files instead of merging them.

Use the official SenseAudio TTS rules summarized below:
- Endpoint: POST https://api.senseaudio.cn/v1/t2a_v2
- Model: SenseAudio-TTS-1.0
- Maximum text length per request: 10000 characters
- voice_setting.voice_id is required
- voice_setting.speed range: 0.5-2.0
- Audio formats: mp3, wav, pcm, flac
- Sample rates: 8000, 16000, 22050, 24000, 32000, 44100
- Bitrates: 32000, 64000, 128000, 256000
- Channels: 1 or 2

- Keep each request within the 10000 character limit.
- Send model, text, stream, and voice_setting.voice_id.
- Add speed, pitch, vol, and audio_setting only when needed.
- The audio is returned hex-encoded in data.audio; decode before saving.
- If pydub and an audio backend are available, merge clips and insert silence.
- Use trace_id only for troubleshooting and avoid showing it unless needed.

import binascii
import os

import requests

# Read the key from the environment so it never lands in source control.
API_KEY = os.environ["SENSEAUDIO_API_KEY"]
API_URL = "https://api.senseaudio.cn/v1/t2a_v2"


def generate_tts(text, voice_id="male_0004_a", speed=1.0, stream=False):
    """Synthesize one clip and return (audio_bytes, trace_id)."""
    payload = {
        "model": "SenseAudio-TTS-1.0",
        "text": text,
        "stream": stream,
        "voice_setting": {
            "voice_id": voice_id,
            "speed": speed,
        },
        "audio_setting": {
            "format": "mp3",
            "sample_rate": 32000,
            "bitrate": 128000,
            "channel": 2,
        },
    }
    response = requests.post(
        API_URL,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        json=payload,
        timeout=60,
    )
    # Surface HTTP-level failures before touching the body.
    response.raise_for_status()
    # This helper parses a single JSON body, so it assumes a non-streaming response.
    data = response.json()
    # data.audio holds hex-encoded audio; convert it to raw bytes before saving.
    audio_hex = data["data"]["audio"]
    return binascii.unhexlify(audio_hex), data.get("trace_id")
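
The limits listed above can be checked locally before a request is sent, which avoids spending a call on a payload that would likely be rejected. The following is a minimal sketch, not part of the SenseAudio SDK: `check_request` and `split_text` are hypothetical helper names, and the sketch assumes that the 10000 character limit corresponds to Python's `len()` of the text, which the source does not confirm.

```python
# Allowed values copied from the rules summarized earlier in this document.
ALLOWED_FORMATS = {"mp3", "wav", "pcm", "flac"}
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 32000, 44100}
ALLOWED_BITRATES = {32000, 64000, 128000, 256000}
ALLOWED_CHANNELS = {1, 2}
MAX_CHARS = 10000  # assumption: counted the same way as len(text)
SENTENCE_ENDS = ".!?。！？"


def check_request(text, voice_id, speed=1.0, fmt="mp3",
                  sample_rate=32000, bitrate=128000, channel=2):
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    if not voice_id:
        problems.append("voice_setting.voice_id is required")
    if not text:
        problems.append("text is empty")
    elif len(text) > MAX_CHARS:
        problems.append(f"text has {len(text)} characters; limit is {MAX_CHARS}")
    if not 0.5 <= speed <= 2.0:
        problems.append(f"speed {speed} is outside 0.5-2.0")
    if fmt not in ALLOWED_FORMATS:
        problems.append(f"format {fmt!r} is not supported")
    if sample_rate not in ALLOWED_SAMPLE_RATES:
        problems.append(f"sample_rate {sample_rate} is not supported")
    if bitrate not in ALLOWED_BITRATES:
        problems.append(f"bitrate {bitrate} is not supported")
    if channel not in ALLOWED_CHANNELS:
        problems.append(f"channel {channel} is not supported")
    return problems


def split_text(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters.

    Cuts after the last sentence-ending mark inside each window, falls back
    to the last space, and hard-cuts only when neither is present.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    text = text.strip()
    chunks = []
    while len(text) > limit:
        window = text[:limit]
        cut = max(window.rfind(mark) for mark in SENTENCE_ENDS) + 1
        if cut <= 0:
            cut = window.rfind(" ") + 1
        if cut <= 0:
            cut = limit
        chunks.append(text[:cut].strip())
        text = text[cut:].strip()
    if text:
        chunks.append(text)
    return chunks
```

A long passage can then be passed through `split_text` and each chunk sent as its own request. The splitter treats every period as a possible sentence end, so a decimal number near a chunk boundary may be cut in two.
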
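
- Consider a slower rate such as speed=0.8 for learners.
- Insert pauses (1000-2000ms) between clips.
- Use different voice_id values for source and translation when helpful.
- Do not hand the hex string from data.audio straight to pydub.AudioSegment; decode and load through a supported container format.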
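
The tips above can be combined into a small lesson builder. This is a sketch under stated assumptions rather than a reference implementation: `plan_lesson`, `can_merge`, and `save_clips` are hypothetical helper names, the clips are assumed to be mp3 bytes already decoded from `data.audio`, and the 1500 ms default is simply a value inside the suggested 1000-2000ms range. When pydub or ffmpeg is missing, the sketch falls back to writing one file per clip.

```python
import importlib.util
import io
import shutil
from pathlib import Path


def can_merge():
    """True when pydub is importable and an ffmpeg binary is on PATH."""
    has_pydub = importlib.util.find_spec("pydub") is not None
    return has_pydub and shutil.which("ffmpeg") is not None


def plan_lesson(pairs, source_voice, target_voice, speed=0.8):
    """Turn (source, translation) pairs into an ordered list of TTS jobs."""
    jobs = []
    for index, (source, translation) in enumerate(pairs, start=1):
        jobs.append({"name": f"{index:02d}_source", "text": source,
                     "voice_id": source_voice, "speed": speed})
        jobs.append({"name": f"{index:02d}_translation", "text": translation,
                     "voice_id": target_voice, "speed": speed})
    return jobs


def save_clips(clips, out_dir, pause_ms=1500, fmt="mp3", merge=None):
    """Write clips, given as (name, audio_bytes) tuples, and return the paths.

    merge=None auto-detects pydub and ffmpeg; merge=False forces one file per clip.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    if merge is None:
        merge = can_merge()
    if merge:
        # Imported lazily so the fallback path works without pydub installed.
        from pydub import AudioSegment

        lesson = AudioSegment.empty()
        for _, audio in clips:
            # Load through a container format instead of passing raw bytes.
            lesson += AudioSegment.from_file(io.BytesIO(audio), format=fmt)
            lesson += AudioSegment.silent(duration=pause_ms)
        path = out_dir / f"lesson.{fmt}"
        lesson.export(str(path), format=fmt)
        return [path]
    paths = []
    for name, audio in clips:
        path = out_dir / f"{name}.{fmt}"
        path.write_bytes(audio)
        paths.append(path)
    return paths
```

Each job from `plan_lesson` would be passed to a synthesis helper such as `generate_tts(job["text"], job["voice_id"], job["speed"])`, and the returned bytes collected as `(job["name"], audio)` tuples for `save_clips`. Any second voice ID used for the translation has to come from the SenseAudio voice list; none is given here.
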