Use this skill when AudioClaw already has the final reply text and now needs a voice version that can change on demand.
Common triggers:
- The reply should sound calm, warm, cheerful, serious, or promo.
- The caller wants to switch voice_id or voice_family in a single request without editing code.
- The user supplies a clone voice_id such as vc-... and wants the runtime to use it directly without falling back first.

Do not use this skill for ASR intake or long-form digest generation.
Request fields:
- text
- scene
- voice_id
- voice_family
- emotion
- speed, pitch, volume

- scripts/openclaw_voice_switchboard.py.
- scripts/picoclaw_voice_reply.py.

- voice_id
- voice_family plus matching emotion variant
- If voice_id is a clone-style id such as vc-..., this skill now tries that id directly first, even if it is not part of the built-in official voice catalog.

If preference_key is present, the script can remember:
- reply_mode
- voice_id
- emotion
- scene

- --out inside the AudioClaw workspace
- --openclaw-workspace-root pointing at the workspace root
- --delivery-profile feishu_voice when the downstream channel prefers .ogg/.opus
- --chmod 644 if you want to be explicit, though this skill now defaults to 0644
- If --openclaw-workspace-root is set and --out is omitted, this skill now writes to workspace/state/audio/ automatically
- scripts/picoclaw_voice_reply.py for AudioClaw on Feishu
- MEDIA:... line through the message tool
- send_file when you intentionally opt out of direct Feishu sending
- trace_id
- --set-default-voice-id vc-...
- --register-clone-voice-id vc-...

When AudioClaw needs to find a usable voice or confirm a voice_id, use this order:
1. python3 scripts/openclaw_voice_switchboard.py --list-voices
2. https://senseaudio.cn/docs/voice_api (API 音色服务说明, the API voice service guide)
3. Check whether the voice_id is a free, VIP, or SVIP voice.
4. Even with a known voice_id, still let the runtime validate access at synthesis time, because account permissions can differ by key.

Practical rule:
- --list-voices is the first-stop runtime catalog.
- https://senseaudio.cn/docs/voice_api is the canonical fallback reference for official voice names, voice_id, and package tier notes.

When this skill is used inside AudioClaw for Feishu or Lark voice replies:
- Use scripts/picoclaw_voice_reply.py.
- Let it upload the .ogg/.opus file to Feishu and send it as msg_type=audio.
- Do not use the generic send_file tool for that audio unless you explicitly passed --skip-direct-send.
- Do not call the message tool with the local path or the MEDIA:... reference. AudioClaw will send them as plain text.
- After the audio is sent, reply with a short text such as 我已经用语音回复你了。 ("I have already replied to you by voice.") instead of leaving the turn empty.
- Treat media_reference only as debug metadata or future AudioClaw compatibility data.

This rule matters because this AudioClaw environment does not render MEDIA:... as media, and the generic send_file tool sends Feishu voice notes as plain files instead of audio messages. The reliable path here is direct Feishu upload plus msg_type=audio.
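The upload-then-send path described above can be sketched as two small payload builders. This is a minimal illustration, not the code in scripts/picoclaw_voice_reply.py: the function names are hypothetical, and the field names (file_type, file_name, duration, receive_id, content) follow my reading of the public Feishu Open Platform documentation rather than anything stated in this skill.

```python
import json
from typing import Dict, Optional


def build_audio_upload_form(file_name: str, duration_ms: Optional[int] = None) -> Dict[str, str]:
    """Form fields for POST /open-apis/im/v1/files.

    The audio bytes themselves travel as the multipart part named `file`;
    only the accompanying text fields are built here.
    """
    form = {"file_type": "opus", "file_name": file_name}
    if duration_ms is not None:
        # Assumed to be milliseconds, sent as a string form field.
        form["duration"] = str(duration_ms)
    return form


def build_audio_message(chat_id: str, file_key: str) -> Dict[str, str]:
    """Message body that sends an uploaded file as a voice note.

    `content` is a JSON string, not a nested object.
    """
    return {
        "receive_id": chat_id,
        "msg_type": "audio",
        "content": json.dumps({"file_key": file_key}),
    }
```

The file_key passed to build_audio_message would come from the upload response; sending the same file through a generic file tool would instead produce a plain attachment, which is the failure mode the rule above is meant to avoid.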
The official public TTS API exposes:
- voice_setting.voice_id
- voice_setting.speed
- voice_setting.vol
- voice_setting.pitch
- audio_setting.format
- audio_setting.sample_rate
- stream=false
- stream=true

Important constraint:
- The public API exposes no emotion request field.
- Emotion is approximated by switching to an emotion-specific voice_id when one exists, or by keeping the voice and shaping speed / pitch / vol.
- Requests authenticate with Authorization: Bearer API_KEY. The generated Public Key is not required by this skill.
- If voice_id looks like a clone id such as vc-..., this skill now auto-routes TTS to SenseAudio-TTS-1.5 and records audio.model_used in the manifest.

This skill now treats SENSEAUDIO_API_KEY as the default API key source again.
Runtime rules:
- If the environment provides SENSEAUDIO_API_KEY as an AudioClaw login token such as v2.public..., the shared bootstrap will replace it with the real sk-... value from ~/.audioclaw/workspace/state/senseaudio_credentials.json before TTS starts.
- --api-key-env still works, but the default runtime path is SENSEAUDIO_API_KEY.

If you need the exact same speaker timbre across many emotions, use a purchased multi-variant voice family or an authorized custom voice. Otherwise this skill will approximate the requested emotion with the best available voice and tuning.
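Because the public API has no emotion field, an emotion has to be folded into the voice and tuning fields before the request is sent. The sketch below shows one way that mapping could look. The tuning numbers, the default format and sample rate, and the top-level "text" and "model" keys are assumptions for illustration; only the voice_setting and audio_setting field names, the stream flag, and the SenseAudio-TTS-1.5 routing for vc-... ids come from the text above.

```python
from typing import Any, Dict

# Hypothetical tuning table. The real presets live in the skill's scripts
# and may use different values.
EMOTION_TUNING: Dict[str, Dict[str, float]] = {
    "neutral": {"speed": 1.0, "pitch": 0, "vol": 1.0},
    "calm": {"speed": 0.95, "pitch": 0, "vol": 1.0},
    "promo": {"speed": 1.08, "pitch": 1, "vol": 1.05},
}

CLONE_MODEL = "SenseAudio-TTS-1.5"  # model the text names for vc-... ids


def build_tts_payload(request: Dict[str, Any]) -> Dict[str, Any]:
    """Translate a skill request into the public TTS fields.

    Explicit speed / pitch / volume in the request win over the preset.
    The emotion name itself is never sent.
    """
    emotion = request.get("emotion", "neutral")
    tuning = EMOTION_TUNING.get(emotion, EMOTION_TUNING["neutral"])
    voice_id = request["voice_id"]
    payload: Dict[str, Any] = {
        "text": request["text"],  # assumed top-level key
        "stream": False,
        "voice_setting": {
            "voice_id": voice_id,
            "speed": request.get("speed", tuning["speed"]),
            "vol": request.get("volume", tuning["vol"]),
            "pitch": request.get("pitch", tuning["pitch"]),
        },
        "audio_setting": {
            "format": request.get("audio_format", "mp3"),  # assumed default
            "sample_rate": request.get("sample_rate", 32000),  # assumed default
        },
    }
    if voice_id.startswith("vc-"):
        payload["model"] = CLONE_MODEL  # assumed key name
    return payload
```

Note that the request uses volume while the API field is vol; keeping that rename in one place makes the mismatch harder to miss.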
Minimal JSON request:
{
"text": "我们已经收到你的需求,今天下午会把结果发给你。",
"scene": "assistant",
"emotion": "calm"
}
Full request:
{
"text": "新品今晚八点开售,现在下单还有首发赠品。",
"scene": "sales",
"voice_id": "male_0027_b",
"voice_family": "male_0027",
"emotion": "promo",
"speed": 1.08,
"pitch": 1,
"volume": 1.05,
"audio_format": "mp3",
"sample_rate": 32000,
"preference_key": "feishu:ou_xxx",
"reply_mode": "voice",
"allow_fallback": true,
"strict_voice": false,
"cache_dir": "/tmp/openclaw-voice-cache"
}
Clone voice example:
{
"text": "这是你的克隆音色回复测试。",
"voice_id": "vc-yxdCFUKyNLPexxJ66jaXWk",
"emotion": "calm",
"allow_fallback": false,
"strict_voice": true
}
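The remembered fields tied to preference_key (reply_mode, voice_id, emotion, scene) can be pictured as defaults that fill in whatever a request leaves out. The sketch below assumes that explicit request fields take precedence over remembered ones and uses a plain dict as the store; both choices are assumptions, and the function names are hypothetical.

```python
from typing import Any, Dict

# Fields the text says the script can remember per preference_key.
REMEMBERED_FIELDS = ("reply_mode", "voice_id", "emotion", "scene")


def apply_preferences(
    request: Dict[str, Any], store: Dict[str, Dict[str, Any]]
) -> Dict[str, Any]:
    """Return a copy of the request with remembered defaults filled in.

    Assumption: a field given in the request always overrides the stored one.
    """
    key = request.get("preference_key")
    remembered = store.get(key, {}) if key else {}
    merged = dict(request)
    for field in REMEMBERED_FIELDS:
        if field not in merged and field in remembered:
            merged[field] = remembered[field]
    return merged


def is_clone_voice(voice_id: str) -> bool:
    """Clone-style ids are described in the text as starting with vc-."""
    return voice_id.startswith("vc-")
```

With a stored preference for feishu:ou_xxx, the minimal request above would pick up the user's default voice while still honoring its own emotion value.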
Supported emotion presets:
- neutral
- calm
- warm
- cheerful
- serious
- promo
- sad
- angry
- analytical

Supported scene hints:
- assistant
- customer_support
- briefing
- sales
- marketing
- narration
- education
- gaming
- warning

Recommended handoff:
- Call scripts/openclaw_voice_switchboard.py with a request JSON.
- speed / pitch / volume
- trace_id

Operational rules:
- trace_id.
- Output files are written with mode 0644 so the AudioClaw sender can read them reliably.
- If --openclaw-workspace-root is set and --out stays inside that root, expose delivery.openclaw_media_reference.
- If --delivery-profile feishu_voice is enabled, synthesize with AudioClaw first and then transcode to ogg/opus with system ffmpeg or imageio-ffmpeg.
- Transcoding needs ffmpeg. Install ffmpeg or run python3 -m pip install imageio-ffmpeg on the target machine.
- Prefer workspace/state/audio/, which this skill will now use automatically when --openclaw-workspace-root is given without --out.
- scripts/picoclaw_voice_reply.py now uses the local Feishu app credentials from ~/.audioclaw/config.json, uploads the audio through the official /open-apis/im/v1/files endpoint, and sends it as msg_type=audio.
- It infers chat_id from the latest agent_main_feishu_direct_*.jsonl session log unless you pass --chat-id explicitly.

List voices:
python3 scripts/openclaw_voice_switchboard.py --list-voices
Check the official voice catalog page:
https://senseaudio.cn/docs/voice_api
List emotion presets:
python3 scripts/openclaw_voice_switchboard.py --list-emotions
Enable permanent voice reply for one user:
python3 scripts/openclaw_voice_switchboard.py \
--preference-key "feishu:ou_xxx" \
--set-reply-mode voice \
--set-default-voice-id male_0004_a \
--set-default-emotion calm \
--set-default-scene assistant
Enable permanent cloned-voice reply for one user:
python3 scripts/openclaw_voice_switchboard.py \
--preference-key "feishu:ou_xxx" \
--set-reply-mode voice \
--set-default-voice-id vc-yxdCFUKyNLPexxJ66jaXWk \
--set-default-emotion calm \
--set-default-scene assistant
Register a prepared clone voice so the runtime can list and reuse it:
python3 scripts/openclaw_voice_switchboard.py \
--register-clone-voice-id vc-yxdCFUKyNLPexxJ66jaXWk \
--register-clone-name "我的克隆音色"
Show registered clone voices:
python3 scripts/openclaw_voice_switchboard.py --show-clone-voices
Show current voice preference:
python3 scripts/openclaw_voice_switchboard.py \
--preference-key "feishu:ou_xxx" \
--show-preference
Generate one AudioClaw turn from a JSON request:
python3 scripts/openclaw_voice_switchboard.py \
--request-file /path/to/request.json \
--out /tmp/openclaw_reply.mp3
Direct CLI example:
python3 scripts/openclaw_voice_switchboard.py \
--text "我们已经处理好了,稍后把结果发给你。" \
--scene assistant \
--emotion warm \
--preference-key "feishu:ou_xxx" \
--delivery-profile feishu_voice \
--openclaw-workspace-root ~/.audioclaw/workspace \
--out ~/.audioclaw/workspace/audioclaw_warm.ogg
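The feishu_voice profile implies a transcode step from the synthesized file to ogg/opus, using system ffmpeg or the binary bundled with imageio-ffmpeg. A minimal sketch of that step follows. The function names, the trace_id-based file name, and the bitrate and mono settings are assumptions; the skill's own ffmpeg arguments are not shown in this document.

```python
import shutil
from pathlib import Path
from typing import List, Optional


def default_output_path(workspace_root: str, trace_id: str, suffix: str = ".ogg") -> Path:
    """Place the file under <workspace root>/state/audio/.

    Naming the file after trace_id is an assumption for illustration.
    """
    return Path(workspace_root).expanduser() / "state" / "audio" / f"{trace_id}{suffix}"


def find_ffmpeg() -> Optional[str]:
    """Prefer system ffmpeg, then fall back to the imageio-ffmpeg binary."""
    exe = shutil.which("ffmpeg")
    if exe:
        return exe
    try:
        import imageio_ffmpeg  # optional dependency

        return imageio_ffmpeg.get_ffmpeg_exe()
    except (ImportError, RuntimeError):
        return None


def build_transcode_cmd(ffmpeg: str, src: str, dst: str) -> List[str]:
    """ffmpeg argv that re-encodes src as Opus audio in an Ogg container."""
    return [
        ffmpeg, "-y",
        "-i", str(src),
        "-c:a", "libopus",
        "-b:a", "32k",  # assumed bitrate for voice
        "-ac", "1",     # assumed mono
        str(dst),
    ]
```

A caller would run the command with subprocess.run(cmd, check=True) and then os.chmod(dst, 0o644), matching the 0644 rule stated earlier.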
AudioClaw Feishu one-step example:
python3 scripts/picoclaw_voice_reply.py \
--text "这是一次可以直接发给飞书的语音回复。" \
--scene assistant \
--emotion calm \
--workspace-root ~/.audioclaw/workspace
Only generate, do not send:
python3 scripts/picoclaw_voice_reply.py \
--text "这是一次只生成不直发的测试语音。" \
--scene assistant \
--emotion calm \
--workspace-root ~/.audioclaw/workspace \
--skip-direct-send
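When the one-step script is launched from another Python process rather than a shell, the same flags can be assembled as an argument list. The helper below is a hypothetical wrapper that uses only the flags shown in the examples above; it expands ~ itself because no shell is involved.

```python
import os
import sys
from typing import List


def build_reply_argv(
    text: str,
    scene: str = "assistant",
    emotion: str = "calm",
    workspace_root: str = "~/.audioclaw/workspace",
    skip_direct_send: bool = False,
) -> List[str]:
    """Argument list for scripts/picoclaw_voice_reply.py."""
    argv = [
        sys.executable,
        "scripts/picoclaw_voice_reply.py",
        "--text", text,
        "--scene", scene,
        "--emotion", emotion,
        # subprocess does not expand ~, so do it here.
        "--workspace-root", os.path.expanduser(workspace_root),
    ]
    if skip_direct_send:
        argv.append("--skip-direct-send")
    return argv
```

Passing the list to subprocess.run(argv, check=True) avoids shell quoting problems with reply text that contains spaces or punctuation.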
- scripts/senseaudio_tts_client.py
- https://api.senseaudio.cn/v1/t2a_v2
- references/openclaw_voice_switchboard.md
- https://senseaudio.cn/docs/voice_api
- scripts/openclaw_voice_switchboard.py
- scripts/picoclaw_voice_reply.py: audio message
- scripts/feishu_audio_sender.py: .ogg/.opus; uses ~/.audioclaw/config.json app credentials by default, infers the active chat, uploads the file, and sends msg_type=audio
- references/openclaw_voice_switchboard.md