
Whisper Piper Voice

Set up and run a local voice pipeline combining Whisper STT (speech-to-text) and Piper TTS (text-to-speech) as a single HTTP server. Use when asked to set up...
danielgrobelny
Uncategorized · clawhub · v1.0.0 · 1 version · API key: not required

Overview

Whisper + Piper Voice Pipeline

Local STT (speech-to-text) and TTS (text-to-speech) as a single HTTP server. Zero cloud dependencies.

Architecture

Audio In → POST /transcribe → Whisper (faster-whisper) → JSON {text, language}
Text In  → POST /speak       → Piper TTS → ffmpeg → audio/ogg (Opus)

Both endpoints run in one Python process on one port (default: 9998).
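
The /speak path in the diagram above can be sketched as two subprocess command lines: Piper writes a WAV file, then ffmpeg re-encodes it as Ogg/Opus. This is an illustration only, not the code in voice-server.py. The function names, the intermediate WAV file, and the mapping of the server's --speed setting onto Piper's --length_scale flag are assumptions.

```python
import subprocess
from typing import List, Tuple


def build_tts_commands(
    piper_bin: str,
    model: str,
    wav_path: str,
    ogg_path: str,
    speaker: int = 4,
    length_scale: float = 0.9,
) -> Tuple[List[str], List[str]]:
    """Return (piper_cmd, ffmpeg_cmd) for a text -> WAV -> Ogg/Opus pipeline."""
    piper_cmd = [
        piper_bin,
        "--model", model,
        "--speaker", str(speaker),
        # Assumption: the server's --speed value is passed to --length_scale.
        "--length_scale", str(length_scale),
        "--output_file", wav_path,
    ]
    # -y overwrites an existing output file; libopus produces Opus in an Ogg container.
    ffmpeg_cmd = ["ffmpeg", "-y", "-i", wav_path, "-c:a", "libopus", ogg_path]
    return piper_cmd, ffmpeg_cmd


def synthesize(text: str, piper_bin: str, model: str, wav_path: str, ogg_path: str) -> None:
    """Run both steps; Piper reads the text to speak from stdin."""
    piper_cmd, ffmpeg_cmd = build_tts_commands(piper_bin, model, wav_path, ogg_path)
    subprocess.run(piper_cmd, input=text.encode("utf-8"), check=True)
    subprocess.run(ffmpeg_cmd, check=True)
```

Keeping command construction separate from execution makes the pipeline easy to inspect or log before anything is run.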

Quick Start

  1. Install dependencies:

    python3 -m venv ~/whisper-env && source ~/whisper-env/bin/activate
    pip install faster-whisper
    apt install ffmpeg  # or brew install ffmpeg on macOS

  2. Download Piper + a voice:

    mkdir -p ~/piper && cd ~/piper
    wget https://github.com/rhasspy/piper/releases/latest/download/piper_linux_x86_64.tar.gz
    tar xzf piper_linux_x86_64.tar.gz
    mkdir voices && cd voices
    wget https://huggingface.co/rhasspy/piper-voices/resolve/main/de/de_DE/thorsten_emotional/medium/de_DE-thorsten_emotional-medium.onnx
    wget https://huggingface.co/rhasspy/piper-voices/resolve/main/de/de_DE/thorsten_emotional/medium/de_DE-thorsten_emotional-medium.onnx.json

  3. Run the server (scripts/voice-server.py):

    python3 voice-server.py --port 9998 \
      --whisper-model small --whisper-device cpu \
      --piper-bin ~/piper/piper/piper \
      --piper-model ~/piper/voices/de_DE-thorsten_emotional-medium.onnx
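
Before starting the server, it may help to confirm that the pieces installed in the steps above are where the command line expects them. The following is a minimal pre-flight sketch; the function name is hypothetical and the check is not part of voice-server.py.

```python
import shutil
from pathlib import Path
from typing import List


def missing_prerequisites(piper_bin: str, piper_model: str) -> List[str]:
    """Return human-readable problems; an empty list means everything was found."""
    problems = []
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    if not Path(piper_bin).expanduser().is_file():
        problems.append(f"piper binary not found: {piper_bin}")
    model = Path(piper_model).expanduser()
    if not model.is_file():
        problems.append(f"voice model not found: {model}")
    # Each Piper voice ships with a JSON config next to the .onnx file.
    config = model.with_name(model.name + ".json")
    if not config.is_file():
        problems.append(f"voice config not found: {config}")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites(
        "~/piper/piper/piper",
        "~/piper/voices/de_DE-thorsten_emotional-medium.onnx",
    ):
        print("MISSING:", problem)
```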
    

API

Transcribe (audio → text):

curl -X POST -F "file=@message.ogg" http://HOST:9998/transcribe
# {"text": "Hallo Welt", "language": "de"}
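
The same upload can be made from Python with only the standard library. This is a sketch under assumptions: the helper names are hypothetical, the host defaults to localhost, and the form field is named "file" to match the curl call above.

```python
import json
import os
import urllib.request
import uuid
from typing import Tuple


def encode_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(audio_path: str, host: str = "localhost", port: int = 9998) -> dict:
    """POST an audio file to /transcribe and return the parsed JSON reply."""
    with open(audio_path, "rb") as fh:
        body, content_type = encode_multipart(
            "file", os.path.basename(audio_path), fh.read()
        )
    req = urllib.request.Request(
        f"http://{host}:{port}/transcribe",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `transcribe("message.ogg")` would return a dict with the `text` and `language` keys shown in the curl example.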

Speak (text → audio):

curl -X POST -H "Content-Type: application/json" \
  -d '{"text": "Hallo Welt", "speaker": "4"}' \
  http://HOST:9998/speak -o response.ogg
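
A Python equivalent of the curl call above, again standard library only. The helper names and the localhost default are assumptions; the JSON payload mirrors the curl example, including the speaker ID sent as a string.

```python
import json
import urllib.request


def build_speak_request(
    text: str, speaker: str = "4", host: str = "localhost", port: int = 9998
) -> urllib.request.Request:
    """Build the POST /speak request without sending it."""
    payload = json.dumps({"text": text, "speaker": speaker}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:{port}/speak",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def speak_to_file(text: str, out_path: str, **kwargs) -> None:
    """Send the request and save the returned Ogg/Opus audio to out_path."""
    req = build_speak_request(text, **kwargs)
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as fh:
        fh.write(resp.read())
```

With the server running, `speak_to_file("Hallo Welt", "response.ogg")` would write the synthesized audio to disk.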

Configuration

Flag              Default     Description
----------------  ----------  ---------------------------------
--port            9998        Server port
--whisper-model   small       tiny/base/small/medium/large-v3
--whisper-device  cpu         cpu or cuda
--piper-bin       (required)  Path to piper binary
--piper-model     (required)  Path to .onnx voice file
--piper-speaker   4           Speaker ID (multi-speaker models)
--speed           0.9         TTS speed (lower = faster)
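
For readers who want to see how such flags are typically declared, here is an argparse sketch that mirrors the flags and defaults listed above. It is a reconstruction from that list, not the parser in voice-server.py; the argument types and help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags from the configuration list with their documented defaults."""
    parser = argparse.ArgumentParser(description="Local Whisper + Piper voice server")
    parser.add_argument("--port", type=int, default=9998, help="Server port")
    parser.add_argument(
        "--whisper-model",
        default="small",
        help="tiny/base/small/medium/large-v3",
    )
    parser.add_argument("--whisper-device", default="cpu", choices=["cpu", "cuda"])
    parser.add_argument("--piper-bin", required=True, help="Path to piper binary")
    parser.add_argument("--piper-model", required=True, help="Path to .onnx voice file")
    parser.add_argument(
        "--piper-speaker", type=int, default=4, help="Speaker ID (multi-speaker models)"
    )
    parser.add_argument(
        "--speed", type=float, default=0.9, help="TTS speed (lower = faster)"
    )
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```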

Choosing Models

Whisper: small for CPU (good balance), medium for GPU (best quality without large-v3 overhead).
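
The recommendation above can be expressed as a small helper around faster-whisper's `WhisperModel`. This is a sketch: the function names are hypothetical, and the `compute_type` values (int8 on CPU, float16 on GPU) are common choices assumed here, not settings stated by this guide.

```python
def whisper_settings(device: str) -> dict:
    """Pick a model size per the guidance: small on CPU, medium on GPU."""
    if device == "cuda":
        # Assumption: float16 is a typical compute type for GPU inference.
        return {"model_size_or_path": "medium", "device": "cuda", "compute_type": "float16"}
    # Assumption: int8 quantization is a typical choice for CPU inference.
    return {"model_size_or_path": "small", "device": "cpu", "compute_type": "int8"}


def load_whisper(device: str = "cpu"):
    """Load the model; faster-whisper is imported lazily so the helper above stays dependency-free."""
    from faster_whisper import WhisperModel

    return WhisperModel(**whisper_settings(device))


def transcribe_file(model, path: str) -> dict:
    """Join the transcribed segments into the {text, language} shape used by /transcribe."""
    segments, info = model.transcribe(path)
    text = " ".join(segment.text.strip() for segment in segments)
    return {"text": text, "language": info.language}
```

The first call to `load_whisper` downloads the chosen model if it is not already cached, so expect a delay on first use.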

Piper voices: Browse https://rhasspy.github.io/piper-samples/ to compare voices, then download both the .onnx file and its matching .onnx.json config.

Full Setup Guide

Read references/setup-guide.md for systemd service config, all voice options, model comparison table, and OpenClaw integration details.

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-07 12:03, security scan: safe

