
Willow Inference Server

Local ASR and TTS inference server. Use when the user wants to transcribe audio to text (ASR) or convert text to speech (TTS). Requires a running Willow Inference Server.
deantiwang
Productivity clawhub v1.0.1 2 versions 100000 Key: not required

Overview

Willow Inference Server Skill

Local ASR (speech-to-text) and TTS (text-to-speech) inference server.

Setup

1. Start Willow Inference Server

git clone https://github.com/toverainc/willow-inference-server.git
cd willow-inference-server
./utils.sh install
./utils.sh gen-cert your-hostname
./utils.sh run

Server runs at https://your-hostname:19000
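
Startup can take a while, so it may help to wait until the server answers before sending requests. The sketch below polls the docs page mentioned in this document. The function name `wait_for_willow` is hypothetical, and the `-k` flag assumes the certificate produced by `gen-cert` is self-signed.

```shell
#!/bin/sh
# Poll the server's docs page until it answers or the attempts run out.
# Usage: wait_for_willow [attempts] [delay_seconds]
wait_for_willow() {
  attempts="${1:-30}"
  delay="${2:-2}"
  url="${WILLOW_BASE_URL:-https://localhost:19000}/api/docs"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: fail on HTTP errors, -s: quiet,
    # -k: accept a self-signed certificate (assumption, see above)
    if curl -fsk -o /dev/null "$url"; then
      echo "Willow Inference Server is up at $url"
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  echo "No response from $url after $attempts attempts" >&2
  return 1
}
```

Calling `wait_for_willow 10 3` would try ten times, three seconds apart, and return non-zero if the server never responds.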

2. Configure Environment

Set the server URL:

export WILLOW_BASE_URL="https://your-hostname:19000"

Alternatively, spell out the full server URL in each request instead of relying on the variable.

ASR (Speech-to-Text)

Transcribe Audio File

curl -X POST "${WILLOW_BASE_URL}/api/asr" \
  -F "audio_file=@/path/to/audio.m4a" \
  -F "language=auto"

Parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| audio_file | Audio file to transcribe | required |
| language | Language code (en, zh, etc.) or "auto" | auto |
| model | Whisper model (tiny, base, medium, large-v2) | server config |
| task | transcribe or translate | transcribe |

Supported Formats

  • MP3, WAV, M4A, OGG, FLAC, WebM
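
Checking the extension before uploading avoids a wasted request. This is a minimal sketch that only looks at the file name, not the actual contents; `is_supported_audio` is a hypothetical helper name.

```shell
#!/bin/sh
# Return 0 if the file name ends in one of the formats listed above.
is_supported_audio() {
  # A name without any dot has no extension to check.
  case "$1" in
    *.*) ;;
    *) return 1 ;;
  esac
  # Lower-case the extension so "Talk.MP3" matches as well.
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    mp3|wav|m4a|ogg|flac|webm) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_supported_audio recording.m4a && echo "ready to upload"` prints the message only for a listed format.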

Example: Transcribe with curl

# Basic transcription
curl -X POST "${WILLOW_BASE_URL}/api/asr" \
  -F "audio_file=@recording.m4a" \
  -F "language=zh"

# With specific model
curl -X POST "${WILLOW_BASE_URL}/api/asr" \
  -F "audio_file=@meeting.mp3" \
  -F "language=en" \
  -F "model=base"

TTS (Text-to-Speech)

Convert Text to Speech

# Save the returned audio to a file instead of printing it to the terminal
curl -X POST "${WILLOW_BASE_URL}/api/tts" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello world", "voice": "af_sarah"}' \
  -o output.wav

Parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| text | Text to convert to speech | required |
| voice | Voice ID (see below) | default voice |
| speed | Speech speed (0.5-2.0) | 1.0 |
| volume | Volume (0.0-1.0) | 1.0 |
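
Hand-written JSON breaks as soon as the text contains a double quote, and out-of-range values are easy to send by mistake. The following sketch builds the request body from the parameters in the table above and checks the documented ranges first. The helper names are hypothetical, and the escaping covers only backslashes and double quotes, so it assumes the text has no newlines or other control characters.

```shell
#!/bin/sh
# Escape backslashes and double quotes for use inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Return 0 if $1 is a plain decimal number within [$2, $3].
in_range() {
  awk -v v="$1" -v lo="$2" -v hi="$3" 'BEGIN {
    if (v !~ /^[0-9]+(\.[0-9]+)?$/) exit 1
    exit !(v + 0 >= lo + 0 && v + 0 <= hi + 0)
  }'
}

# Print a JSON body for the TTS request.
# Usage: tts_payload TEXT [VOICE] [SPEED] [VOLUME]
tts_payload() {
  text=$1 voice=${2:-} speed=${3:-1.0} volume=${4:-1.0}
  in_range "$speed" 0.5 2.0 || { echo "speed must be 0.5-2.0" >&2; return 1; }
  in_range "$volume" 0.0 1.0 || { echo "volume must be 0.0-1.0" >&2; return 1; }
  body="\"text\": \"$(json_escape "$text")\""
  # Omit "voice" when unset so the server falls back to its default voice.
  [ -n "$voice" ] && body="$body, \"voice\": \"$(json_escape "$voice")\""
  printf '{%s, "speed": %s, "volume": %s}\n' "$body" "$speed" "$volume"
}
```

The output can be passed to curl in place of a hand-written body, for example `-d "$(tts_payload 'Hello!' am_michael 1.2)"`.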

Available Voices

Common voices (format: gender_voicename):

  • af_sarah - Sarah (Female)
  • af_bella - Bella (Female)
  • am_michael - Michael (Male)
  • am_alex - Alex (Male)

Check server docs for full list: ${WILLOW_BASE_URL}/api/docs

Example: TTS with curl

# Basic TTS
curl -X POST "${WILLOW_BASE_URL}/api/tts" \
  -H "Content-Type: application/json" \
  -d '{"text": "你好,这是测试"}' \
  -o output.wav

# With custom voice
curl -X POST "${WILLOW_BASE_URL}/api/tts" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello!", "voice": "am_michael", "speed": 1.2}' \
  -o hello.mp3

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| WILLOW_BASE_URL | Server URL | https://localhost:19000 |
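
A script can apply the default from the table when the variable is unset. In this sketch, `willow_base_url` is a hypothetical helper; stripping a trailing slash is an added convenience, not something the server is known to require.

```shell
#!/bin/sh
# Print the base URL to use: WILLOW_BASE_URL if set, else the documented default.
# A trailing slash is removed so "<base>/api/docs" never contains "//".
willow_base_url() {
  base="${WILLOW_BASE_URL:-https://localhost:19000}"
  printf '%s\n' "${base%/}"
}
```

Requests can then be written as `curl "$(willow_base_url)/api/docs"` and still work in a shell where the variable was never exported.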

Workflow Examples

1. Record and Transcribe

# Record audio with SoX's rec (on macOS: brew install sox); press Ctrl+C to stop
rec test.wav

# Transcribe
curl -X POST "${WILLOW_BASE_URL}/api/asr" \
  -F "audio_file=@test.wav" \
  -F "language=auto"

2. Text to Speech

# Convert text to speech
curl -X POST "${WILLOW_BASE_URL}/api/tts" \
  -H "Content-Type: application/json" \
  -d '{"text": "今天的任务是学习新技能"}' \
  -o speech.wav

3. Batch Transcription

for f in *.m4a; do
  [ -e "$f" ] || continue  # skip the literal pattern when nothing matches
  curl -X POST "${WILLOW_BASE_URL}/api/asr" \
    -F "audio_file=@${f}" \
    -F "language=auto" \
    -o "${f%.m4a}.txt"
done
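
For larger batches it can be useful to continue past a failed file and report the failure at the end. The sketch below is one way to do that. `transcribe_all` and the `WILLOW_ASR_PATH` override are hypothetical names, and the response body is saved unmodified, whatever format the server returns.

```shell
#!/bin/sh
# Transcribe every file given as an argument; keep going after a failure.
# WILLOW_ASR_PATH is a hypothetical override for the endpoint path.
transcribe_all() {
  base="${WILLOW_BASE_URL:-https://localhost:19000}"
  path="${WILLOW_ASR_PATH:-/api/asr}"
  failed=0
  for f in "$@"; do
    [ -f "$f" ] || continue          # skips an unmatched glob such as "*.m4a"
    out="${f%.*}.txt"
    # --fail makes curl exit non-zero on an HTTP error status.
    if curl --fail -sS -X POST "${base%/}${path}" \
         -F "audio_file=@${f}" -F "language=auto" -o "$out"; then
      echo "ok: $f -> $out"
    else
      echo "failed: $f" >&2
      rm -f "$out"                   # drop any partial output
      failed=$((failed + 1))
    fi
  done
  # Succeed only if every file was transcribed.
  [ "$failed" -eq 0 ]
}
```

A call such as `transcribe_all *.m4a *.mp3` handles several formats in one pass and returns non-zero if any file failed.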

API Documentation

Full API docs available at: ${WILLOW_BASE_URL}/api/docs

Notes

  • Endpoints are served over HTTPS by default; plain HTTP works only if the server is configured for it
  • Audio files are processed locally on the server
  • ASR latency depends on model size and hardware
  • Custom TTS voices can be created from your own voice recordings
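
If the certificate created by `gen-cert` is self-signed (an assumption here), curl will refuse the connection unless it is told what to trust. The sketch below selects a TLS option from two hypothetical environment variables, `WILLOW_CA_CERT` and `WILLOW_INSECURE`; neither is read by the server or defined by this skill.

```shell
#!/bin/sh
# Print the curl TLS option(s) to use, based on two hypothetical variables:
#   WILLOW_CA_CERT  - path to the server certificate (or its CA) to trust
#   WILLOW_INSECURE - set to 1 to skip verification (testing only)
willow_tls_opts() {
  if [ -n "${WILLOW_CA_CERT:-}" ]; then
    printf '%s %s\n' "--cacert" "$WILLOW_CA_CERT"
  elif [ "${WILLOW_INSECURE:-0}" = "1" ]; then
    printf '%s\n' "--insecure"
  fi
}
```

It is meant to be used unquoted so the shell splits the options, for example `curl $(willow_tls_opts) "${WILLOW_BASE_URL}/api/docs"`, which also means the certificate path must not contain spaces. Trusting the certificate with `--cacert` is the safer choice; `--insecure` disables verification entirely.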

Version History

2 versions in total

  • v1.0.1 (current)
    2026-03-31 15:15 Safe Safe
  • v1.0.0
    2026-03-26 22:17

