← 返回
数据分析 中文

Speak Turbo - Talk to your Claude 90ms latency!

Give your agent the ability to speak to you real-time. Talk to your Claude! Ultra-fast TTS, text-to-speech, voice synthesis, audio output with ~90ms latency....
赋予Agent实时语音交互能力。与Claude对话!超快TTS文字转语音,约90毫秒低延迟音频输出。
emzod
数据分析 clawhub v1.0.7 1 版本 99906.1 Key: 无需
★ 0
Stars
📥 1,064
下载
💾 18
安装
1
版本
#ai-agent#apple-silicon#audio#claude-code#fast#latest#local#mlx#realistic#skills#speak#talk#text-to-speech#tts#voice

概述

speakturbo - Talk to your Claude!

Give your agent the ability to speak to you real-time. Ultra-fast text-to-speech with ~90ms latency and 8 built-in voices.

Quick Start

# Play immediately - you should hear "Hello world" through your speakers
speakturbo "Hello world"
# Output: ⚡ 92ms → ▶ 93ms → ✓ 1245ms

# Verify it's working by saving to file
speakturbo "Hello world" -o test.wav
ls -lh test.wav  # Should show ~50-100KB file

Output explained: = first audio received, = playback started, = done

First Run

The first execution takes 2-5 seconds while the daemon starts and loads the model into memory. Subsequent calls are ~90ms to first sound.

# First run (slow - daemon starting)
speakturbo "Starting up"  # ~2-5 seconds

# Second run (fast - daemon already running)
speakturbo "Now I'm fast"  # ~90ms

Usage

# Basic - plays immediately (default voice: alba)
speakturbo "Hello world"

# Save to file (no audio playback)
speakturbo "Hello" -o output.wav

# Save to specific file
speakturbo "Goodbye" -o goodbye.wav

# Quiet mode (suppress status messages, still plays audio)
speakturbo "Hello" -q

# List available voices
speakturbo --list-voices

Available Voices

VoiceType
-------------
albaFemale (default)
mariusMale
javertMale
jeanMale
fantineFemale
cosetteFemale
eponineFemale
azelmaFemale

Performance

MetricValue
---------------
Time to first sound~90ms (daemon warm)
First run2-5s (daemon startup)
Real-time factor~4x faster
Sample rate24kHz mono

Architecture

speakturbo (Rust CLI, 2.2MB)
    │
    │ HTTP streaming (port 7125)
    ▼
speakturbo-daemon (Python + pocket-tts)
    │
    │ Model in memory, auto-shutdown after 1hr idle
    ▼
Audio playback (rodio)

Text Input

  • Encoding: UTF-8
  • Quotes in text: Use escaping: speakturbo "She said \"hello\""
  • Long text: Supported, streams as it generates

Output Path Security

The -o flag only writes to directories that are on the allowlist. By default, these are:

  • /tmp and system temp directories
  • Your current working directory
  • ~/.speakturbo/

If you need to write elsewhere, use --allow-dir:

speakturbo "Hello" -o /custom/path/audio.wav --allow-dir /custom/path

To permanently allow a directory, add it to ~/.speakturbo/config:

mkdir -p ~/.speakturbo && echo "/custom/path" >> ~/.speakturbo/config

The config file is one directory per line. Lines starting with # are comments.

Exit Codes

CodeMeaning
---------------
0Success (audio played/saved)
1Error (daemon connection failed, invalid args)

When to Use

Use speakturbo when:

  • You need instant audio feedback (~90ms)
  • Speed matters more than voice variety
  • Built-in voices are sufficient

Use speak instead when:

  • You need custom voice cloning (Morgan Freeman, etc.)

speak "text" --voice ~/.chatter/voices/morgan_freeman.wav

  • You need emotion tags like [laugh], [sigh]
  • Quality/variety matters more than speed

See the speak skill documentation for full usage.

Troubleshooting

No audio plays:

# Check daemon is running
curl http://127.0.0.1:7125/health
# Expected: {"status":"ready","voices":["alba","marius",...]}

# Verify by saving to file and playing manually
speakturbo "test" -o /tmp/test.wav
afplay /tmp/test.wav  # macOS
aplay /tmp/test.wav   # Linux

Daemon won't start:

# Check port availability
lsof -i :7125

# Manually kill and restart
pkill -f "daemon_streaming"
speakturbo "test"  # Auto-restarts daemon

First run is slow:

This is expected. The daemon needs to load the ~100MB model into memory. Subsequent calls will be fast (~90ms).

Daemon Management

The daemon auto-starts on first use and auto-shuts down after 1 hour idle.

# Check status
curl http://127.0.0.1:7125/health

# Manual stop
pkill -f "daemon_streaming"

# View logs
cat /tmp/speakturbo.log

Comparison with speak

Featurespeakturbospeak
----------------------------
Time to first sound~90ms~4-8s
Voice cloning
Emotion tags
Voices8 built-inCustom wav files
Enginepocket-ttsChatterbox

版本历史

共 1 个版本

  • v1.0.7 当前
    2026-03-29 06:54 安全 安全

安全检测

腾讯云安全 (Keen)

安全,无风险
查看报告

腾讯云安全 (Sanbu)

安全,无风险
查看报告

🔗 相关推荐

data-analysis

Stock Analysis

udiedrichsen
{"answer":"基于雅虎财经数据,分析股票与加密货币。支持投资组合管理、自选股预警、股息分析、8维评分、热门趋势扫描及传闻/早期信号探测。适用于股票分析、持仓追踪、财报异动、加密监控、热门股追踪或提前发掘非主流传闻。"}
★ 270 📥 57,044
data-analysis

Data Analysis

ivangdavila
{"answer":"数据分析与可视化。查询数据库、生成报告、自动化电子表格,将原始数据转化为清晰可行的见解。适用于:(1) 您……"}
★ 199 📥 65,290
data-analysis

Excel / XLSX

ivangdavila
创建、检查和编辑 Microsoft Excel 工作簿及 XLSX 文件,支持可靠的公式、日期、类型、格式、重算及模板保留功能。
★ 368 📥 140,931