Voice To Text

Convert voice messages and audio files to text using Vosk offline speech recognition. Use when a user sends a voice message or audio file, or asks for a transcription.
Author: vae999 · Source: clawhub · Category: uncategorized · Version: v1.0.0 (1 version) · API key: not required

Overview

Voice to Text

Convert voice messages and audio files to text using Vosk, an offline speech recognition toolkit.

Setup

  1. Install dependencies:

```bash
# macOS
brew install ffmpeg
pip install vosk

# Linux
apt-get install ffmpeg
pip install vosk
```

  2. Download a Vosk model:

```bash
mkdir -p ~/.vosk/models && cd ~/.vosk/models

# Chinese (small, fast)
curl -LO https://alphacephei.com/vosk/models/vosk-model-small-cn-0.22.zip
unzip vosk-model-small-cn-0.22.zip

# English (small)
curl -LO https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip vosk-model-small-en-us-0.15.zip
```

Usage

When the user provides a voice message or audio file path, run the transcription:

python3 ~/skills/voice-to-text/transcribe.py "<audio_file_path>"

To use a specific model, point the VOSK_MODEL_PATH environment variable at its unpacked directory (the model must already be downloaded):

VOSK_MODEL_PATH=~/.vosk/models/vosk-model-cn-0.22 python3 ~/skills/voice-to-text/transcribe.py "<audio_file_path>"
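
The source does not show how transcribe.py picks a model, so the following is only a minimal sketch of one plausible lookup order: honor VOSK_MODEL_PATH when it is set, otherwise fall back to the first unpacked model directory under ~/.vosk/models. The function name `resolve_model_path` and the fallback rule are assumptions for illustration, not the script's confirmed behavior.

```python
import os
from pathlib import Path

# Directory used in the Setup section of this document.
DEFAULT_MODELS_DIR = Path.home() / ".vosk" / "models"


def resolve_model_path(env=None, models_dir=DEFAULT_MODELS_DIR):
    """Return the model directory to load, or None if none is available.

    Assumed precedence (hypothetical): VOSK_MODEL_PATH first, then the
    first unpacked model directory, sorted by name, under models_dir.
    """
    env = os.environ if env is None else env
    override = env.get("VOSK_MODEL_PATH")
    if override:
        # The shell only expands an unquoted "~", so expand it here as well.
        return Path(override).expanduser()

    models_dir = Path(models_dir)
    if not models_dir.is_dir():
        return None
    # Skip the downloaded .zip archives; only unpacked directories count.
    candidates = sorted(p for p in models_dir.iterdir() if p.is_dir())
    return candidates[0] if candidates else None
```

With this rule, an explicit VOSK_MODEL_PATH always wins, which matches the command shown above; the alphabetical fallback is an arbitrary but deterministic choice.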

Supported Audio Formats

  • MP3, WAV, M4A, OGG, FLAC, AAC, WEBM
  • Voice messages from WeChat, Telegram, WhatsApp, etc.

Available Models

| Model | Language | Size | Notes |
|-------|----------|------|-------|
| vosk-model-small-cn-0.22 | Chinese | 42M | Fast, good accuracy |
| vosk-model-cn-0.22 | Chinese | 1.3G | High accuracy |
| vosk-model-small-en-us-0.15 | English | 40M | Fast, good accuracy |
| vosk-model-en-us-0.22 | English | 1.8G | High accuracy |

Download models from: https://alphacephei.com/vosk/models

Example Workflow

  1. User sends a voice message via WeChat/Telegram
  2. OpenClaw receives the audio file
  3. Run: python3 ~/skills/voice-to-text/transcribe.py /path/to/voice.ogg
  4. Return transcribed text to user
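
The source does not include transcribe.py itself, so the recognition step in the workflow above can only be sketched. The version below assumes the audio has already been converted to a mono 16-bit PCM WAV file, and uses the `Model` and `KaldiRecognizer` classes from the vosk package. The helper names `join_segments` and `transcribe_wav` are hypothetical.

```python
import json
import wave


def join_segments(results):
    """Merge Vosk JSON result strings into a single line of text."""
    texts = (json.loads(r).get("text", "") for r in results)
    return " ".join(t for t in texts if t)


def transcribe_wav(wav_path, model_path):
    """Transcribe a mono 16-bit PCM WAV file with a local Vosk model."""
    # Imported lazily so join_segments stays usable without vosk installed.
    from vosk import Model, KaldiRecognizer

    model = Model(str(model_path))
    results = []
    with wave.open(str(wav_path), "rb") as wf:
        rec = KaldiRecognizer(model, wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True when an utterance is complete.
            if rec.AcceptWaveform(data):
                results.append(rec.Result())
        # Flush whatever is still buffered in the recognizer.
        results.append(rec.FinalResult())
    return join_segments(results)
```

Feeding the audio in small chunks keeps memory use flat for long recordings; each completed utterance arrives as a JSON string whose "text" field holds the recognized words.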

Troubleshooting

  • No model found: Download a model to ~/.vosk/models/
  • ffmpeg not found: Install via brew install ffmpeg or apt install ffmpeg
  • Poor accuracy: Try a larger model for better results
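
The first two checks in the list above can be automated before a transcription is attempted. This is a rough sketch only: the function name `diagnose` and the message wording are illustrative assumptions, and the default model directory is the one used in Setup.

```python
import shutil
from pathlib import Path


def diagnose(models_dir=Path.home() / ".vosk" / "models", ffmpeg_name="ffmpeg"):
    """Return a list of human-readable setup problems (empty if none)."""
    problems = []

    models_dir = Path(models_dir)
    # A model counts as present if at least one unpacked directory exists.
    has_model = models_dir.is_dir() and any(p.is_dir() for p in models_dir.iterdir())
    if not has_model:
        problems.append(f"No model found: download a model to {models_dir}")

    # shutil.which searches PATH the same way the shell would.
    if shutil.which(ffmpeg_name) is None:
        problems.append(f"{ffmpeg_name} not found: install it with brew or apt")

    return problems
```

An empty list means both prerequisites look fine; poor accuracy cannot be detected this way and still calls for trying a larger model.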

Notes

  • Works completely offline after model download
  • Supports multiple languages (download appropriate model)
  • Audio is converted to 16 kHz mono WAV before recognition
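
The conversion mentioned in the last note is presumably done with ffmpeg, since that is the only audio dependency listed. A minimal sketch of such a step follows; the exact flags the real script passes are not shown in the source, so `build_ffmpeg_cmd` and `convert_to_wav` are hypothetical helpers built from standard ffmpeg options.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src, dst, sample_rate=16000):
    """Build an ffmpeg command that writes mono 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y",           # overwrite the output file if it exists
        "-i", str(src),           # any input format ffmpeg can decode
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to 16 kHz by default
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        str(dst),
    ]


def convert_to_wav(src, dst):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True, capture_output=True)
    return Path(dst)
```

Passing the command as a list rather than a shell string avoids quoting problems with file names that contain spaces, which are common for voice messages saved by chat apps.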

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-12 06:08, security status: safe

Security Scan

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
