Overview

Audio PT Auto-Reply v2.0.1 - Premium Voice Interface

Complete voice interface with superior Brazilian Portuguese understanding and automatic setup.

🌟 Key Features

Superior PT-BR Understanding

Model: wav2vec2-large-xlsr-53-portuguese (jonatasgrosman)
Excellence in: Brazilian Portuguese with slang, expressions, accents
Also supports: English (multilingual)
Quality: State-of-the-art for PT-BR ASR

🤖 Optional Claude Integration

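Responses: Generated through the Claude API when an API key is configured
Fallback: Switches to the OpenClaw agent automatically
Optional: No API key is required; the skill still works with the OpenClaw agent alone
Context: With Claude enabled, replies are intended to track conversational context and Portuguese nuance more closely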

Neural Voice Options (Piper TTS)

Voice	Gender	Quality	Character
-------	--------	---------	-----------
jeff	Male	Medium	Clear, professional
cadu	Male	Medium	Warm, natural
faber	Male	Medium	Balanced
miro	Female	High	Community voice

Voice Commands

Change voice anytime with:

/voz jeff - Voice: Jeff
/voz cadu - Voice: Cadu
/voz faber - Voice: Faber
/voz miro - Voice: Miro (female)
/voz feminina - Selects the default female voice (miro)
/voz masculina - Selects the default male voice (jeff)
/voz listar - Show all voices
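
The command list above can be read as a small lookup. The following is a minimal sketch of that reading, not the skill's actual handler: the function name `parse_voz_command` and the two dictionaries are hypothetical, and only the voice names and gender shortcuts come from the list.

```python
from typing import Optional

# Voice names and genders as listed in the voice table.
VOICES = {
    "jeff": "masculina",
    "cadu": "masculina",
    "faber": "masculina",
    "miro": "feminina",
}
# Gender shortcuts and the voice each one selects, per the command list.
GENDER_DEFAULTS = {"feminina": "miro", "masculina": "jeff"}


def parse_voz_command(message: str) -> Optional[str]:
    """Return the selected voice, "listar" for the list request, or None.

    Hypothetical helper: the real skill may parse commands differently.
    """
    parts = message.strip().lower().split()
    if len(parts) != 2 or parts[0] != "/voz":
        return None
    arg = parts[1]
    if arg == "listar":
        return "listar"
    if arg in GENDER_DEFAULTS:
        return GENDER_DEFAULTS[arg]
    return arg if arg in VOICES else None
```

Unknown voice names return None here, so a caller could answer with the voice list instead of silently switching.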

⚡ Installation (NEW!)

One-Command Installation

bash install.sh

The installer automatically:

✅ Detects your system architecture (ARM64, x86_64)
✅ Downloads Piper TTS
✅ Downloads 4 Brazilian Portuguese voice models (~240MB)
✅ Installs Python dependencies
✅ Validates everything works

No manual downloads. No configuration. Just one command!
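
As an illustration of the architecture detection step, here is a minimal Python sketch. install.sh itself is a shell script and its real logic is not shown in this document; the alias table and function names below are assumptions.

```python
import platform

# Hypothetical mapping from platform.machine() values to the two
# architectures the installer is said to support.
ARCH_ALIASES = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "aarch64": "ARM64",
    "arm64": "ARM64",
}


def normalize_arch(machine: str) -> str:
    """Map a raw machine string to "x86_64" or "ARM64", or raise ValueError."""
    key = machine.strip().lower()
    if key not in ARCH_ALIASES:
        raise ValueError(f"unsupported architecture: {machine!r}")
    return ARCH_ALIASES[key]


def detect_arch() -> str:
    """Detect the architecture of the machine running this code."""
    return normalize_arch(platform.machine())
```

Failing early on an unrecognized architecture avoids downloading a Piper binary that cannot run.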

🔄 Critical Rules

DEFAULT: AUDIO ONLY - NO TEXT

When user sends audio:

❌ NO transcription shown
❌ NO "Pesquisando...", "Gerando..."
❌ NO confirmations or explanations
✅ ONLY audio reply

TEXT MODE: Say "texto" or "responda em texto" explicitly
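
One way to picture the audio-by-default rule is a function that inspects the transcript for an explicit text request. This is a sketch under assumptions: the name `reply_mode` is hypothetical, and matching the whole word "texto" is a simplification that may not be how the skill detects the request.

```python
import re

# Matches "texto" as a whole word, which also covers "responda em texto".
# Assumption: a plain word match is enough to signal text mode.
_TEXT_REQUEST = re.compile(r"\btexto\b")


def reply_mode(transcript: str) -> str:
    """Return "text" only when the user explicitly asks for it, else "audio"."""
    if _TEXT_REQUEST.search(transcript.lower()):
        return "text"
    return "audio"
```

Words that merely contain the letters, such as "contexto", do not trigger text mode because the pattern requires word boundaries.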

📊 Workflow

🎤 Audio Received (PT-BR/EN)
    ↓
🔤 Transcribe (wav2vec2 PT-BR - silent)
    ↓
🤖 AI Response (Claude API or OpenClaw Agent - silent)
    ↓
🗣️ Synthesize (Piper neural - silent)
    ↓
📤 Send Audio Reply (silent)
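
The workflow above is a three-stage chain. The sketch below shows the shape of that chain with the stages passed in as callables. The real orchestration lives in process.sh; `handle_audio` and its parameter names are hypothetical.

```python
from typing import Callable


def handle_audio(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    respond: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run transcribe -> respond -> synthesize and return only the audio.

    Nothing intermediate (transcript, status text) is returned, which
    mirrors the "silent" steps in the diagram.
    """
    text = transcribe(audio)
    reply = respond(text)
    return synthesize(reply)
```

Passing the stages as arguments keeps each one replaceable, for example swapping the response stage between Claude and the OpenClaw agent.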

📁 Scripts

Installation & Setup

install.sh - Automatic installation (run once!)
health_check.py - Validate the installation

Core Processing

transcribe.py - wav2vec2 PT-BR speech recognition
synthesize.py - Piper TTS with voice selection
voice_config.py - Voice preference management
process.sh - Full workflow orchestration

AI Integration

claude_adapter.py - Claude API bridge (intelligent responses)

🔧 Configuration

Optional: Enable Claude Integration

For intelligent AI responses, set your API key:

export ANTHROPIC_API_KEY="sk-your-api-key"

Without this, the skill uses OpenClaw's agent (still great responses!).
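
A minimal sketch of how the choice between the two backends could follow from the environment variable. The function name and the returned labels are hypothetical; claude_adapter.py may decide this differently.

```python
import os
from typing import Mapping, Optional


def select_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "claude" when ANTHROPIC_API_KEY is set and non-empty, else "openclaw"."""
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    return "claude" if key else "openclaw"
```

Treating an empty or whitespace-only value as unset avoids sending requests with an unusable key.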

Voice Configuration

Current voice is saved automatically in:

~/.openclaw/workspace/.audio_pt_voice_config
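
The format of this file is not documented here. The sketch below assumes, purely for illustration, that it holds the voice name as plain text and that "jeff" is used when no file exists; both are assumptions, as are the function names.

```python
from pathlib import Path

CONFIG_PATH = Path.home() / ".openclaw" / "workspace" / ".audio_pt_voice_config"
DEFAULT_VOICE = "jeff"  # assumption: the document does not state the default


def save_voice(voice: str, path: Path = CONFIG_PATH) -> None:
    """Persist the chosen voice name (assumed plain-text format)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(voice + "\n", encoding="utf-8")


def load_voice(path: Path = CONFIG_PATH) -> str:
    """Read the saved voice, falling back to DEFAULT_VOICE if missing or empty."""
    try:
        return path.read_text(encoding="utf-8").strip() or DEFAULT_VOICE
    except FileNotFoundError:
        return DEFAULT_VOICE
```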

📊 Technical Details

ASR Model

Name: jonatasgrosman/wav2vec2-large-xlsr-53-portuguese
Training: Fine-tuned on PT-BR Common Voice + other datasets
Strengths: Brazilian slang, regional expressions, informal speech
License: Apache 2.0
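
For readers who want to try the model directly, the sketch below follows the usual Hugging Face Transformers pattern for wav2vec2 CTC models. It assumes the third-party packages transformers, torch, and librosa are installed; transcribe.py may be structured differently, and `transcribe_file` is a hypothetical name.

```python
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-portuguese"
SAMPLE_RATE = 16_000  # the model expects 16 kHz mono audio


def transcribe_file(path: str) -> str:
    """Transcribe one audio file with greedy CTC decoding."""
    # Heavy third-party imports are deferred so importing this module stays cheap.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)

    # librosa resamples to SAMPLE_RATE and downmixes to mono by default.
    speech, _ = librosa.load(path, sr=SAMPLE_RATE)
    inputs = processor(
        speech, sampling_rate=SAMPLE_RATE, return_tensors="pt", padding=True
    )
    with torch.no_grad():
        logits = model(
            inputs.input_values, attention_mask=inputs.attention_mask
        ).logits
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Loading the model inside the function keeps the example short; a long-running service would load it once and reuse it across requests.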

TTS Engine

Engine: Piper (fast, local neural TTS)
Voices: 4 PT-BR options
Speed: Real-time on ARM64/x86_64
Format: OGG Opus (Telegram optimized)
License: MIT
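
A rough sketch of how synthesis and conversion to OGG Opus could be driven from Python. Piper's `--model` and `--output_file` flags are part of its command-line interface; using ffmpeg with libopus for the conversion is an assumption, since this document does not say which tool synthesize.py uses. Function names and file paths are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def piper_command(model_path: Path, wav_path: Path) -> List[str]:
    """Build the Piper call; the text to speak is supplied on stdin."""
    return ["piper", "--model", str(model_path), "--output_file", str(wav_path)]


def opus_command(wav_path: Path, ogg_path: Path) -> List[str]:
    """Build an ffmpeg call that re-encodes WAV to OGG Opus (assumed tool)."""
    return ["ffmpeg", "-y", "-i", str(wav_path), "-c:a", "libopus", str(ogg_path)]


def synthesize(text: str, model_path: Path, out_dir: Path) -> Path:
    """Write reply.ogg into out_dir and return its path."""
    wav_path = out_dir / "reply.wav"
    ogg_path = out_dir / "reply.ogg"
    subprocess.run(
        piper_command(model_path, wav_path),
        input=text.encode("utf-8"),
        check=True,
    )
    subprocess.run(opus_command(wav_path, ogg_path), check=True)
    return ogg_path
```

Building the argument lists in separate functions makes them easy to inspect without running either binary.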

AI Response (Optional)

Primary: Claude API (when API key provided)
Fallback: OpenClaw Agent (always available)
License: Claude API is proprietary; OpenClaw Agent is included
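
The primary and fallback arrangement can be sketched as a wrapper that tries one responder and falls back to the other. Whether the skill also falls back when a Claude request fails, rather than only when no key is set, is an assumption here; `respond_with_fallback` is a hypothetical name.

```python
from typing import Callable, Optional

Responder = Callable[[str], str]


def respond_with_fallback(
    prompt: str, primary: Optional[Responder], fallback: Responder
) -> str:
    """Use primary when available and working, otherwise use fallback."""
    if primary is None:  # e.g. no API key configured
        return fallback(prompt)
    try:
        return primary(prompt)
    except Exception:
        # Assumption: any primary failure is answered by the fallback.
        return fallback(prompt)
```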

🚀 Getting Started

Install the skill from ClawHub
Run: bash install.sh
Restart: openclaw gateway restart
Use: Send audio messages, use /voz commands

📋 Requirements

OpenClaw 2026.4.10+
Python 3.8+
300MB free disk space (for voice models)
Internet connection (for initial downloads)
Optional: ANTHROPIC_API_KEY for Claude integration
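
The Python and disk-space requirements can be checked before installing. The sketch below is illustrative only and is not the content of health_check.py; the function names are hypothetical, and reading "300MB" as 300 MiB is an assumption.

```python
import shutil
import sys
from typing import List, Tuple

MIN_PYTHON = (3, 8)
MIN_FREE_BYTES = 300 * 1024 * 1024  # "300MB" read as MiB (assumption)


def check_requirements(python_version: Tuple[int, int], free_bytes: int) -> List[str]:
    """Return a list of human-readable problems; empty means all good."""
    problems = []
    if python_version < MIN_PYTHON:
        problems.append("Python 3.8 or newer is required")
    if free_bytes < MIN_FREE_BYTES:
        problems.append("at least 300MB of free disk space is required")
    return problems


def check_this_machine(path: str = ".") -> List[str]:
    """Run the checks against the current interpreter and the given path."""
    return check_requirements(sys.version_info[:2], shutil.disk_usage(path).free)
```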

🔒 Privacy & Security

✅ Audio transcription happens locally (wav2vec2 runs on your machine)
✅ Voice synthesis happens locally (Piper runs on your machine)
⚠️ AI responses depend on the backend:
Without API key: Processed by the OpenClaw Agent (see OpenClaw's privacy policy)
With API key: Sent to Anthropic and handled under Anthropic's terms of service

📜 License

MIT - Free to use, modify, and redistribute

🙏 Credits

ASR: jonatasgrosman/wav2vec2-large-xlsr-53-portuguese
TTS: Piper by Rhasspy
AI: Claude API by Anthropic (optional)
Voices: Piper Voices repository + TarcisoAmorim community contribution

Version History

1 version in total

v2.0.1 (current)

2026-05-07 08:07 Safe

Security Check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)