Audio Command Handler

Handle audio messages as commands. When user sends an audio file (WAV/PCM/MP3), transcribe it using iFlytek Speed Transcription and either (1) execute the tr...
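Summary: treat audio messages as commands. When a WAV/PCM/MP3 audio file arrives, transcribe it with iFlytek Speed Transcription, then execute the transcribed text as a command.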
Author: smallkeyboy
Source: clawhub · Category: uncategorized · Version: v1.0.0 (1 version) · API key: not required

Overview

Process audio messages and execute them as commands.

Workflow

Scenario 1: Audio Only (No Text)

User sends an audio file without any text instruction:

  1. Transcribe the audio using the ifly-speed-transcription skill
  2. Use the transcription as the command: execute it as if the user had typed it
  3. Return the result directly: no file upload is needed, regardless of length

Scenario 2: Audio + Text Command

User sends an audio file WITH a text instruction:

  1. Transcribe the audio using the ifly-speed-transcription skill
  2. Execute the text command, passing the transcription as context/input
  3. Check result length:
    • If ≤ 58 characters: return result directly
    • If > 58 characters: save to file, upload via uploader skill, return URL
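
The length check above can be sketched as follows. This is a minimal illustration, not the skill's actual code: the function name `needs_upload` is hypothetical, and it assumes "characters" means Unicode code points (what Python's `len` counts), so one Chinese character counts as one, not as three UTF-8 bytes.

```python
RESULT_THRESHOLD = 58  # characters; the Notes section calls this configurable


def needs_upload(result: str, threshold: int = RESULT_THRESHOLD) -> bool:
    """Return True when a Scenario 2 result should be saved and uploaded.

    ASSUMPTION: length is measured in Unicode code points via len(),
    not in encoded bytes.
    """
    return len(result) > threshold


# A result of exactly 58 characters is still returned directly.
short_result = "字" * 58
long_result = "字" * 59

print(needs_upload(short_result))  # boundary case: at the threshold
print(needs_upload(long_result))   # one character over the threshold
```

Counting bytes instead would push most Chinese results over the limit far sooner, so whichever definition an implementation picks should be applied consistently.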

Quick Reference

Transcription

python3 ~/.openclaw/workspace/skills/ifly-speed-transcription/scripts/transcribe.py /path/to/audio.mp3
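
If the transcription step needs to be driven from Python rather than the shell, a thin wrapper around the command above might look like this. It is a sketch under stated assumptions: the page does not document the script's output format, so the code assumes the transcription is printed to stdout; the names `build_transcribe_command` and `transcribe` and the `timeout` default are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List

# Script path taken from the Quick Reference command.
TRANSCRIBE_SCRIPT = (
    Path.home()
    / ".openclaw/workspace/skills/ifly-speed-transcription/scripts/transcribe.py"
)


def build_transcribe_command(audio_path: str) -> List[str]:
    """Build the argv list equivalent to the shell command in the Quick Reference."""
    return ["python3", str(TRANSCRIBE_SCRIPT), audio_path]


def transcribe(audio_path: str, timeout: float = 600.0) -> str:
    """Run the transcription script and return its output.

    ASSUMPTION: the script prints the transcription to stdout and exits
    non-zero on failure. Adjust the parsing if it prints JSON or logs.
    """
    proc = subprocess.run(
        build_transcribe_command(audio_path),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raises CalledProcessError on a non-zero exit code
    )
    return proc.stdout.strip()
```

Passing the arguments as a list avoids shell quoting problems when the audio path contains spaces.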

Upload

python3 ~/.openclaw/workspace/skills/uploader/scripts/upload_media.py /path/to/file.txt
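
The save-then-upload step can be sketched the same way. The workspace directory and script path come from this page; everything else is an assumption made for illustration: the helper names `save_result` and `upload`, the timestamped file name, and the idea that the uploader prints the resulting URL to stdout.

```python
import subprocess
import time
from pathlib import Path

WORKSPACE = Path.home() / ".openclaw/workspace"
UPLOAD_SCRIPT = WORKSPACE / "skills/uploader/scripts/upload_media.py"


def save_result(text, directory=WORKSPACE, stem="audio_command_result"):
    """Write the result to a UTF-8 text file and return its path.

    The file name pattern is hypothetical; the page only says results are
    saved under ~/.openclaw/workspace/ before upload.
    """
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{stem}_{int(time.time())}.txt"
    path.write_text(text, encoding="utf-8")
    return path


def upload(path):
    """Run the uploader script on a saved file.

    ASSUMPTION: the script prints the download URL to stdout.
    """
    proc = subprocess.run(
        ["python3", str(UPLOAD_SCRIPT), str(path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout.strip()
```

Writing with an explicit `encoding="utf-8"` keeps Chinese output intact regardless of the platform's default encoding.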

Execution Flow

┌─────────────────┐
│  Audio Message  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Transcribe    │
│ (ifly-speed-    │
│  transcription) │
└────────┬────────┘
         │
         ▼
┌─────────────────┐     NO      ┌──────────────┐
│ Has Text Cmd?   │────────────►│ Use Transcrip│
└────────┬────────┘              │ as Command   │
         │ YES                   └──────┬───────┘
         ▼                              │
┌─────────────────┐                     │
│ Execute Text    │                     │
│ Cmd with Trans  │                     │
│ Context         │                     │
└────────┬────────┘                     │
         │                              │
         │                              ▼
         │                    ┌──────────────┐
         │                    │ Return Direct│
         │                    │ to User      │
         │                    │ (no upload)  │
         │                    └──────────────┘
         │
         ▼
┌─────────────────┐
│ Result > 58 ch? │
└────────┬────────┘
         │
         ├───────────────────────────┐
         │ YES                       │ NO
         ▼                           ▼
┌─────────────────┐         ┌──────────────┐
│ Save to File    │         │ Return Direct│
│ Upload via      │         │ to User      │
│ uploader skill  │         └──────────────┘
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Return URL to   │
│ User            │
└─────────────────┘
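
The flowchart can also be read as a single function. The sketch below is one possible rendering, not the skill's implementation: `handle_audio_message` and its parameters are hypothetical names, and the three steps (transcribe, execute, upload) are passed in as callables so the control flow can be checked without the real skills installed.

```python
from typing import Callable, Optional


def handle_audio_message(
    audio_path: str,
    text_command: Optional[str],
    transcribe: Callable[[str], str],
    execute: Callable[[str, Optional[str]], str],
    upload_text: Callable[[str], str],
    threshold: int = 58,
) -> str:
    """Follow the flowchart: transcribe, branch on the text command, deliver."""
    transcript = transcribe(audio_path)

    # "Has Text Cmd?" -> NO: the transcription itself is the command, and
    # the result goes straight back to the user with no length check.
    if not text_command:
        return execute(transcript, None)

    # "Has Text Cmd?" -> YES: run the text command with the transcription
    # as context, then apply the "Result > 58 ch?" check.
    result = execute(text_command, transcript)
    if len(result) > threshold:
        return upload_text(result)  # expected to return a URL
    return result
```

In practice `transcribe` and `upload_text` would wrap the two scripts from the Quick Reference, while `execute` stands in for whatever the host agent uses to run a command.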

Example Scenarios

Example 1: Audio Only

User sends: 🎤 audio file (speech: "帮我查一下明天上海的天气", i.e. "Check tomorrow's weather in Shanghai for me")

Flow:

  1. Transcribe → "帮我查一下明天上海的天气"
  2. Execute as command → check Shanghai weather for tomorrow
  3. Return weather info directly (no upload, regardless of length)

Example 2: Audio + Command (Short Result)

User sends: 🎤 audio file + text "帮我总结这段录音" ("Summarize this recording for me")

Flow:

  1. Transcribe audio → get text content
  2. Execute "帮我总结这段录音" with transcription as context
  3. If summary ≤ 58 chars → return directly

Example 3: Audio + Command (Long Result)

User sends: 🎤 audio file + text "帮我根据这段录音写一篇文章" ("Write an article based on this recording for me")

Flow:

  1. Transcribe audio → get text content
  2. Execute command with transcription as context
  3. Result > 58 chars → save to file, upload
  4. Return: "已生成内容,下载链接:https://..." ("Content generated, download link: https://...")

Notes

  • Audio formats: WAV, PCM, MP3 (16 kHz, 16-bit, mono recommended)
  • Max duration: 5 hours
  • Language support: Chinese, English, 202+ Chinese dialects
  • Result threshold: 58 characters (configurable per implementation)
  • File location: Saved to ~/.openclaw/workspace/ before upload
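
For WAV input, the recommended format and the duration limit listed above can be checked before transcription with Python's standard `wave` module. This is an optional pre-flight sketch, not part of the skill: `check_wav` is a hypothetical helper, it only covers WAV files (raw PCM has no header to inspect, and MP3 would need a third-party decoder), and it reports warnings rather than errors because the notes describe the format as recommended, not required.

```python
import wave

RECOMMENDED_RATE_HZ = 16000
RECOMMENDED_SAMPLE_WIDTH_BYTES = 2    # 16-bit samples
RECOMMENDED_CHANNELS = 1              # mono
MAX_DURATION_SECONDS = 5 * 60 * 60    # 5 hours


def check_wav(path: str) -> list:
    """Return human-readable warnings; an empty list means no issues found."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate if rate else 0.0

    warnings = []
    if rate != RECOMMENDED_RATE_HZ:
        warnings.append(f"sample rate is {rate} Hz, recommended 16000 Hz")
    if width != RECOMMENDED_SAMPLE_WIDTH_BYTES:
        warnings.append(f"sample width is {width * 8}-bit, recommended 16-bit")
    if channels != RECOMMENDED_CHANNELS:
        warnings.append(f"{channels} channels, recommended mono")
    if duration > MAX_DURATION_SECONDS:
        warnings.append(f"duration {duration:.0f} s exceeds the 5 hour limit")
    return warnings
```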

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-08 00:54 · Security: safe, safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk