
mlx-whisper

Set up mlx-whisper as the local audio transcription engine for OpenClaw on Apple Silicon Macs (M1/M2/M3/M4). Automatically transcribes voice notes sent via T...
Author: yinghaojia
Category: AI Intelligence · clawhub v1.0.7 · 1 version · 100000 · API key: not required
Stars: 0 · Downloads: 640 · Installs: 12 · Versions: 1 · #latest

Overview

mlx-whisper — Local Voice Transcription for Apple Silicon

Enables automatic transcription of voice notes in OpenClaw using Apple's MLX framework.

No API key required. Works fully offline. ~60× faster than standard Whisper on M1/M2/M3/M4.

How it works

  1. User sends a voice note (Telegram .ogg / WhatsApp .opus)
  2. OpenClaw downloads the audio file
  3. Passes it to mlx-whisper-transcribe.sh via {{MediaPath}}
  4. Transcript is injected as the message body
  5. Agent replies to the text content
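
The flow above can be sketched in Python. This is only an illustration of steps 3 and 4, not OpenClaw's actual implementation: the function names `expand_args` and `run_cli_transcriber` are hypothetical, and the sketch assumes the CLI prints the transcript to standard output.

```python
import subprocess
from typing import List


def expand_args(args: List[str], media_path: str) -> List[str]:
    """Substitute the {{MediaPath}} placeholder in each configured argument."""
    return [arg.replace("{{MediaPath}}", media_path) for arg in args]


def run_cli_transcriber(command: str, args: List[str], media_path: str,
                        timeout_seconds: int = 60) -> str:
    """Run a CLI transcriber on one audio file and return its trimmed stdout.

    Assumption: the transcript is whatever the command writes to stdout.
    Raises subprocess.TimeoutExpired if the command runs too long and
    subprocess.CalledProcessError if it exits with a non-zero status.
    """
    completed = subprocess.run(
        [command, *expand_args(args, media_path)],
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
        check=True,
    )
    return completed.stdout.strip()
```

A call such as `run_cli_transcriber("/path/to/mlx-whisper-transcribe.sh", ["{{MediaPath}}"], "/path/to/audio.ogg")` would then return the text that becomes the message body.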

Setup

Step 1 — Install mlx-whisper

pip3 install mlx-whisper

Verify:

python3 -c "import mlx_whisper; print('OK')"
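
If the one-line check fails, a slightly more detailed probe can help narrow down why. The sketch below is an assumption-laden helper (the name `check_environment` is hypothetical): it reports whether the current interpreter runs natively on Apple Silicon and whether the `mlx_whisper` package is visible to it, without importing the package.

```python
import importlib.util
import platform
import sys


def check_environment() -> dict:
    """Report whether this interpreter looks ready to run mlx-whisper."""
    return {
        # A native Apple Silicon Python reports "arm64"; under Rosetta it reports "x86_64".
        "apple_silicon": sys.platform == "darwin" and platform.machine() == "arm64",
        # find_spec locates the package without importing it.
        "mlx_whisper_installed": importlib.util.find_spec("mlx_whisper") is not None,
    }


if __name__ == "__main__":
    print(check_environment())
```

Run it with the same `python3` that `pip3` installs into; a mismatch between the two is a common reason a freshly installed package appears to be missing.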

Step 2 — Install the wrapper script

Find the Python bin path:

python3 -m site --user-base
# e.g. /Users/<you>/Library/Python/3.9

Copy bin/mlx-whisper-transcribe.sh from this skill to <user-base>/bin/mlx-whisper-transcribe.sh (where <user-base> is the path printed above), then make it executable:

PYBIN=$(python3 -m site --user-base)/bin
cp {baseDir}/bin/mlx-whisper-transcribe.sh "$PYBIN/mlx-whisper-transcribe.sh"
chmod +x "$PYBIN/mlx-whisper-transcribe.sh"

Test it:

"$PYBIN/mlx-whisper-transcribe.sh" /path/to/audio.ogg
# First run downloads the model (~465MB). Later runs reuse the cached copy and skip the download.
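
To confirm the wrapper landed in the right place, the same location can be computed from Python, since `site.getuserbase()` returns the path that `python3 -m site --user-base` prints. The helper names below are hypothetical, and the check assumes a POSIX file system.

```python
import os
import site
from pathlib import Path

WRAPPER_NAME = "mlx-whisper-transcribe.sh"


def wrapper_path(user_base: str = "") -> Path:
    """Return <user-base>/bin/mlx-whisper-transcribe.sh for this interpreter."""
    base = user_base or site.getuserbase()
    return Path(base) / "bin" / WRAPPER_NAME


def wrapper_is_ready(path: Path) -> bool:
    """True if the wrapper exists as a regular file and is executable."""
    return path.is_file() and os.access(path, os.X_OK)


if __name__ == "__main__":
    path = wrapper_path()
    print(path, "ready" if wrapper_is_ready(path) else "missing or not executable")
```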

Step 3 — Configure OpenClaw

Add to ~/.openclaw/openclaw.json under tools.media.audio:

{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "type": "cli",
            "command": "<user-base>/bin/mlx-whisper-transcribe.sh",
            "args": ["{{MediaPath}}"],
            "timeoutSeconds": 60
          }
        ]
      }
    }
  }
}

Replace <user-base> with the output of python3 -m site --user-base.
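
Editing the file by hand risks clobbering other settings, so the merge can also be scripted. The following is a minimal sketch under two assumptions: that openclaw.json is strict JSON, and that an existing entry for the same command should be replaced rather than duplicated. The function names are hypothetical and not part of OpenClaw.

```python
import json
from pathlib import Path
from typing import Optional


def add_audio_transcriber(config: dict, command: str,
                          language: Optional[str] = None,
                          timeout_seconds: int = 60) -> dict:
    """Add a CLI transcriber entry under tools.media.audio, in place."""
    args = ["{{MediaPath}}"]
    if language:
        args.append(language)  # optional language code as the second argument
    entry = {"type": "cli", "command": command, "args": args,
             "timeoutSeconds": timeout_seconds}

    # Create the nested sections only if they are missing; keep other keys untouched.
    audio = (config.setdefault("tools", {})
                   .setdefault("media", {})
                   .setdefault("audio", {}))
    audio["enabled"] = True
    models = audio.setdefault("models", [])
    # Replace an earlier entry for the same command instead of duplicating it.
    models[:] = [m for m in models if m.get("command") != command]
    models.append(entry)
    return config


def update_config_file(path: Path, command: str, **kwargs) -> None:
    """Read, update, and rewrite the config file (assumes strict JSON)."""
    config = json.loads(path.read_text()) if path.exists() else {}
    add_audio_transcriber(config, command, **kwargs)
    path.write_text(json.dumps(config, indent=2) + "\n")
```

For example, `update_config_file(Path.home() / ".openclaw" / "openclaw.json", "<user-base>/bin/mlx-whisper-transcribe.sh")` with the placeholder substituted would produce the structure shown above. Back up the file first.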

Step 4 — Restart OpenClaw

openclaw gateway restart

Or restart the OpenClaw app from the menu bar.

Models

The wrapper uses whisper-small-mlx by default (465MB, good balance of speed and accuracy).

To change, edit bin/mlx-whisper-transcribe.sh and update path_or_hf_repo:

Model                               | Size  | Use case
------------------------------------|-------|------------------------
mlx-community/whisper-tiny-mlx      | 75MB  | Fastest, basic accuracy
mlx-community/whisper-small-mlx     | 465MB | Recommended
mlx-community/whisper-medium-mlx    | 1.5GB | Higher accuracy
mlx-community/whisper-large-v3-mlx  | 3GB   | Best accuracy
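
For orientation, this is roughly how a model name reaches the library when it is called from Python. The sketch is not the skill's wrapper script, whose contents are not shown here; `transcribe_options` and `transcribe` are hypothetical helpers. It relies on `mlx_whisper.transcribe` accepting `path_or_hf_repo` and an optional `language` keyword, and returning a dictionary with a `"text"` key.

```python
from typing import Optional

DEFAULT_MODEL = "mlx-community/whisper-small-mlx"


def transcribe_options(model: str = DEFAULT_MODEL,
                       language: Optional[str] = None) -> dict:
    """Build keyword arguments for mlx_whisper.transcribe."""
    options = {"path_or_hf_repo": model}
    if language:
        # An explicit language code skips Whisper's auto-detection step.
        options["language"] = language
    return options


def transcribe(audio_path: str, model: str = DEFAULT_MODEL,
               language: Optional[str] = None) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily: requires Apple Silicon and `pip3 install mlx-whisper`.
    import mlx_whisper

    result = mlx_whisper.transcribe(audio_path, **transcribe_options(model, language))
    return result["text"].strip()
```

Switching models is then a matter of passing a different repository name from the table, for example `transcribe("audio.ogg", model="mlx-community/whisper-tiny-mlx")`.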

Language hint (optional)

Pass a language code as the second argument to skip auto-detection (faster):

mlx-whisper-transcribe.sh audio.ogg zh   # Chinese
mlx-whisper-transcribe.sh audio.ogg en   # English

In openclaw.json, add the language to args:

"args": ["{{MediaPath}}", "zh"]

Performance (M3 MacBook Pro, 8GB)

Audio length | Transcription time
-------------|-------------------
10 sec       | ~1 sec
1 min        | ~7 sec
30 min       | ~3.5 min
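
These figures imply roughly 7 seconds of processing per minute of audio, so a 30-minute recording (~3.5 min) would exceed the 60-second timeoutSeconds used in Step 3. The sketch below turns that rate into a suggested timeout. The function name, the safety factor of 2, and the 60-second floor are assumptions for illustration, and the estimate does not cover the one-time model download on first run.

```python
def suggest_timeout(audio_seconds: int,
                    processing_seconds_per_audio_minute: int = 7,
                    safety_factor: int = 2,
                    minimum_seconds: int = 60) -> int:
    """Estimate a timeoutSeconds value for audio of the given length."""
    # Integer ceiling division avoids floating-point rounding surprises.
    scaled = audio_seconds * processing_seconds_per_audio_minute * safety_factor
    estimate = (scaled + 59) // 60
    return max(minimum_seconds, estimate)


if __name__ == "__main__":
    for seconds in (10, 60, 600, 1800):
        print(seconds, "s of audio ->", suggest_timeout(seconds), "s timeout")
```

On slower machines or with larger models, pass a higher `processing_seconds_per_audio_minute` measured from your own runs.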

Troubleshooting

  • mlx_whisper not found: Run pip3 install mlx-whisper again
  • Empty transcript: Audio may be silent or music-only (Whisper transcribes speech only)
  • Timeout: Increase timeoutSeconds for long audio files
  • Wrong language: Pass the target language code as the second entry in args, e.g. "args": ["{{MediaPath}}", "zh"]
  • Model download fails: Check internet connection; models are cached after first run in ~/.cache/huggingface
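
When diagnosing download problems, it helps to know whether a model is already in the cache. The sketch below assumes the default Hugging Face Hub layout, in which each repository is stored under `hub/models--<org>--<name>/snapshots/`; the helper names are hypothetical, and environment variables such as `HF_HOME` can move the cache elsewhere.

```python
from pathlib import Path
from typing import Optional


def cached_model_dir(repo_id: str, cache_root: Optional[Path] = None) -> Path:
    """Return the expected cache directory for a Hugging Face repository."""
    root = cache_root or Path.home() / ".cache" / "huggingface" / "hub"
    return root / ("models--" + repo_id.replace("/", "--"))


def model_is_cached(repo_id: str, cache_root: Optional[Path] = None) -> bool:
    """True if at least one downloaded snapshot exists for the repository."""
    snapshots = cached_model_dir(repo_id, cache_root) / "snapshots"
    return snapshots.is_dir() and any(snapshots.iterdir())


if __name__ == "__main__":
    repo = "mlx-community/whisper-small-mlx"
    print(repo, "cached" if model_is_cached(repo) else "not cached yet")
```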

Version history

1 version in total

  • v1.0.7 (current)
    2026-03-31 01:24 · Safe · Safe

Security scan

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
