
realtime-transcription

Real-time audio transcription with automatic summarization and archival. Supports system audio (via BlackHole on macOS) and microphone input. When transcription stops (manually or via idle timeout), generates a title and structured summary, then archives as a dated Markdown file. Triggers: "开始转录", "启动转录", "停止", "转录内容", "transcribe", "start recording", "stop recording", "check transcription deps".
user_e017e2e4
Uncategorized community v1.0.0 1 version 100000 Key: not required

Overview

Real-time Transcription Skill

Capture any audio, get a structured summary. Real-time transcription powered by SenseVoice/FunASR.

Features

  • Real-time transcription — stream audio from system (BlackHole) or microphone
  • Auto summary — on stop, generate title + structured summary
  • Date-based archival — results saved to archive/YYYY/MM/DD-HHMM-title.md
  • Idle detection — auto-stops after 60s of silence (configurable)
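
The idle-detection feature can be pictured as a timer that resets whenever an audio block is loud enough. The sketch below is only an illustration under stated assumptions, not the code in realtime_asr.py: the `IdleDetector` class, the RMS threshold of 0.01, and the injectable clock are all hypothetical.

```python
import math
import time


def rms(samples):
    """Root-mean-square level of a block of float samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class IdleDetector:
    """Tracks how long the input has stayed below a loudness threshold.

    Hypothetical helper: the name, threshold, and clock parameter are
    illustrative and not taken from realtime_asr.py.
    """

    def __init__(self, timeout_s=60.0, threshold=0.01, clock=time.monotonic):
        self.timeout_s = timeout_s  # 0 disables the timeout, like --idle-timeout 0
        self.threshold = threshold
        self._clock = clock
        self._last_sound = clock()

    def feed(self, samples):
        """Feed one audio block; return True once the idle timeout has elapsed."""
        now = self._clock()
        if rms(samples) >= self.threshold:
            self._last_sound = now  # sound detected: reset the idle timer
        if self.timeout_s <= 0:
            return False
        return (now - self._last_sound) >= self.timeout_s
```

Passing the clock in as a parameter keeps the timing logic testable without waiting for real seconds to pass.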

Skill Location

All files are in ~/.openclaw/skills/realtime-transcription/:

realtime-transcription/
├── SKILL.md                 # This file
├── realtime_asr.py          # Background transcription process
├── summary_prompt.py        # LLM prompt builder & response parser
├── archiver.py              # Markdown archival module
├── references/
│   └── module-reference.md  # Module API reference
├── .tmp/                    # Runtime temp files
└── archive/                 # Archived outputs

Prerequisites

Python Dependencies

pip3 install sounddevice librosa funasr torch numpy

Or use the built-in installer with progress output:

cd ~/.openclaw/skills/realtime-transcription
python3 realtime_asr.py --install-deps

System Audio (optional, macOS)

For macOS system audio capture, install BlackHole: brew install blackhole-2ch

ASR Model

Download the SenseVoice model: modelscope download --model gongjy/SenseVoiceSmall --local_dir ./model/SenseVoiceSmall

Quick Start

Check Dependencies

cd ~/.openclaw/skills/realtime-transcription
python3 realtime_asr.py --check-deps

Expected output:

✅ 所有依赖已安装。
   sounddevice — PyAudio binding for microphone/system audio capture
   librosa — Audio resampling and preprocessing
   funasr — SenseVoice ASR model framework
   torch — PyTorch deep learning runtime
   numpy — Numerical array processing

If dependencies are missing, run python3 realtime_asr.py --install-deps to install them one by one with progress output.
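
A dependency check of this kind can be done without importing the heavy packages. The following is a minimal sketch, assuming the check only needs to know whether each package is installed; `find_missing` is a hypothetical helper and not necessarily how `--check-deps` is implemented.

```python
import importlib.util

# Package names taken from the pip3 install line in this document.
REQUIRED = ["sounddevice", "librosa", "funasr", "torch", "numpy"]


def find_missing(packages, finder=importlib.util.find_spec):
    """Return the packages that cannot be located, without importing them."""
    missing = []
    for name in packages:
        try:
            found = finder(name) is not None
        except (ImportError, ValueError):
            found = False  # treat lookup errors as "not installed"
        if not found:
            missing.append(name)
    return missing


missing = find_missing(REQUIRED)
print("missing:", ", ".join(missing) if missing else "none")
```

Using `importlib.util.find_spec` avoids loading torch just to confirm it is present, which keeps the check fast.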

Start Transcription

System audio (BlackHole):

cd ~/.openclaw/skills/realtime-transcription
python3 realtime_asr.py --source blackhole

Microphone:

cd ~/.openclaw/skills/realtime-transcription
python3 realtime_asr.py --source mic

With custom idle timeout (5 minutes):

cd ~/.openclaw/skills/realtime-transcription
python3 realtime_asr.py --source mic --idle-timeout 300

Disable idle timeout:

cd ~/.openclaw/skills/realtime-transcription
python3 realtime_asr.py --source mic --idle-timeout 0

Stop Transcription

Press Ctrl+C in the terminal, or run the following from the skill directory to stop the background process through its PID file:

kill $(cat .tmp/asr.pid 2>/dev/null) 2>/dev/null; rm -f .tmp/asr.pid
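
The same stop step can be done from Python. This is a sketch, assuming the PID file holds a single integer and that SIGTERM (the default signal of the shell `kill` above) is the intended way to stop the process; `stop_transcription` is a hypothetical helper, not part of the skill.

```python
import os
import signal
from pathlib import Path


def stop_transcription(pid_file=".tmp/asr.pid"):
    """Send SIGTERM to the PID recorded in pid_file, then remove the file.

    Returns True if a signal was sent, False if there was nothing to stop.
    """
    path = Path(pid_file)
    try:
        pid = int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        path.unlink(missing_ok=True)  # clear a missing or unreadable PID file
        return False
    try:
        os.kill(pid, signal.SIGTERM)
        sent = True
    except ProcessLookupError:
        sent = False  # the process had already exited
    path.unlink(missing_ok=True)
    return sent
```

Returning a boolean lets the caller distinguish "stopped just now" from "was not running", which matters for the crash-recovery case described under Error Handling.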

After Stopping — Summary & Archive

  1. Read the transcript: cat .tmp/transcript.txt
  2. Build the LLM prompt:

```bash
cd ~/.openclaw/skills/realtime-transcription
python3 -c "
from summary_prompt import build_summary_prompt
print(build_summary_prompt(open('.tmp/transcript.txt').read()))
"
```

  3. Send the prompt to yourself (the LLM) to generate TITLE + SUMMARY
  4. Parse and archive:

```bash
cd ~/.openclaw/skills/realtime-transcription
python3 -c "
from summary_prompt import parse_summary_response
from archiver import archive
transcript = open('.tmp/transcript.txt').read()
result = parse_summary_response('YOUR_LLM_RESPONSE_HERE')
path = archive(transcript, result['title'], result['summary'], 'blackhole')
print(f'Archived to: {path}')
"
```

CLI Reference

| Flag | Default | Description |
| --- | --- | --- |
| `--source` | `blackhole` | `blackhole` (system) or `mic` |
| `--output` | `.tmp/transcript.txt` | Transcript file path |
| `--state` | `.tmp/asr.pid` | PID file for process management |
| `--model` | `./model/SenseVoiceSmall` | SenseVoice model directory |
| `--idle-timeout` | `60` | Auto-stop after N seconds of silence (0=disable) |
| `--device` | `auto` | Audio device ID override |
| `--check-deps` | | Check dependencies and exit |
| `--install-deps` | | Install missing dependencies with progress output |
| `--list-devices` | | List available audio input devices |
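
For readers who want to wrap or test the CLI, the flags above map naturally onto an `argparse` parser. The sketch below is a hypothetical reconstruction from the documented flags and defaults, not the parser in realtime_asr.py; the `int` type for `--idle-timeout` and the `choices` restriction on `--source` are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser reproducing the documented flags and defaults."""
    p = argparse.ArgumentParser(prog="realtime_asr.py")
    p.add_argument("--source", choices=["blackhole", "mic"], default="blackhole")
    p.add_argument("--output", default=".tmp/transcript.txt")
    p.add_argument("--state", default=".tmp/asr.pid")
    p.add_argument("--model", default="./model/SenseVoiceSmall")
    p.add_argument("--idle-timeout", type=int, default=60)  # 0 disables
    p.add_argument("--device", default="auto")
    p.add_argument("--check-deps", action="store_true")
    p.add_argument("--install-deps", action="store_true")
    p.add_argument("--list-devices", action="store_true")
    return p


args = build_parser().parse_args(["--source", "mic", "--idle-timeout", "300"])
print(args.source, args.idle_timeout, args.output)
# prints: mic 300 .tmp/transcript.txt
```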

Trigger Words

| User says | Action |
| --- | --- |
| "开始转录" / "transcribe" / "启动转录" | Check deps → ask source → start |
| "停止" / "stop" | Stop process → summary → archive |
| "当前转录内容" | Show `.tmp/transcript.txt` |
| "检查依赖" | Run `--check-deps` |

Output Format

Transcript (.tmp/transcript.txt)

[14:30:00] 你好今天我们来讨论一下AI的发展
[14:30:05] AI技术在各个领域都有广泛应用
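
Because each transcript line carries a `[HH:MM:SS]` prefix, the file is easy to post-process. The sketch below parses lines of that shape and computes the time span; `parse_transcript` and `span_seconds` are hypothetical helpers, and the code assumes a recording does not cross midnight.

```python
import re
from datetime import datetime

# Matches lines such as "[14:30:00] some text".
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s*(.*)$")


def parse_transcript(text):
    """Return (time, text) pairs for lines shaped like '[HH:MM:SS] text'."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S").time()
            entries.append((stamp, match.group(2)))
    return entries


def _seconds(t):
    """Seconds since midnight for a datetime.time value."""
    return t.hour * 3600 + t.minute * 60 + t.second


def span_seconds(entries):
    """Seconds between the first and last entry (no midnight rollover)."""
    if len(entries) < 2:
        return 0
    return _seconds(entries[-1][0]) - _seconds(entries[0][0])
```

Lines that do not match the pattern are skipped rather than raising, so stray log output in the file does not break parsing.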

Archive (archive/YYYY/MM/DD-HHMM-title.md)

---
title: "AI发展趋势讨论"
date: 2025-05-16
time: "14:30 - 14:38"
source: blackhole
duration: 8m
---

## 摘要

- AI在医疗、金融、教育领域广泛应用
- 未来将更智能和普及

## 完整转录

[14:30:00] 你好今天我们来讨论一下AI的发展
...
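
To make the archive layout concrete, the sketch below builds the `archive/YYYY/MM/DD-HHMM-title.md` path and the front matter shown above. It is an illustration only: `slugify`, `archive_path`, and `front_matter` are hypothetical names, and archiver.py may sanitize titles or compute the duration differently.

```python
import re
from datetime import datetime
from pathlib import Path


def slugify(title):
    """Make a title safe for a filename; CJK characters are kept."""
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "-", title.strip())
    return cleaned.strip("-") or "untitled"


def archive_path(title, started, root="archive"):
    """Build archive/YYYY/MM/DD-HHMM-title.md from the start time."""
    name = f"{started:%d-%H%M}-{slugify(title)}.md"
    return Path(root) / f"{started:%Y}" / f"{started:%m}" / name


def front_matter(title, started, ended, source):
    """Render the YAML header; assumes the title contains no double quotes."""
    minutes = int((ended - started).total_seconds() // 60)
    return "\n".join([
        "---",
        f'title: "{title}"',
        f"date: {started:%Y-%m-%d}",
        f'time: "{started:%H:%M} - {ended:%H:%M}"',
        f"source: {source}",
        f"duration: {minutes}m",
        "---",
    ])


start = datetime(2025, 5, 16, 14, 30)
end = datetime(2025, 5, 16, 14, 38)
print(archive_path("AI发展趋势讨论", start).as_posix())
# prints: archive/2025/05/16-1430-AI发展趋势讨论.md
```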

Error Handling

| Scenario | Behavior |
| --- | --- |
| Missing dependencies | Refuse to start, show install instructions |
| BlackHole not found | Suggest `--source mic` |
| Process crashes | PID file gone → offer to recover |
| Empty transcript | Warn user, skip summary, no archive |
| No sound for N seconds | Exit code 42, ask user to continue |
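
The empty-transcript rule can be enforced with a small guard before the summary step. This is a sketch under the assumption that "empty" means a missing file or one containing only whitespace; `load_transcript` is a hypothetical helper.

```python
from pathlib import Path


def load_transcript(path=".tmp/transcript.txt"):
    """Return the transcript text, or None if the file is missing or blank."""
    p = Path(path)
    if not p.is_file():
        return None
    text = p.read_text(encoding="utf-8")
    return text if text.strip() else None


transcript = load_transcript()
if transcript is None:
    print("Transcript is empty: skipping summary and archive.")
```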

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Normal stop |
| 1 | Dependency check failed |
| 42 | Idle timeout: ask user "⏸️ 已 N 秒没有检测到声音,是否继续录音?(y/n)" ("No sound detected for N seconds. Continue recording? (y/n)") |
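
A caller that launches realtime_asr.py can branch on these exit codes. The sketch below shows one way to do so; the action names (`summarize`, `install-deps`, `ask-continue`, `report-error`) are hypothetical labels chosen for illustration, not part of the skill.

```python
import subprocess

EXIT_OK, EXIT_DEPS, EXIT_IDLE = 0, 1, 42


def next_action(returncode):
    """Map an exit code from the table above to a follow-up step."""
    if returncode == EXIT_OK:
        return "summarize"      # normal stop: build the summary and archive
    if returncode == EXIT_DEPS:
        return "install-deps"   # dependency check failed
    if returncode == EXIT_IDLE:
        return "ask-continue"   # idle timeout: ask whether to keep recording
    return "report-error"       # any code not listed in the table


def run_and_classify(cmd):
    """Run a command to completion and classify its exit code."""
    completed = subprocess.run(cmd)
    return next_action(completed.returncode)
```

For example, `run_and_classify(["python3", "realtime_asr.py", "--source", "mic"])` would return `"ask-continue"` if the process exits with code 42.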

Version History

1 version in total

  • v1.0.0 Initial release (current)
    2026-05-17 13:45 (both security scans: safe)

Security Checks

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
