
TubeScribe

YouTube video summarizer with speaker detection, formatted documents, and audio output. Works out of the box with macOS built-in TTS. Optional recommended tools (pandoc, ffmpeg, mlx-audio) enhance quality. Requires internet for YouTube access. No paid APIs or subscriptions. Use when user sends a YouTube URL or asks to summarize/transcribe a YouTube video.
Author: matusvojtek
Category: Content creation · clawhub v1.1.8 · API key: not required

Overview

TubeScribe 🎬

Turn any YouTube video into a polished document + audio summary.

Drop a YouTube link → get a beautiful transcript with speaker labels, key quotes, timestamps that link back to the video, and an audio summary you can listen to on the go.

💸 Free & No Paid APIs

  • No subscriptions or API keys — works out of the box
  • Local processing — transcription, speaker detection, and TTS run on your machine
  • Network access — fetching from YouTube (captions, metadata, comments) requires internet
  • No data uploaded — nothing is sent to external services; all processing stays on your machine
  • Safe sub-agent — spawned sub-agent has strict instructions: no software installation, no network calls beyond YouTube

✨ Features

  • 📄 Transcript with summary and key quotes — Export as DOCX, HTML, or Markdown
  • 🎯 Smart Speaker Detection — Automatically identifies participants
  • 🔊 Audio Summaries — Listen to key points (MP3/WAV)
  • 📝 Clickable Timestamps — Every quote links directly to that moment in the video
  • 💬 YouTube Comments — Viewer sentiment analysis and best comments
  • 📋 Queue Support — Send multiple links, they get processed in order
  • 🚀 Non-Blocking Workflow — Conversation continues while video processes in background

🎬 Works With Any Video

  • Interviews & podcasts (multi-speaker detection)
  • Lectures & tutorials (single speaker)
  • Music videos (lyrics extraction)
  • News & documentaries
  • Any YouTube content with captions

Quick Start

When user sends a YouTube URL:

  1. Spawn sub-agent with the full pipeline task immediately
  2. Reply: "🎬 TubeScribe is processing — I'll let you know when it's ready!"
  3. Continue conversation (don't wait!)
  4. Sub-agent notification will announce completion with title and details

DO NOT BLOCK — spawn and move on instantly.
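
The Quick Start trigger depends on recognising a YouTube URL in the user's message. The sketch below shows one way to do that check before spawning the sub-agent. It is an illustration only: `extract_video_id`, the host list, and the 11-character ID pattern are assumptions made for this example, not TubeScribe's own validation logic.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hosts treated as YouTube in this sketch (an assumption, not TubeScribe's list).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}
# Assumes the commonly seen 11-character video ID shape.
VIDEO_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID if `url` looks like a YouTube video URL, else None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if parsed.scheme not in ("http", "https") or host not in YOUTUBE_HOSTS:
        return None
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif parsed.path == "/watch":
        # Standard links carry the ID in the `v` query parameter.
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    else:
        # /shorts/<id>, /embed/<id>, /live/<id>
        parts = [p for p in parsed.path.split("/") if p]
        known = ("shorts", "embed", "live")
        candidate = parts[1] if len(parts) >= 2 and parts[0] in known else ""
    return candidate if VIDEO_ID_RE.match(candidate) else None
```

A caller would spawn the sub-agent only when this returns an ID, and otherwise treat the message as ordinary conversation.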

First-Time Setup

Run setup to check dependencies and configure defaults:

python skills/tubescribe/scripts/setup.py

This checks: summarize CLI, pandoc, ffmpeg, Kokoro TTS

Full Workflow (Single Sub-Agent)

Spawn ONE sub-agent that does the entire pipeline:

sessions_spawn(
    task=f"""
## TubeScribe: Process {youtube_url}

⚠️ CRITICAL: Do NOT install any software.
No pip, brew, curl, venv, or binary downloads.
If a tool is missing, STOP and report what's needed.

Run the COMPLETE pipeline — do not stop until all steps are done.

### Step 1: Extract

python3 skills/tubescribe/scripts/tubescribe.py "{youtube_url}"

Note the **Source** and **Output** paths printed by the script. Use those exact paths in subsequent steps.

### Step 2: Read source JSON
Read the Source path from Step 1 output and note:
- metadata.title (for filename)
- metadata.video_id
- metadata.channel, upload_date, duration_string

### Step 3: Create formatted markdown
Write to the Output path from Step 1:

1. `# **<title>**`
---
2. Video info block — Channel, Date, Duration, URL (clickable). Empty line between each field.
---
3. `## **Participants**` — table with bold headers:
   ```
   | **Name** | **Role** | **Description** |
   |----------|----------|-----------------|
   ```
---
4. `## **Summary**` — 3-5 paragraphs of prose
---
5. `## **Key Quotes**` — 5 best with clickable YouTube timestamps. Format each as:
   ```
   "Quote text here." - [12:34](https://www.youtube.com/watch?v=ID&t=754s)

   "Another quote." - [25:10](https://www.youtube.com/watch?v=ID&t=1510s)
   ```
   Use regular dash `-`, NOT em dash `—`. Do NOT use blockquotes `>`. Plain paragraphs only.
---
6. `## **Viewer Sentiment**` (if comments exist)
---
7. `## **Best Comments**` (if comments exist) — Top 5, NO separator lines between them:
   ```
   Comment text here.

   *- ▲ 123 @AuthorName*

   Next comment text here.

   *- ▲ 45 @AnotherAuthor*
   ```
   Attribution line: dash + italic. Just blank line between comments, NO `---` separators.

---
8. `## **Full Transcript**` — merge segments, speaker labels, clickable timestamps

### Step 4: Create DOCX
Clean the title for filename (remove special chars), then:

pandoc -o ~/Documents/TubeScribe/.docx


### Step 5: Generate audio
Write the summary text to a temp file, then use TubeScribe's built-in audio generation:

# Write summary to temp file (use python3 to write, avoids shell escaping issues)
python3 -c "
text = '''YOUR SUMMARY TEXT HERE'''
with open('/tubescribe__summary.txt', 'w') as f:
    f.write(text)
"

# Generate audio (auto-detects engine, voice, format from config)
python3 skills/tubescribe/scripts/tubescribe.py \
  --generate-audio /tubescribe__summary.txt \
  --audio-output ~/Documents/TubeScribe/_summary

This reads `~/.tubescribe/config.json` and uses the configured TTS engine (mlx/kokoro/builtin), voice blend, and speed automatically. Output format (mp3/wav) comes from config.

### Step 6: Cleanup

python3 skills/tubescribe/scripts/tubescribe.py --cleanup


### Step 7: Open folder

open ~/Documents/TubeScribe/


### Report
Tell what was created: DOCX name, MP3 name + duration, video stats.
""",
    label="tubescribe",
    runTimeoutSeconds=900,
    cleanup="delete"
)

After spawning, reply immediately:

> 🎬 TubeScribe is processing - I'll let you know when it's ready!

Then continue the conversation. The sub-agent notification announces completion.
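
The Key Quotes format in Step 3 pairs a human-readable timestamp with a `&t=<seconds>s` link. As a small illustration of that arithmetic, the sketch below turns an offset in seconds into the quote line described above. The helper names `format_timestamp` and `quote_line` are hypothetical and not part of TubeScribe.

```python
def format_timestamp(seconds: int) -> str:
    """Render an offset in seconds as M:SS, or H:MM:SS past one hour."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def quote_line(text: str, video_id: str, seconds: int) -> str:
    """Build one Key Quotes line: quoted text, regular dash, clickable timestamp."""
    label = format_timestamp(seconds)
    url = f"https://www.youtube.com/watch?v={video_id}&t={int(seconds)}s"
    return f'"{text}" - [{label}]({url})'


print(quote_line("Quote text here.", "ID", 754))
# "Quote text here." - [12:34](https://www.youtube.com/watch?v=ID&t=754s)
```

The 754-second offset is 12 minutes and 34 seconds, which matches the example in the task prompt.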

Configuration

Config file: ~/.tubescribe/config.json

{
  "output": {
    "folder": "~/Documents/TubeScribe",
    "open_folder_after": true,
    "open_document_after": false,
    "open_audio_after": false
  },
  "document": {
    "format": "docx",
    "engine": "pandoc"
  },
  "audio": {
    "enabled": true,
    "format": "mp3",
    "tts_engine": "mlx"
  },
  "mlx_audio": {
    "path": "~/.openclaw/tools/mlx-audio",
    "model": "mlx-community/Kokoro-82M-bf16",
    "voice": "af_heart",
    "lang_code": "a",
    "speed": 1.05
  },
  "kokoro": {
    "path": "~/.openclaw/tools/kokoro",
    "voice_blend": { "af_heart": 0.6, "af_sky": 0.4 },
    "speed": 1.05
  },
  "processing": {
    "subagent_timeout": 600,
    "cleanup_temp_files": true
  }
}
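
A partial config file is easier to work with when missing keys fall back to defaults. The sketch below shows one way to read `~/.tubescribe/config.json` and overlay it on a subset of the defaults listed above. Whether TubeScribe itself merges this way is an assumption; `DEFAULTS`, `deep_merge`, and `load_config` are illustrative names.

```python
import copy
import json
from pathlib import Path
from typing import Any, Dict

# Subset of the documented defaults, reproduced for illustration.
DEFAULTS: Dict[str, Any] = {
    "output": {"folder": "~/Documents/TubeScribe", "open_folder_after": True},
    "document": {"format": "docx", "engine": "pandoc"},
    "audio": {"enabled": True, "format": "mp3", "tts_engine": "mlx"},
    "processing": {"subagent_timeout": 600, "cleanup_temp_files": True},
}


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `base` with `override` applied section by section."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(path: str = "~/.tubescribe/config.json") -> Dict[str, Any]:
    """Load the user config if present; otherwise return the defaults."""
    config_path = Path(path).expanduser()
    if not config_path.is_file():
        return copy.deepcopy(DEFAULTS)
    with config_path.open(encoding="utf-8") as handle:
        return deep_merge(DEFAULTS, json.load(handle))
```

With this approach a file containing only `{"audio": {"format": "wav"}}` changes the audio format and leaves every other setting at its default.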

Output Options

| Option | Default | Description |
|--------|---------|-------------|
| output.folder | ~/Documents/TubeScribe | Where to save files |
| output.open_folder_after | true | Open output folder when done |
| output.open_document_after | false | Auto-open generated document |
| output.open_audio_after | false | Auto-open generated audio summary |

Document Options

| Option | Default | Values | Description |
|--------|---------|--------|-------------|
| document.format | docx | docx, html, md | Output format |
| document.engine | pandoc | pandoc | Converter for DOCX (falls back to HTML) |

Audio Options

| Option | Default | Values | Description |
|--------|---------|--------|-------------|
| audio.enabled | true | true, false | Generate audio summary |
| audio.format | mp3 | mp3, wav | Audio format (mp3 needs ffmpeg) |
| audio.tts_engine | mlx | mlx, kokoro, builtin | TTS engine (mlx = fastest on Apple Silicon) |

MLX-Audio Options (preferred on Apple Silicon)

| Option | Default | Description |
|--------|---------|-------------|
| mlx_audio.path | ~/.openclaw/tools/mlx-audio | mlx-audio venv location |
| mlx_audio.model | mlx-community/Kokoro-82M-bf16 | MLX model to use |
| mlx_audio.voice | af_heart | Voice preset (used if no voice_blend) |
| mlx_audio.voice_blend | {af_heart: 0.6, af_sky: 0.4} | Custom voice mix (weighted blend) |
| mlx_audio.lang_code | a | Language code (a=US English) |
| mlx_audio.speed | 1.05 | Playback speed (1.0 = normal, 1.05 = 5% faster) |

Kokoro PyTorch Options (fallback)

| Option | Default | Description |
|--------|---------|-------------|
| kokoro.path | ~/.openclaw/tools/kokoro | Kokoro repo location |
| kokoro.voice_blend | {af_heart: 0.6, af_sky: 0.4} | Custom voice mix |
| kokoro.speed | 1.05 | Playback speed (1.0 = normal, 1.05 = 5% faster) |
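
The `audio.tts_engine` setting names a preferred engine, and the Kokoro section is labelled a fallback. One way such a fallback could be resolved is sketched below. The order mlx, then kokoro, then builtin, and the idea of checking that the configured tool path exists, are assumptions for illustration; the real selection logic lives in `tubescribe.py` and may differ. `pick_tts_engine` is a hypothetical name.

```python
from pathlib import Path
from typing import Any, Callable, Dict

# Assumed fallback order for this sketch; builtin (macOS TTS) is the last resort.
FALLBACK_ORDER = ("mlx", "kokoro", "builtin")
# Config section that holds each engine's install path.
PATH_SECTIONS = {"mlx": "mlx_audio", "kokoro": "kokoro"}


def _path_exists(path: str) -> bool:
    return Path(path).expanduser().exists()


def pick_tts_engine(
    config: Dict[str, Any],
    exists: Callable[[str], bool] = _path_exists,
) -> str:
    """Return the first usable engine, starting from the configured preference."""
    preferred = config.get("audio", {}).get("tts_engine", "builtin")
    start = FALLBACK_ORDER.index(preferred) if preferred in FALLBACK_ORDER else 0
    for engine in FALLBACK_ORDER[start:]:
        if engine == "builtin":
            return engine
        path = config.get(PATH_SECTIONS[engine], {}).get("path")
        if path and exists(path):
            return engine
    return "builtin"
```

Passing `exists` as a parameter keeps the function testable without touching the filesystem.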

Processing Options

| Option | Default | Description |
|--------|---------|-------------|
| processing.subagent_timeout | 600 | Seconds for sub-agent (increase for long videos) |
| processing.cleanup_temp_files | true | Remove /tmp files after completion |

Comment Options

| Option | Default | Description |
|--------|---------|-------------|
| comments.max_count | 50 | Number of comments to fetch |
| comments.timeout | 90 | Timeout for comment fetching (seconds) |

Queue Options

| Option | Default | Description |
|--------|---------|-------------|
| queue.stale_minutes | 30 | Consider a processing job stale after this many minutes |
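
The `queue.stale_minutes` setting implies a comparison between a job's start time and the current time. A minimal sketch of that check follows, assuming the start time is stored as a timezone-aware datetime; `is_stale` is a hypothetical helper, and how TubeScribe actually records job start times is not stated here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(
    started_at: datetime,
    now: Optional[datetime] = None,
    stale_minutes: int = 30,
) -> bool:
    """Return True when a job has been running longer than `stale_minutes`."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - started_at > timedelta(minutes=stale_minutes)
```

A job that started exactly 30 minutes ago is not yet stale under this definition; it becomes stale once the elapsed time exceeds the threshold.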

Output Structure

~/Documents/TubeScribe/
├── {Video Title}.html         # Formatted document (or .docx / .md)
└── {Video Title}_summary.mp3  # Audio summary (or .wav)

After generation, opens the folder (not individual files) so you can access everything.
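
Output files are named after the video title, and Step 4 of the workflow asks for special characters to be removed first. The sketch below shows one possible cleaning rule. The exact set of characters removed, the length cap, and the `untitled` fallback are assumptions for illustration; `safe_filename` is a hypothetical helper rather than TubeScribe's own function.

```python
import re
import unicodedata


def safe_filename(title: str, max_length: int = 120) -> str:
    """Strip characters that are unsafe in filenames and tidy whitespace."""
    normalized = unicodedata.normalize("NFKC", title)
    # Drop path separators, reserved punctuation, and control characters.
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "", normalized)
    # Collapse runs of whitespace, then trim stray spaces and dots at the ends.
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    return cleaned[:max_length].rstrip(" .") or "untitled"


print(safe_filename('What is AI? / Part 1: "Intro"'))
# What is AI Part 1 Intro
```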

Dependencies

Required:

  • summarize CLI — brew install steipete/tap/summarize
  • Python 3.8+

Optional (better quality):

  • pandoc — DOCX output: brew install pandoc
  • ffmpeg — MP3 audio: brew install ffmpeg
  • yt-dlp — YouTube comments: brew install yt-dlp
  • mlx-audio — Fastest TTS on Apple Silicon: pip install mlx-audio (uses MLX backend for Kokoro)
  • Kokoro TTS — PyTorch fallback: see https://github.com/hexgrad/kokoro

yt-dlp Search Paths

TubeScribe checks these locations (in order):

| Priority | Path | Source |
|----------|------|--------|
| 1 | which yt-dlp | System PATH |
| 2 | /opt/homebrew/bin/yt-dlp | Homebrew (Apple Silicon) |
| 3 | /usr/local/bin/yt-dlp | Homebrew (Intel) / Linux |
| 4 | ~/.local/bin/yt-dlp | pip install --user |
| 5 | ~/.local/pipx/venvs/yt-dlp/bin/yt-dlp | pipx |
| 6 | ~/.openclaw/tools/yt-dlp/yt-dlp | TubeScribe auto-install |

If not found, setup downloads a standalone binary to the tools directory.

The tools directory version doesn't conflict with system installations.
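
The lookup order above can be expressed as a PATH search followed by a list of fixed locations. The sketch below illustrates that order using only the standard library. It is not TubeScribe's actual code; `find_yt_dlp` and `CANDIDATE_PATHS` are illustrative names.

```python
import os
import shutil
from pathlib import Path
from typing import Optional, Sequence

# Fixed locations from the table above, in priority order (2 through 6).
CANDIDATE_PATHS = (
    "/opt/homebrew/bin/yt-dlp",
    "/usr/local/bin/yt-dlp",
    "~/.local/bin/yt-dlp",
    "~/.local/pipx/venvs/yt-dlp/bin/yt-dlp",
    "~/.openclaw/tools/yt-dlp/yt-dlp",
)


def find_yt_dlp(candidates: Sequence[str] = CANDIDATE_PATHS) -> Optional[str]:
    """Return the first usable yt-dlp executable, or None if none is found."""
    # Priority 1: whatever the system PATH resolves to.
    on_path = shutil.which("yt-dlp")
    if on_path:
        return on_path
    # Priorities 2-6: known install locations, checked in order.
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_file() and os.access(path, os.X_OK):
            return str(path)
    return None
```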

Queue Handling

When user sends multiple YouTube URLs while one is processing:

Check Before Starting

python skills/tubescribe/scripts/tubescribe.py --queue-status

If Already Processing

# Add to queue instead of starting parallel processing
python skills/tubescribe/scripts/tubescribe.py --queue-add "NEW_URL"
# → Replies: "📋 Added to queue (position 2)"

After Completion

# Check if more in queue
python skills/tubescribe/scripts/tubescribe.py --queue-next
# → Automatically pops and processes next URL

Queue Commands

| Command | Description |
|---------|-------------|
| --queue-status | Show what's processing + queued items |
| --queue-add URL | Add URL to queue |
| --queue-next | Process next item from queue |
| --queue-clear | Clear entire queue |
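
The four queue commands map onto a small first-in, first-out structure with one slot for the job currently processing. The in-memory sketch below illustrates that behaviour. The real queue presumably persists its state between invocations, which is an assumption not confirmed here, and `UrlQueue` is a hypothetical class whose position numbering may not match the script's.

```python
from collections import deque
from typing import Any, Deque, Dict, Optional


class UrlQueue:
    """In-memory model of the queue commands (status, add, next, clear)."""

    def __init__(self) -> None:
        self.current: Optional[str] = None   # URL being processed, if any
        self.pending: Deque[str] = deque()   # URLs waiting, oldest first

    def status(self) -> Dict[str, Any]:
        """Mirror of --queue-status: what is processing and what is queued."""
        return {"processing": self.current, "queued": list(self.pending)}

    def add(self, url: str) -> int:
        """Mirror of --queue-add: append and return the count of waiting URLs."""
        self.pending.append(url)
        return len(self.pending)

    def next(self) -> Optional[str]:
        """Mirror of --queue-next: pop the oldest URL and mark it as current."""
        self.current = self.pending.popleft() if self.pending else None
        return self.current

    def clear(self) -> None:
        """Mirror of --queue-clear: drop everything that is waiting."""
        self.pending.clear()
```

Using a deque keeps both `add` and `next` constant-time, and processing strictly from the left preserves the order in which links were sent.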

Batch Processing (multiple URLs at once)

python skills/tubescribe/scripts/tubescribe.py url1 url2 url3

Processes all URLs sequentially with a summary at the end.
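
Sequential batch processing with an end-of-run summary can be pictured as a loop that records each outcome and keeps going after a failure. The sketch below is an illustration under that assumption. `process_batch` is a hypothetical function, and `process_one` stands in for whatever handles a single video; neither is part of TubeScribe's API.

```python
from typing import Callable, Iterable, List, Tuple


def process_batch(
    urls: Iterable[str],
    process_one: Callable[[str], None],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Process URLs in order; return (succeeded, failed-with-reason)."""
    succeeded: List[str] = []
    failed: List[Tuple[str, str]] = []
    for url in urls:
        try:
            process_one(url)
        except Exception as exc:  # record the failure and move to the next URL
            failed.append((url, str(exc)))
        else:
            succeeded.append(url)
    return succeeded, failed
```

The two returned lists are enough to print a summary such as how many videos were processed and which ones were skipped, along with the reason for each.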

Error Handling

The script detects and reports these errors with clear messages:

| Error | Message |
|-------|---------|
| Invalid URL | ❌ Not a valid YouTube URL |
| Private video | ❌ Video is private — can't access |
| Video removed | ❌ Video not found or removed |
| No captions | ❌ No captions available for this video |
| Age-restricted | ❌ Age-restricted video — can't access without login |
| Region-blocked | ❌ Video blocked in your region |
| Live stream | ❌ Live streams not supported — wait until it ends |
| Network error | ❌ Network error — check your connection |
| Timeout | ❌ Request timed out — try again later |

When an error occurs, report it to the user and don't proceed with that video.

Tips

  • For long videos (>30 min), increase processing.subagent_timeout to 900 seconds
  • Speaker detection works best with clear interview/podcast formats
  • Single-speaker videos (tutorials, lectures) skip speaker labels automatically
  • Timestamps link directly to YouTube at that moment
  • Use batch mode for multiple videos: python skills/tubescribe/scripts/tubescribe.py url1 url2 url3

Version History

1 version total

  • v1.1.8 (current)
    2026-03-28 09:53 · Safe

Security Checks

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
