podcast-intel

Author: hbmartin | Source: clawhub | Version: v1.0.0 (1 version) | Key: not required

Overview

Turns your Overcast listening history into a structured knowledge base, then surfaces insights from recent episodes and connects them to your current work and interests.

Built on three tools by Harold Martin:

  • overcast-to-sqlite — https://github.com/hbmartin/overcast-to-sqlite
  • podcast-transcript-convert — https://github.com/hbmartin/podcast-transcript-convert
  • podcast-chapter-tools — https://github.com/hbmartin/podcast-chapter-tools

One-Time Setup

1. Install the tools

uv tool install overcast-to-sqlite
uv tool install podcast-transcript-convert

2. Authenticate with Overcast

overcast-to-sqlite auth
# Logs into Overcast and saves an auth cookie to ./auth.json
# Your password is NOT saved — only the session cookie

Store auth.json somewhere stable, e.g. ~/.overcast/auth.json:

mkdir -p ~/.overcast
mv auth.json ~/.overcast/auth.json

3. Run the first full sync (takes a while the first time)

overcast-to-sqlite all -a ~/.overcast/auth.json ~/.overcast/overcast.db -v

This runs save → extend → transcripts → chapters sequentially.

First run downloads XML for every subscribed feed — may take several minutes.

Transcripts are saved to ~/.overcast/archive/transcripts/ by default.


Daily Sync

Run this to pull in the latest listening activity:

overcast-to-sqlite all -a ~/.overcast/auth.json ~/.overcast/overcast.db

Or for a faster update (skips feed XML re-download):

overcast-to-sqlite save -a ~/.overcast/auth.json ~/.overcast/overcast.db
overcast-to-sqlite transcripts -a ~/.overcast/auth.json ~/.overcast/overcast.db

To fetch transcripts for starred episodes only:

overcast-to-sqlite transcripts -s -a ~/.overcast/auth.json ~/.overcast/overcast.db

> Suggested cron schedule: run overcast-to-sqlite all once daily, e.g. at 4am, before any morning digest jobs that depend on it. Use launchd on macOS or cron on Linux.
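For a scheduled job, the sync can be wrapped in a small script that cron or launchd invokes. The following is a minimal sketch, not part of overcast-to-sqlite: `build_sync_command` and `run_sync` are hypothetical helper names, and the only CLI pieces used are the `all` subcommand and the `-a` and `-v` flags shown above.

```python
import subprocess
from pathlib import Path

# Default locations used throughout this guide.
AUTH_PATH = Path.home() / ".overcast" / "auth.json"
DB_PATH = Path.home() / ".overcast" / "overcast.db"


def build_sync_command(auth_path=AUTH_PATH, db_path=DB_PATH, verbose=False):
    """Return the argv list for `overcast-to-sqlite all` (hypothetical helper)."""
    cmd = ["overcast-to-sqlite", "all", "-a", str(auth_path), str(db_path)]
    if verbose:
        cmd.append("-v")
    return cmd


def run_sync(verbose=False):
    """Run the full sync and return its exit code (non-zero means failure)."""
    return subprocess.run(build_sync_command(verbose=verbose)).returncode
```

A scheduled script would call `run_sync()` and exit with its return value, so that a failed sync is visible to the scheduler and to any digest job that runs afterwards.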


Querying Recent Listening

Use these SQL queries against ~/.overcast/overcast.db:

Episodes played or significantly progressed in the last 24 hours

SELECT
  e.title,
  f.title AS podcast,
  e.overcastUrl,
  e.userRecommendedDate,
  e.transcriptDownloadPath,
  e.progress,
  e.played
FROM episodes e
JOIN feeds f ON e.feedId = f.overcastId
WHERE (e.played = 1 OR e.progress > 300)
  AND e.userUpdatedDate >= datetime('now', '-1 day')
ORDER BY
  e.userRecommendedDate DESC,
  e.userUpdatedDate DESC;
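The same query can be run from Python with the standard-library `sqlite3` module. This is a sketch under assumptions: the function name `recent_episodes` and its `days` and `min_progress` parameters are illustrative, and it relies, as the query above does, on `userUpdatedDate` being stored in a form that compares correctly with SQLite's `datetime()` output.

```python
import sqlite3

RECENT_SQL = """
SELECT e.title, f.title AS podcast, e.overcastUrl, e.userRecommendedDate,
       e.transcriptDownloadPath, e.progress, e.played
FROM episodes e
JOIN feeds f ON e.feedId = f.overcastId
WHERE (e.played = 1 OR e.progress > ?)
  AND e.userUpdatedDate >= datetime('now', ?)
ORDER BY e.userRecommendedDate DESC, e.userUpdatedDate DESC
"""


def recent_episodes(conn, days=1, min_progress=300):
    """Return recently played episodes as a list of dicts keyed by column name."""
    cur = conn.execute(RECENT_SQL, (min_progress, f"-{int(days)} day"))
    columns = [col[0] for col in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]
```

Typical use would be `recent_episodes(sqlite3.connect(db_path))` with the database path from this guide; passing `days=7` widens the window to a week.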

Starred episodes with transcripts available

SELECT
  e.title,
  f.title AS podcast,
  e.overcastUrl,
  e.userRecommendedDate,
  e.transcriptDownloadPath
FROM episodes_starred e
JOIN feeds f ON e.feedId = f.overcastId
WHERE e.transcriptDownloadPath IS NOT NULL
ORDER BY e.userRecommendedDate DESC
LIMIT 20;

Full-text search across chapter content

SELECT c.content, e.title, f.title AS podcast, c.time
FROM chapters_fts
JOIN chapters c ON chapters_fts.rowid = c.rowid
JOIN episodes e ON c.enclosureUrl = e.enclosureUrl
JOIN feeds f ON e.feedId = f.overcastId
WHERE chapters_fts MATCH 'your search term'
ORDER BY rank;
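FTS5 parses the MATCH string as a query expression, so free text containing hyphens, colons, or quotes can raise a syntax error. One way to handle this, sketched below, is to wrap the text in double quotes as a single FTS5 phrase and pass it as a bound parameter. The helper names `fts5_phrase` and `search_chapters` and the `limit` parameter are illustrative, not part of the tools.

```python
def fts5_phrase(term):
    """Quote free text as one FTS5 phrase; embedded double quotes are doubled."""
    return '"' + term.replace('"', '""') + '"'


CHAPTER_SEARCH_SQL = """
SELECT c.content, e.title, f.title AS podcast, c.time
FROM chapters_fts
JOIN chapters c ON chapters_fts.rowid = c.rowid
JOIN episodes e ON c.enclosureUrl = e.enclosureUrl
JOIN feeds f ON e.feedId = f.overcastId
WHERE chapters_fts MATCH ?
ORDER BY rank
LIMIT ?
"""


def search_chapters(conn, term, limit=20):
    """Run the chapter search above with the term bound as a quoted phrase."""
    return conn.execute(CHAPTER_SEARCH_SQL, (fts5_phrase(term), limit)).fetchall()
```

Quoting changes the meaning slightly: a phrase matches only when the words appear next to each other in that order, whereas unquoted words are matched independently. Pass the raw string to MATCH instead when FTS5 operators such as AND, OR, or prefix `*` are wanted.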

Processing Transcripts

Transcripts are stored in mixed formats (SRT, WebVTT, HTML, JSON).

Use podcast-transcript-convert to normalize them to PodcastIndex JSON:

transcript2json ~/.overcast/archive/transcripts/ ~/.overcast/archive/transcripts-json/

To read a transcript as plain text for LLM analysis, parse the JSON:

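python3 -c "
import json, sys
# Print one 'speaker body' line per segment of a PodcastIndex JSON transcript
with open(sys.argv[1], encoding='utf-8') as f:
    data = json.load(f)
for seg in data.get('segments', []):
    print(seg.get('speaker', ''), seg.get('body', ''))
" ~/.overcast/archive/transcripts-json/episode.json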
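For longer transcripts, one line per segment is noisy input for an LLM. The sketch below merges consecutive segments from the same speaker and prefixes each line with a timestamp. It assumes each segment carries a `startTime` in seconds alongside `speaker` and `body`, as in the PodcastIndex JSON transcript format; the function names are illustrative.

```python
import json


def format_timestamp(seconds):
    """Render a number of seconds as H:MM:SS."""
    total = int(seconds)
    return f"{total // 3600}:{total % 3600 // 60:02d}:{total % 60:02d}"


def transcript_to_text(data):
    """Merge consecutive same-speaker segments into timestamped lines."""
    lines = []
    last_speaker = None
    for seg in data.get("segments", []):
        speaker = seg.get("speaker") or ""
        body = (seg.get("body") or "").strip()
        if not body:
            continue  # skip empty segments
        if lines and speaker and speaker == last_speaker:
            lines[-1] += " " + body  # same named speaker continues
        else:
            stamp = format_timestamp(seg.get("startTime") or 0)
            prefix = f"[{stamp}] {speaker}: " if speaker else f"[{stamp}] "
            lines.append(prefix + body)
        last_speaker = speaker
    return "\n".join(lines)


def load_transcript_text(path):
    """Read a converted transcript file and return it as plain text."""
    with open(path, encoding="utf-8") as f:
        return transcript_to_text(json.load(f))
```

Segments without a speaker label are kept on separate lines so that their timestamps are not lost.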

LLM Analysis Workflow

When asked to analyze recent listening, follow this process:

Step 1 — Query the DB for recent episodes

Run the last-24h query above using the terminal tool against ~/.overcast/overcast.db.

Step 2 — Separate starred from non-starred

Episodes with a non-null userRecommendedDate are starred. Give these deeper treatment.
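As a small illustration of this rule, the rows returned by the last-24h query can be split in two. The sketch assumes each row is a dict keyed by column name; `split_by_starred` is a hypothetical helper.

```python
def split_by_starred(episodes):
    """Split query rows into (starred, played_only) by userRecommendedDate."""
    starred = [ep for ep in episodes if ep.get("userRecommendedDate") is not None]
    played_only = [ep for ep in episodes if ep.get("userRecommendedDate") is None]
    return starred, played_only
```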

Step 3 — Load and analyze transcripts

For each episode with a transcriptDownloadPath:

  1. Read the transcript file (convert if needed using transcript2json)
  2. Extract key concepts, claims, techniques, and names mentioned
  3. Note timestamps/chapters where important ideas appear
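The third item can be partly mechanized once a concept has been identified. The sketch below returns the start time and text of every segment that mentions a keyword. It assumes the converted PodcastIndex JSON layout with `startTime` and `body` fields; `find_mentions` is an illustrative name.

```python
def find_mentions(data, keyword):
    """Return (startTime, body) pairs for segments whose text contains keyword."""
    needle = keyword.casefold()
    hits = []
    for seg in data.get("segments", []):
        body = seg.get("body") or ""
        if needle in body.casefold():
            hits.append((seg.get("startTime"), body.strip()))
    return hits
```

This is plain substring matching, so it will miss paraphrases; it is meant for locating a term already known to matter, not for discovering concepts.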

For episodes without transcripts, use the episode description from episodes_extended.description.

Step 4 — Cross-reference with user interests

Ask the user what they are currently working on and interested in, or read from context.

For each episode, identify:

  • Direct connections to current projects or problems the user is solving
  • Techniques or frameworks mentioned that could be applied
  • People, papers, or tools referenced worth following up on
  • Contrarian or surprising takes worth sitting with

Step 5 — Format the output

Depth is determined by the user when invoking the skill. Default structure:

[Starred] Episode Title — Podcast Name

Summary: 2-3 sentence overview of what was covered

Key insight: The most actionable or interesting idea

Connections: How this relates to what the user is working on

Follow-up: Papers, people, tools, or questions worth pursuing

[Played] Episode Title — Podcast Name

One-line summary + any standout idea worth surfacing
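If the digest is assembled programmatically, the default structure can be rendered with two small helpers. This is only a sketch of the template above; the function and parameter names are illustrative, and the summary text itself still comes from the LLM analysis.

```python
SEP = " \u2014 "  # dash separator between episode title and podcast name


def format_starred(title, podcast, summary, insight, connections, follow_up):
    """Render the full entry used for starred episodes."""
    return "\n".join([
        f"[Starred] {title}{SEP}{podcast}",
        f"Summary: {summary}",
        f"Key insight: {insight}",
        f"Connections: {connections}",
        f"Follow-up: {follow_up}",
    ])


def format_played(title, podcast, one_liner):
    """Render the short entry used for played, non-starred episodes."""
    return f"[Played] {title}{SEP}{podcast}\n{one_liner}"
```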


Listening Stats

overcast-to-sqlite stats ~/.overcast/overcast.db

Shows: total episodes played, total listening time, starred count, top podcasts by time.


Searching Your History

overcast-to-sqlite search "reinforcement learning" ~/.overcast/overcast.db
overcast-to-sqlite search "agentic" ~/.overcast/overcast.db -l 5

Searches across episode titles, feed descriptions, and chapter content (FTS5).


Database Location

Default: ~/.overcast/overcast.db

Default transcript archive: ~/.overcast/archive/transcripts/

Default auth: ~/.overcast/auth.json

Override any path via CLI flags. See overcast-to-sqlite --help for full options.


Notes

  • Transcripts are only available for episodes where the podcast publisher provides them via the podcast:transcript RSS tag. Not all episodes have transcripts.

  • The extend command adds ~2MB per feed to the DB — expect a large file with many subscriptions
  • auth.json contains only a session cookie, not your password. Rotate it via overcast-to-sqlite auth
  • For starred-only transcript downloads use the -s flag on the transcripts command
  • Chapter FTS5 search is a powerful way to find where a specific topic was discussed across your entire listening history without reading full transcripts

Version history

1 version in total

  • v1.0.0 (current)
    2026-05-07 09:01, security status: safe

Security checks

  • Tencent Cloud Security (Keen): safe, no risk
  • Tencent Cloud Security (Sanbu): safe, no risk
