Use $youtube-interview-shorts-zh to turn this YouTube interview URL into multiple Chinese-subbed short clips.
Use this skill to convert one long YouTube talk, interview, or podcast into multiple short clips with Chinese hard subtitles.
work//source/ and work//clips/ .
If yt-dlp is available in the sandbox, run it directly: yt-dlp --write-subs --sub-lang en --sub-format srt -o "source/original.%(ext)s"
Otherwise, write a .command file in the workspace folder. A .command file is a shell script that auto-executes in Terminal.app when the user double-clicks it in Finder. Write the yt-dlp command into the file, instruct the user to double-click it, and wait for them to confirm the download is done before continuing.
Run python scripts/fix_srt.py clip.zh.srt clip.zh.srt to eliminate timestamp overlaps before burning (see Translation Rules below for why this is mandatory).
Select segments that can stand on their own without heavy context. Favor material that is:
Reject segments that are mostly setup, filler, greetings, sponsor reads, long digressions, or references that depend too much on earlier context.
Prefer clips with these properties:
When a promising moment needs context or a clean ending, extend the boundaries by a few seconds or even longer. Do not cut off the last sentence, and do not let clips overlap heavily unless the overlap is necessary.
Do not be overly conservative. For long interviews, the failure to guard against is too few candidates, not too many. A one-hour interview should usually yield many options for the user to review.
Use scripts/srt_to_json.py first so the subtitle file is easier to scan and quote precisely.
Then analyze the transcript in passes:
If the source subtitles are auto-generated and noisy, infer the intended meaning conservatively. Do not invent claims that are not supported by the spoken content.
Read references/analysis-prompt.md for the default prompts. Use Stage 1 first, then Stage 2 for user review formatting.
Translate the clip-local English SRT into a Simplified Chinese SRT.
Important — YouTube rolling SRT format: YouTube's auto-generated captions use a "rolling" style where adjacent cues heavily overlap in time (e.g., cue 1 ends at 5.7s but cue 2 already starts at 2.9s). This is by design for the web player, but when burned into video it causes 2–3 subtitle lines to appear on screen simultaneously, making them unreadable. After writing the Chinese SRT, always run scripts/fix_srt.py to eliminate overlaps before passing the file to burn_subtitles.py. This step is not optional — skipping it guarantees overlapping subtitles in the final video.
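To make the rolling-caption problem concrete, the following minimal sketch counts adjacent cue pairs that overlap in time. It is an illustration only, not part of the skill's scripts; the function name `count_overlaps` and the tuple layout are assumptions.

```python
def count_overlaps(cues):
    """Count adjacent cue pairs where the next cue starts before the current one ends.

    `cues` is a list of (start_seconds, end_seconds) tuples in display order.
    """
    return sum(
        1
        for (_, end), (next_start, _) in zip(cues, cues[1:])
        if next_start < end
    )


# Rolling-style timing like the example above: cue 1 ends at 5.7s,
# but cue 2 already starts at 2.9s.
rolling = [(0.0, 5.7), (2.9, 8.4), (5.7, 11.0)]
print(count_overlaps(rolling))
```

Any non-zero count on a Chinese SRT means it still needs scripts/fix_srt.py before burning.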
For each chosen clip, generate:
The title should feel clickable and opinionated. It may be provocative, contrarian, or tension-creating, but it must still be faithful to the speaker's meaning.
Keep the on-screen title around 12 Chinese characters when possible so it fits cleanly in the first-second overlay.
The description should mention:
Write the packaging copy to a text file such as analysis/clip-packaging.txt or clips/\
Use a layout like this unless the user asks for another structure:
work/<video-slug>/
  source/
    original.mp4
    original.en.srt
  analysis/
    transcript.json
    selected_clips.json
    candidate-review.txt
    clip-packaging.txt
  clips/
    01-<slug>/
      clip.mp4
      clip.en.srt
      clip.zh.srt
      clip.hardsub.mp4
      metadata.txt
The downloader may emit title-based filenames. If so, keep them, but normalize the per-clip folders.
Parse SRT into JSON records with cue index, start/end timestamps, start/end seconds, and text. Use this before transcript analysis.
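The record shape described above could look roughly like the sketch below. This is not the bundled script; the field names (`index`, `start`, `end`, `start_seconds`, `end_seconds`, `text`) are assumptions chosen to match the description, and the real output may differ.

```python
import json
import re

TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp):
    """Convert an SRT timestamp such as 00:01:02,500 to float seconds."""
    h, m, s, ms = map(int, TIME_RE.fullmatch(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(text):
    """Parse SRT text into a list of dict records (hypothetical field names)."""
    text = text.lstrip("\ufeff").replace("\r\n", "\n")
    records = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        # Skip malformed blocks that lack an index line and a timing line.
        if len(lines) < 2 or "-->" not in lines[1]:
            continue
        start, end = (part.strip() for part in lines[1].split("-->"))
        records.append({
            "index": int(lines[0]),
            "start": start,
            "end": end,
            "start_seconds": to_seconds(start),
            "end_seconds": to_seconds(end),
            "text": " ".join(lines[2:]).strip(),
        })
    return records


sample = "1\n00:00:01,000 --> 00:00:03,500\nHello\nworld\n\n2\n00:00:03,500 --> 00:00:05,000\nBye\n"
print(json.dumps(parse_srt(sample), ensure_ascii=False, indent=2))
```

Joining multi-line cue text with a space keeps each record quotable on one line during transcript analysis.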
Extract the subtitle cues that overlap a selected clip window and shift the timestamps so the new SRT starts at 00:00:00,000.
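A minimal sketch of that windowing and shifting logic follows. It assumes cues are `(start_seconds, end_seconds, text)` tuples and that cues straddling a boundary are clamped to the clip window; the bundled script may handle boundaries differently. All function names here are hypothetical.

```python
def fmt(seconds):
    """Format float seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def extract_window(cues, clip_start, clip_end):
    """Keep cues that overlap [clip_start, clip_end) and rebase them to zero."""
    out = []
    for start, end, text in cues:
        if end <= clip_start or start >= clip_end:
            continue  # entirely outside the clip window
        # Assumption: clamp partially overlapping cues to the window edges.
        new_start = max(start, clip_start) - clip_start
        new_end = min(end, clip_end) - clip_start
        out.append((new_start, new_end, text))
    return out


def to_srt(cues):
    """Serialize cues back to SRT text with fresh 1-based indices."""
    blocks = [
        f"{i}\n{fmt(s)} --> {fmt(e)}\n{t}\n"
        for i, (s, e, t) in enumerate(cues, 1)
    ]
    return "\n".join(blocks)


cues = [(58.0, 61.0, "a"), (61.0, 64.0, "b"), (200.0, 201.0, "c")]
print(to_srt(extract_window(cues, 60.0, 70.0)))
```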
Create a re-encoded MP4 clip for an exact time range. Use this for each selected segment.
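As an illustration of what such a cut can look like, the sketch below builds an ffmpeg command that re-encodes an exact time range. The codec choices (`libx264`, `aac`) and the function name `build_cut_command` are assumptions; the bundled script may use different settings.

```python
import subprocess


def build_cut_command(source, start, end, output):
    """Return an ffmpeg argv list that re-encodes [start, end) seconds of `source`."""
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.3f}",        # seek to the clip start
        "-i", source,
        "-t", f"{end - start:.3f}",   # clip duration in seconds
        "-c:v", "libx264",            # re-encode video (assumed codec)
        "-c:a", "aac",                # re-encode audio (assumed codec)
        "-movflags", "+faststart",
        output,
    ]


cmd = build_cut_command("source/original.mp4", 10, 25.5, "clips/01-demo/clip.mp4")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Re-encoding rather than stream-copying lets the cut land on the requested timestamps instead of the nearest keyframe.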
Fix YouTube-style rolling subtitle overlaps. Run this on every Chinese SRT before burning it into a clip. It trims each cue's end time so it does not overlap the next cue, adds an 80ms gap between cues, and ensures each cue shows for at least 800ms.
Usage: python scripts/fix_srt.py clip.zh.srt clip.zh.srt (in-place is safe)
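The trimming rules described above can be sketched as follows. This is not the bundled fix_srt.py. It assumes cues are `(start_ms, end_ms, text)` tuples sorted by start time, and it makes one assumption the description does not spell out: when the 800ms minimum pushes a cue past the next cue's start, the next cue's start is delayed to keep the 80ms gap.

```python
GAP_MS = 80    # gap inserted between consecutive cues
MIN_MS = 800   # minimum on-screen duration per cue


def fix_overlaps(cues):
    """Trim cue end times so consecutive cues never overlap."""
    fixed = []
    for i, (start, end, text) in enumerate(cues):
        if fixed:
            # Assumption: delay this cue if the previous one was extended.
            start = max(start, fixed[-1][1] + GAP_MS)
        if i + 1 < len(cues):
            # Trim the end so it stops GAP_MS before the next cue begins.
            end = min(end, cues[i + 1][0] - GAP_MS)
        # Enforce the minimum display time.
        end = max(end, start + MIN_MS)
        fixed.append((start, end, text))
    return fixed


rolling = [(0, 5700, "a"), (2900, 8400, "b"), (5700, 11000, "c")]
for cue in fix_overlaps(rolling):
    print(cue)
```

Working in integer milliseconds avoids floating-point drift when the result is written back as SRT timestamps.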
Burn a Chinese SRT into the clip using ffmpeg's subtitles filter.
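For orientation, a hard-subtitle burn with the ffmpeg subtitles filter can be sketched as below. This is not burn_subtitles.py. The function name `build_burn_command` is hypothetical, and the sketch assumes the SRT path is a simple relative filename with no characters that need filter escaping, for example when run from inside the clip folder.

```python
import subprocess


def build_burn_command(video, srt, output, font_name=None):
    """Return an ffmpeg argv list that burns `srt` into `video`."""
    vf = f"subtitles={srt}"
    if font_name:
        # force_style passes ASS style overrides to the subtitle renderer.
        vf += f":force_style='FontName={font_name}'"
    return ["ffmpeg", "-y", "-i", video, "-vf", vf, "-c:a", "copy", output]


cmd = build_burn_command(
    "clip.mp4", "clip.zh.srt", "clip.hardsub.mp4", font_name="Noto Sans CJK SC"
)
print(cmd)
# To actually run it: subprocess.run(cmd, check=True)
```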
Pass a title when exporting the final video so the first second includes:
Schema definition for selected_clips.json. Read this before writing candidate decisions.
Default prompts for transcript analysis. Contains Stage 1 (candidate identification) and Stage 2 (user review formatting) prompts.
Font Discovery: burn_subtitles.py already searches common CJK font paths automatically. Do not hardcode any machine-specific font path. If the user has a custom font they prefer, pass it via the --font argument. Otherwise let the script find a fallback (it will try NotoSansCJK, DroidSansFallbackFull, and others before defaulting to ffmpeg's built-in font).
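The fallback order described above amounts to "use the custom font if given, otherwise take the first candidate that exists". A minimal sketch of that idea is shown below; the candidate paths are illustrative guesses for a typical Linux system, not the list burn_subtitles.py actually searches, and `find_font` is a hypothetical name.

```python
from pathlib import Path

# Illustrative candidate paths only (assumed, not taken from the script).
CANDIDATE_FONTS = [
    "/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc",
    "/usr/share/fonts/truetype/droid/DroidSansFallbackFull.ttf",
]


def find_font(custom=None, candidates=CANDIDATE_FONTS):
    """Return the custom font if given, else the first existing candidate, else None."""
    if custom:
        return custom
    for path in candidates:
        if Path(path).is_file():
            return path
    return None  # caller falls back to ffmpeg's built-in font


print(find_font())
```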
macOS .command file pattern: When the sandbox has no internet access and yt-dlp must run on the user's machine instead, write a .command shell script to the workspace folder (the user's Desktop folder that is mounted into the sandbox). A .command file opens in Terminal.app and auto-executes when double-clicked in Finder. This lets the user run the download with a single click. After the download completes, the files appear in the shared workspace folder and the sandbox can read them normally.
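Writing such a file from the sandbox can be sketched as follows. The file name `download.command` and the function name are hypothetical, and the yt-dlp flags are copied from the direct-run command earlier in this document. The script sets the executable bit, which macOS requires before a .command file will run on double-click; whether that bit survives the mount into the user's folder is an assumption to verify.

```python
import shlex
import stat
from pathlib import Path


def write_command_file(workspace, url, name="download.command"):
    """Write a double-clickable macOS .command script that downloads `url`."""
    script = "\n".join([
        "#!/bin/bash",
        'cd "$(dirname "$0")"',   # run relative to the workspace folder
        "mkdir -p source",
        "yt-dlp --write-subs --sub-lang en --sub-format srt "
        f'-o "source/original.%(ext)s" {shlex.quote(url)}',
        "",
    ])
    path = Path(workspace) / name
    path.write_text(script, encoding="utf-8")
    # Mark the script as executable for the owner.
    path.chmod(path.stat().st_mode | stat.S_IXUSR)
    return path
```

After writing the file, ask the user to double-click it in Finder and confirm when the download has finished.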
This skill covers:
Return:
If the workflow cannot finish, report the exact blocker: missing cookies, failed download, missing English subtitles, missing ffmpeg, or unclear transcript quality.