Author: ericksun(孙自翔)
This skill uses Manim Community to generate mathematical and educational animations, with the manim-voiceover plugin providing TTS voice narration and synchronized subtitles. Rendering runs locally and no paid API is required; the default gTTS engine needs network access to Google's TTS service, while pyttsx3 works offline.
Core Capabilities:
TTS Engines (gTTS preferred):
Before first use, run the environment check script to verify all dependencies are ready:
python3 {SKILL_DIR}/scripts/check_environment.py
This script checks:
- Manim (manim command): pip install manim
- ffmpeg with the libx264 encoder for video rendering; subtitle burn-in requires libass
  - macOS (Homebrew): brew install ffmpeg (includes x264 and libass by default)
  - Conda: conda install x264 -c conda-forge (⚠️ conda's ffmpeg does not include libx264 by default)
  - Linux: sudo apt install ffmpeg libx264-dev libass-dev
# Core
pip install manim
# Voiceover + TTS
pip install "manim-voiceover[gtts]"
# Offline TTS alternative
pip install "manim-voiceover[pyttsx3]"
# Or install Manim and the gTTS voiceover extra in one command
pip install manim "manim-voiceover[gtts]"
# macOS (Homebrew) — Recommended, includes libx264 + libass
brew install ffmpeg
# macOS (Conda) — Requires additional x264 install, otherwise Manim render will fail with UnknownCodecError: libx264
conda install x264 -c conda-forge
# Verify ffmpeg supports libx264 and libass
ffmpeg -codecs 2>&1 | grep libx264 # Should show: encoders: libx264
ffmpeg -filters 2>&1 | grep subtitles # Should show: subtitles filter
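The two verification commands above can also be scripted. The following is a minimal sketch and not the contents of check_environment.py, which is not shown here; the function names output_mentions and check_ffmpeg are hypothetical.

```python
import shutil
import subprocess


def output_mentions(text: str, token: str) -> bool:
    """Return True if any line of the given ffmpeg output mentions the token."""
    return any(token in line for line in text.splitlines())


def check_ffmpeg() -> dict:
    """Run the same two ffmpeg queries as the shell commands and report support.

    Returns an empty dict when no ffmpeg binary is found on PATH.
    """
    if shutil.which("ffmpeg") is None:
        return {}
    results = {}
    for flag, token in (("-codecs", "libx264"), ("-filters", "subtitles")):
        proc = subprocess.run(
            ["ffmpeg", "-hide_banner", flag],
            capture_output=True,
            text=True,
            check=False,
        )
        # Search stdout and stderr together, mirroring the 2>&1 redirect.
        results[token] = output_mentions(proc.stdout + proc.stderr, token)
    return results


if __name__ == "__main__":
    print(check_ffmpeg() or "ffmpeg not found on PATH")
```

A False value for libx264 points to the conda x264 fix, and a False value for subtitles points to a missing libass.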
After the user describes their requirements, use the pipeline script for one-click execution:
python3 {SKILL_DIR}/scripts/run_pipeline.py \
--scene_file <scene_file.py> \
--scene_name <SceneName> \
--quality high \
--burn_subtitles
Common Options:
| Option | Default | Description |
|---|---|---|
| --scene_file | Required | Manim scene Python file |
| --scene_name | Required | Scene class name |
| --quality | high | Render quality: low/medium/high/production |
| --burn_subtitles | False | Whether to burn SRT subtitles with ffmpeg |
| --speed | 1.35 | Playback speed multiplier (e.g., 1.35 means 1.35x speed; set to 1.0 to disable) |
| --preview | False | Auto-open preview after rendering |
| --output_dir | ./output | Output directory |
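To make the defaults in the table concrete, here is a sketch of how such options could be declared with argparse. This is an illustration only: the actual run_pipeline.py is not shown in this document, and build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the options from the table above with the same defaults."""
    parser = argparse.ArgumentParser(description="Render a Manim scene (sketch)")
    parser.add_argument("--scene_file", required=True, help="Manim scene Python file")
    parser.add_argument("--scene_name", required=True, help="Scene class name")
    parser.add_argument(
        "--quality",
        default="high",
        choices=["low", "medium", "high", "production"],
    )
    parser.add_argument("--burn_subtitles", action="store_true")
    parser.add_argument("--speed", type=float, default=1.35)
    parser.add_argument("--preview", action="store_true")
    parser.add_argument("--output_dir", default="./output")
    return parser


# Only the two required options are given, so everything else takes its default.
args = build_parser().parse_args(["--scene_file", "demo.py", "--scene_name", "MyScene"])
print(args.quality, args.speed, args.burn_subtitles)  # prints: high 1.35 False
```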
Based on the user's description, generate a Manim scene Python file. Scene code should follow these patterns:
No-voiceover mode (animation only):
from manim import *


class MyScene(Scene):
    def construct(self):
        title = Text("Title", font_size=48, color=BLUE)
        self.play(Write(title))
        self.wait(1)
Voiceover mode (animation + voice + subtitles):
from manim import *
from manim_voiceover import VoiceoverScene
from manim_voiceover.services.gtts import GTTSService


class MyScene(VoiceoverScene):
    def _make_subtitle(self, text_str):
        """Create subtitle with dark background at bottom of screen."""
        sub = Text(text_str, font_size=22, color=WHITE, weight=BOLD)
        # Prevent subtitle from overflowing left/right edges
        max_width = config.frame_width - 1.0  # 0.5 margin each side
        if sub.width > max_width:
            sub.scale_to_fit_width(max_width)
        sub.to_edge(DOWN, buff=0.4)
        bg = BackgroundRectangle(sub, color=BLACK, fill_opacity=0.6, buff=0.15)
        return VGroup(bg, sub)

    def construct(self):
        self.set_speech_service(GTTSService(lang="en"))
        sub_text = "Welcome to the demo"
        with self.voiceover(text=sub_text) as tracker:
            sub = self._make_subtitle(sub_text)
            title = Text("Demo", font_size=48)
            self.play(Write(title), FadeIn(sub), run_time=tracker.duration)
        self.play(FadeOut(sub))
        self.wait(0.3)
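The width check in _make_subtitle() is simple arithmetic, sketched below without Manim. The sketch assumes Manim's default frame (height 8.0 at 16:9, so about 14.22 units wide); subtitle_scale is a hypothetical helper name, not part of Manim.

```python
# Manim's default frame: height 8.0 units at a 16:9 aspect ratio.
FRAME_WIDTH = 8.0 * 16 / 9


def subtitle_scale(text_width: float, frame_width: float = FRAME_WIDTH, margin: float = 0.5) -> float:
    """Return the factor by which a subtitle must be scaled to fit the frame.

    Mirrors the helper above: the usable width is the frame width minus a
    margin on each side, and text is only ever shrunk, never enlarged.
    """
    max_width = frame_width - 2 * margin
    if text_width <= max_width:
        return 1.0
    return max_width / text_width


# A subtitle 20 units wide is shrunk to roughly 66% of its size.
print(round(subtitle_scale(20.0), 3))
```

Because scale_to_fit_width scales uniformly, the effective font size shrinks by the same factor, so very long sentences may be better split across several voiceover blocks.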
Key Pattern — voiceover context manager:
with self.voiceover(text="Speech text") as tracker:
    # tracker.duration = TTS speech duration (seconds)
    # Animations within this block auto-sync with voice
    self.play(SomeAnimation(), run_time=tracker.duration)
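When one voiceover block contains several animations, their run times should add up to tracker.duration. The helper below is one possible way to divide the duration proportionally; split_duration is a hypothetical name and is not part of manim-voiceover.

```python
def split_duration(total: float, weights: list) -> list:
    """Split a total duration into parts proportional to the given weights."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not weights or any(w <= 0 for w in weights):
        raise ValueError("weights must be a non-empty list of positive numbers")
    weight_sum = sum(weights)
    return [total * w / weight_sum for w in weights]


# Inside a voiceover block this could be used as follows (requires Manim):
#     write_time, move_time = split_duration(tracker.duration, [2, 1])
#     self.play(Write(title), run_time=write_time)
#     self.play(title.animate.to_edge(UP), run_time=move_time)
print(split_duration(6.0, [2, 1]))  # prints: [4.0, 2.0]
```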
with self.voiceover(text=...) as tracker does three things:
- Generates the TTS audio for the text and adds it to the scene
- Records the text for the generated .srt subtitle file
- Exposes tracker.duration to sync animations with voice
Subtitle Best Practices:
- Use the _make_subtitle() helper to display white bold text with a dark background at the bottom of the screen
- _make_subtitle() measures the subtitle width and scales it proportionally (scale_to_fit_width) when it exceeds the frame bounds; font_size=22 leaves room for long text
- Put FadeIn(sub) in the first self.play() within the voiceover block so that subtitles appear in sync with the voice; do not delay it
⚠️ Avoid Double Subtitles: If the scene code already uses _make_subtitle() to render in-scene subtitles, do not also use --burn_subtitles to burn SRT subtitles, otherwise two overlapping subtitle layers will appear. Choose only one approach:
- In-scene subtitles: use _make_subtitle(), do not burn SRT
- SRT subtitles: use --burn_subtitles
Create manim.cfg in the same directory as the scene file:
[CLI]
quality = high_quality
preview = False
[ffmpeg]
video_codec = h264
Quality Reference Table:
| Quality | Flag | Resolution | FPS | manim.cfg Value |
|---|---|---|---|---|
| Low | -ql | 480p | 15 | low_quality |
| Medium | -qm | 720p | 30 | medium_quality |
| High | -qh | 1080p | 60 | high_quality |
| Production | -qp | 1440p | 60 | production_quality |
manim render <scene_file.py> <SceneName>
Output path pattern: media/videos/
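The mapping from quality name to CLI flag in the table above can be turned into a small command builder. This is a sketch under the assumption that the pipeline invokes the manim CLI; QUALITY_FLAGS and render_command are hypothetical names.

```python
# Quality names from the table above mapped to their manim CLI flags.
QUALITY_FLAGS = {
    "low": "-ql",
    "medium": "-qm",
    "high": "-qh",
    "production": "-qp",
}


def render_command(scene_file: str, scene_name: str, quality: str = "high") -> list:
    """Build the argument list for `manim render` at the requested quality."""
    if quality not in QUALITY_FLAGS:
        raise ValueError(f"unknown quality: {quality!r}")
    return ["manim", "render", QUALITY_FLAGS[quality], scene_file, scene_name]


print(" ".join(render_command("demo.py", "MyScene", "medium")))
# prints: manim render -qm demo.py MyScene
```

The list can be passed directly to subprocess.run, which avoids shell quoting problems with file names that contain spaces.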
manim-voiceover automatically generates .srt subtitle files in the same directory as the video. Burn with ffmpeg:
ffmpeg -y -i <video.mp4> \
-vf "subtitles=<subtitle.srt>:force_style='FontSize=22,PrimaryColour=&H00FFFFFF,OutlineColour=&H00000000,Outline=2,BackColour=&H80000000,BorderStyle=4,MarginV=30'" \
-c:a copy \
<output_subtitled.mp4>
⚠️ Double Subtitle Pitfall: If the scene Python code already renders in-scene subtitles with _make_subtitle(), do not also burn SRT subtitles, otherwise two overlapping subtitle layers will appear.
Note: ffmpeg requires libass support. On macOS, brew install ffmpeg typically includes it. Conda environments may require conda install x264 -c conda-forge.
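The burn-in command above can be assembled programmatically, which keeps the force_style options in one place. This is a minimal sketch: burn_subtitles_command and DEFAULT_STYLE are hypothetical names, and it assumes the subtitle path contains no characters that need escaping inside an ffmpeg filter (such as colons or quotes).

```python
# Style options copied from the ffmpeg command above.
DEFAULT_STYLE = {
    "FontSize": "22",
    "PrimaryColour": "&H00FFFFFF",
    "OutlineColour": "&H00000000",
    "Outline": "2",
    "BackColour": "&H80000000",
    "BorderStyle": "4",
    "MarginV": "30",
}


def burn_subtitles_command(video: str, srt: str, output: str, style: dict = None) -> list:
    """Build the ffmpeg argument list that burns an SRT file into a video."""
    style = DEFAULT_STYLE if style is None else style
    force_style = ",".join(f"{key}={value}" for key, value in style.items())
    # Assumption: srt is a simple path with no filter-special characters.
    video_filter = f"subtitles={srt}:force_style='{force_style}'"
    return ["ffmpeg", "-y", "-i", video, "-vf", video_filter, "-c:a", "copy", output]


cmd = burn_subtitles_command("MyScene.mp4", "MyScene.srt", "MyScene_subtitled.mp4")
print(cmd[5])
```

Passing the list to subprocess.run means no shell is involved, so the single quotes around the force_style value reach ffmpeg unchanged.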
Use ffmpeg to speed up the video, default 1.35x:
SPEED=1.35
ffmpeg -y -i <input.mp4> \
-filter_complex "[0:v]setpts=PTS/${SPEED}[v];[0:a]atempo=${SPEED}[a]" \
-map "[v]" -map "[a]" \
<output_fast.mp4>
Note: Speed-up should be the final output step. If the scene code has in-scene subtitles (_make_subtitle), the speed-up input should use the original video (not the SRT-burned version) to avoid double subtitles. The run_pipeline.py --speed parameter handles this logic automatically.
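The filter string in the speed-up command can be generated for any speed. The sketch below also chains atempo filters, because some ffmpeg builds restrict a single atempo to the range 0.5 to 2.0; whether run_pipeline.py does this is not stated here, and atempo_chain and speed_filter are hypothetical names.

```python
def atempo_chain(speed: float) -> list:
    """Decompose a speed factor into atempo values that each lie in [0.5, 2.0]."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    factors = []
    remaining = speed
    while remaining > 2.0:
        factors.append(2.0)
        remaining /= 2.0
    while remaining < 0.5:
        factors.append(0.5)
        remaining /= 0.5
    factors.append(remaining)
    return factors


def speed_filter(speed: float) -> str:
    """Build the -filter_complex value used in the speed-up command above."""
    audio = ",".join(f"atempo={factor:g}" for factor in atempo_chain(speed))
    return f"[0:v]setpts=PTS/{speed:g}[v];[0:a]{audio}[a]"


print(speed_filter(1.35))  # prints: [0:v]setpts=PTS/1.35[v];[0:a]atempo=1.35[a]
print(speed_filter(3.0))   # prints: [0:v]setpts=PTS/3[v];[0:a]atempo=2,atempo=1.5[a]
```

At the default 1.35x the chain has a single element, so the output matches the shell command above exactly.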
- Write(text): Write text
- Create(mobject): Draw shapes
- FadeIn(mobject) / FadeOut(mobject): Fade in/out
- DrawBorderThenFill(mobject): Draw border then fill
- Transform(source, target): Morph
- ReplacementTransform(source, target): Replacement morph
- TransformMatchingShapes(source, target): Shape-matching morph
- mobject.animate.to_edge(UP): Move to edge
- mobject.animate.shift(RIGHT * 2): Translate
- mobject.animate.scale(2): Scale
- Rotate(mobject, angle=PI): Rotate
- Text("Text", font_size=48, color=BLUE): Text
- MathTex(r"e^{i\pi}+1=0"): LaTeX formula
- Circle(radius=1, color=RED): Circle
- Square(side_length=2, color=GREEN): Square
- Arrow(start, end): Arrow
- NumberPlane(): Coordinate plane
- Axes(x_range, y_range): Axes
- VGroup(obj1, obj2): Group of vectorized mobjects
- group.arrange(RIGHT, buff=0.5): Horizontal arrangement
- BackgroundRectangle(obj, color=BLACK, fill_opacity=0.6): Background rectangle
Symptom: UnknownCodecError: libx264
Root Cause: Manim hardcodes the libx264 encoder in scene_file_writer.py (it cannot be overridden via config/cfg), but the ffmpeg in a conda environment is compiled with --disable-gpl and does not include the GPL-licensed libx264.
Solution:
# Conda environment (most common scenario)
conda install x264 -c conda-forge
# After installation, conda-forge's ffmpeg will auto-relink the x264 library
# Verify
ffmpeg -codecs 2>&1 | grep libx264
# Output should include: encoders: libx264 libx264rgb
Note: brew install ffmpeg installs ffmpeg with built-in x264, but conda environments prioritize their own ffmpeg and will not use the Homebrew version.
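manim-voiceover depends on pkg_resources, which is provided by setuptools. Virtual environments created with Python 3.12+ no longer install setuptools by default, so the import may fail. Install a compatible version: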
pip install "setuptools>=69.0,<72.0"
SRT subtitle burn-in requires libass. macOS:
brew install ffmpeg
# Verify
ffmpeg -filters 2>&1 | grep subtitles
Linux:
sudo apt install libass-dev
# May need to recompile ffmpeg
gTTS requires access to Google TTS service. If network is unavailable, switch to pyttsx3 offline engine:
from manim_voiceover.services.pyttsx3 import PyTTSX3Service
# Inside construct():
self.set_speech_service(PyTTSX3Service())
Manim uses system fonts to render Text objects. Ensure Chinese fonts are available:
Linux: sudo apt install fonts-noto-cjk
macOS: Text("Text", font="PingFang SC")
- references/manim_guide.md: Detailed Manim + voiceover + subtitle technical documentation
- scripts/check_environment.py: One-click dependency check
- scripts/run_pipeline.py: One-click render + subtitle burn-in