Overview

🎬 Video Generator Skill

Automated text-to-video generation system that transforms text scripts into professional short videos with AI-powered voiceover, precise timing, and cyber-wireframe visuals.

Cost: ~$0.003 per 15-second video | License: MIT | Package: openclaw-video-generator

📦 Package Information

Property	Value
----------	-------
npm Package	`openclaw-video-generator`
Version	1.6.2
Repository	github.com/ZhenRobotics/openclaw-video-generator
Commit Hash	`6279034`
License	MIT

Verification:

npm info openclaw-video-generator version repository.url
# Expected: 1.6.2 and https://github.com/ZhenRobotics/openclaw-video-generator

🔐 Provider Setup (Choose ONE)

This tool supports 4 alternative TTS/ASR providers. You only need ONE configured:

Option 1: OpenAI (Recommended)

export OPENAI_API_KEY="sk-..."

Pros: Best quality, simple setup
Cost: ~$0.003 per 15s video

Option 2: Azure

export AZURE_SPEECH_KEY="..."
export AZURE_SPEECH_REGION="eastasia"

Pros: Enterprise reliability
Cost: Similar to OpenAI

Option 3: Aliyun (阿里云)

export ALIYUN_ACCESS_KEY_ID="..."
export ALIYUN_ACCESS_KEY_SECRET="..."
export ALIYUN_APP_KEY="..."

Pros: China connectivity, Chinese voices
Cost: ~¥0.02 per 15s video

Option 4: Tencent (腾讯云)

export TENCENT_SECRET_ID="..."
export TENCENT_SECRET_KEY="..."
export TENCENT_APP_ID="..."

Pros: China connectivity
Cost: ~¥0.02 per 15s video

Why multiple providers? Fallback support for network restrictions, regional preferences, and cost optimization.
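To see which provider a shell session would end up with, a small check over the environment variables listed above can help. This is a minimal sketch: the function name `detect_provider` is hypothetical, and the check order simply mirrors the option order in this document, which may not match the tool's real fallback order.

```shell
# Print the first provider whose required variables are all set.
# Assumption: the order (OpenAI, Azure, Aliyun, Tencent) follows this
# document's option numbering, not a documented fallback order.
detect_provider() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "openai"
  elif [ -n "${AZURE_SPEECH_KEY:-}" ] && [ -n "${AZURE_SPEECH_REGION:-}" ]; then
    echo "azure"
  elif [ -n "${ALIYUN_ACCESS_KEY_ID:-}" ] && \
       [ -n "${ALIYUN_ACCESS_KEY_SECRET:-}" ] && \
       [ -n "${ALIYUN_APP_KEY:-}" ]; then
    echo "aliyun"
  elif [ -n "${TENCENT_SECRET_ID:-}" ] && \
       [ -n "${TENCENT_SECRET_KEY:-}" ] && \
       [ -n "${TENCENT_APP_ID:-}" ]; then
    echo "tencent"
  else
    echo "none"
    return 1
  fi
}
```

Running `detect_provider` after exporting your keys prints the provider name, or `none` with a non-zero exit status when no provider is fully configured.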

🚀 Quick Start

Prerequisites

node --version  # Need >= 18
npm --version
ffmpeg -version
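
The Node.js requirement (>= 18) can also be checked in a script rather than by eye. The sketch below assumes the usual `vMAJOR.MINOR.PATCH` output of `node --version`; the helper name `node_version_ok` is hypothetical.

```shell
# Succeed when a `node --version` string (e.g. "v20.11.1") has major >= 18.
node_version_ok() {
  version="${1#v}"          # strip the leading "v"
  major="${version%%.*}"    # keep everything before the first dot
  # A non-numeric major makes the test fail instead of printing an error.
  [ "$major" -ge 18 ] 2>/dev/null
}
```

A typical call would be `node_version_ok "$(node --version)" && command -v ffmpeg >/dev/null && echo "prerequisites OK"`.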

Installation

Option 1: npm Global Install

npm install -g openclaw-video-generator@1.6.2
export OPENAI_API_KEY="sk-..."  # Or add to ~/.bashrc
openclaw-video-generator --version

Option 2: From Source

git clone https://github.com/ZhenRobotics/openclaw-video-generator.git
cd openclaw-video-generator
npm install

# Configure provider
cp .env.example .env
nano .env  # Add your API key
chmod 600 .env

First Video

# Assumes the repository was cloned to ~/openclaw-video-generator (Option 2)
cd ~/openclaw-video-generator
cat > test.txt << 'EOF'
AI makes development easier
Saving time and boosting efficiency
EOF

./scripts/script-to-video.sh test.txt --voice nova --speed 1.15
# Output: out/test.mp4

💻 Agent Usage

When to Use

Auto-trigger when the user mentions: video, generate video, create video, 生成视频 (Chinese for "generate video")

Standard Command

cd ~/openclaw-video-generator && \
./scripts/script-to-video.sh <script-file> \
  --voice nova \
  --speed 1.15

With Background Video

cd ~/openclaw-video-generator && \
./scripts/script-to-video.sh <script-file> \
  --voice nova \
  --bg-video "backgrounds/tech.mp4" \
  --bg-opacity 0.6

Example Flow

User: "Generate video: AI makes development easier"

Agent:

# 1. Check project
ls ~/openclaw-video-generator || echo "Not installed"

# 2. Create script
cat > ~/openclaw-video-generator/scripts/user-script.txt << 'EOF'
AI makes development easier
EOF

# 3. Generate
cd ~/openclaw-video-generator && \
./scripts/script-to-video.sh scripts/user-script.txt

# 4. Show result
echo "Video: ~/openclaw-video-generator/out/user-script.mp4"
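
Both examples in this document (test.txt to out/test.mp4, user-script.txt to out/user-script.mp4) suggest that the output name is the script's base name with an .mp4 extension under out/. A small helper for step 4, assuming that naming rule holds in general (it is inferred from the examples, not confirmed by the tool's documentation; `output_path_for` is a hypothetical name):

```shell
# Derive the expected output path from a script file path.
# Assumption: output is out/<script name without extension>.mp4.
output_path_for() {
  name="$(basename "$1")"     # drop any leading directories
  echo "out/${name%.*}.mp4"   # replace the last extension with .mp4
}
```

For instance, `output_path_for scripts/user-script.txt` prints `out/user-script.mp4`.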

Guidelines

Do:

Verify project exists before running
Check .env configuration
Show output file location

Don't:

Clone without user confirmation
Hardcode API keys in commands
Create new Remotion projects
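
The first two "Do" items can be combined into one preflight check that an agent runs before generating anything. This is a sketch only: the function name `preflight` and its messages are hypothetical, and it deliberately stops and reports rather than cloning or writing keys on its own.

```shell
# Check that the project directory and its .env file exist.
# Prints a short status and returns non-zero when something is missing.
preflight() {
  project_dir="$1"
  if [ ! -d "$project_dir" ]; then
    echo "project missing: ask the user before cloning"
    return 1
  fi
  if [ ! -f "$project_dir/.env" ]; then
    echo "no .env: ask the user to configure a provider"
    return 2
  fi
  echo "ok"
}
```

An agent could call `preflight ~/openclaw-video-generator` and only continue to the generation step when it prints `ok`.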

🎯 Core Features

Multi-Provider TTS: OpenAI, Azure, Aliyun, Tencent with auto-fallback
Timestamp Extraction: Precise speech-to-text segmentation
Scene Detection: 6 intelligent scene types with auto-styling
Video Rendering: Remotion with cyber-wireframe aesthetics
Background Videos: Custom backgrounds with opacity control
Local Processing: Video rendering happens on your machine

⚙️ Configuration

TTS Voices

OpenAI:

nova (recommended), alloy, echo, shimmer

Azure:

zh-CN-XiaoxiaoNeural, zh-CN-YunxiNeural

Speech Speed

Range: 0.25 - 4.0 | Recommended: 1.15
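
Because the shell has no built-in floating-point comparison, validating a `--speed` value against the 0.25 to 4.0 range is easiest with awk. A minimal sketch (the helper name `speed_in_range` is hypothetical; the tool may do its own validation):

```shell
# Succeed when the argument is a plain decimal number within 0.25..4.0.
speed_in_range() {
  awk -v s="$1" 'BEGIN {
    ok = (s ~ /^[0-9]+(\.[0-9]+)?$/ && s + 0 >= 0.25 && s + 0 <= 4.0)
    exit !ok
  }'
}
```

For example, `speed_in_range 1.15 || echo "speed out of range"` prints nothing, while a value such as 4.5 triggers the message.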

Background Video

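--bg-video - Background video file
--bg-opacity <0-1> - Background opacity (0 = invisible, 1 = fully visible)
--bg-overlay - Color overlay between the background and the text (rgba values as in the table below)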

Recommended:

Use Case	Opacity	Overlay
----------	---------	---------
Text-focused	0.3-0.4	`rgba(10,10,15,0.6)`
Balanced	0.5-0.6	`rgba(10,10,15,0.4)`
Visual-focused	0.7-1.0	`rgba(10,10,15,0.25)`
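
The recommendations in the table can be kept as a lookup so that scripts pick consistent settings. The sketch below only echoes the table's values; the function name `bg_preset` and the short keys `text`, `balanced`, and `visual` are hypothetical labels, not options of the tool.

```shell
# Print the recommended opacity range and overlay color for a use case.
bg_preset() {
  case "$1" in
    text)     echo "opacity=0.3-0.4 overlay=rgba(10,10,15,0.6)" ;;
    balanced) echo "opacity=0.5-0.6 overlay=rgba(10,10,15,0.4)" ;;
    visual)   echo "opacity=0.7-1.0 overlay=rgba(10,10,15,0.25)" ;;
    *)        echo "unknown use case: $1" >&2; return 1 ;;
  esac
}
```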

📊 Video Specs

Resolution: 1080 x 1920 (vertical)
Frame Rate: 30 fps
Format: MP4 (H.264 + AAC)
Style: Cyber-wireframe with neon colors
Duration: Auto-calculated
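
At 30 fps, the frame count of a video follows directly from its duration, which is useful when sanity-checking a render. A small sketch, assuming round-to-nearest (how the tool itself rounds fractional frames is not stated here; `frame_count` is a hypothetical name):

```shell
# Convert a duration in seconds to a frame count at 30 fps.
# Adding 0.5 before the integer conversion rounds to the nearest frame.
frame_count() {
  awk -v d="$1" -v fps=30 'BEGIN { printf "%d\n", d * fps + 0.5 }'
}
```

A 15-second video therefore has `frame_count 15`, that is 450 frames.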

🎨 Scene Types

Type	Effect	Trigger
------	--------	---------
title	Glitch + scale	First segment
emphasis	Pop-up zoom	Numbers/percentages
pain	Shake + warning	Problems mentioned
content	Fade-in	Regular text
circle	Rotating ring	Listed points
end	Slide-up	Last segment
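
To make the trigger column concrete, the rules can be sketched as a tiny classifier. This is an illustration, not the project's detection logic: the function name `scene_type`, the keyword list for "pain", the bullet test for "circle", and the priority order between rules are all assumptions.

```shell
# Guess a scene type from a segment's position (1-based) and its text.
scene_type() {
  index="$1"; total="$2"
  # Lowercase the text so keyword matching is case-insensitive.
  text="$(printf '%s' "$3" | tr '[:upper:]' '[:lower:]')"
  if [ "$index" -eq 1 ]; then
    echo "title"                                # first segment
  elif [ "$index" -eq "$total" ]; then
    echo "end"                                  # last segment
  else
    case "$text" in
      *[0-9]*)                         echo "emphasis" ;;  # numbers/percentages
      *problem*|*issue*|*error*|*fail*) echo "pain" ;;     # assumed keywords
      "- "*|"* "*)                     echo "circle" ;;    # assumed bullet markers
      *)                               echo "content" ;;   # regular text
    esac
  fi
}
```

For example, `scene_type 2 4 "Saves 50% of your time"` prints `emphasis`, and the first segment always maps to `title` regardless of its text.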

💰 Cost

Per 15-second video: ~$0.003 (< 1 cent)

TTS: ~$0.001
Whisper: ~$0.0015
Rendering: Free (local)
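
The two paid components sum to about $0.0025, which the headline figure rounds up to ~$0.003. For batch planning, that per-video figure can be scaled as in the sketch below, which assumes every video is about 15 seconds long and uses the OpenAI figures above (`estimate_cost` is a hypothetical helper).

```shell
# Estimate the API cost in USD for N fifteen-second videos.
estimate_cost() {
  awk -v n="$1" 'BEGIN {
    tts = 0.001    # text-to-speech, per video
    asr = 0.0015   # Whisper transcription, per video
    printf "%.4f\n", n * (tts + asr)
  }'
}
```

For 100 videos, `estimate_cost 100` gives 0.2500, so roughly 25 cents.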

🔧 Troubleshooting

Project Not Found

[ -d ~/openclaw-video-generator ] || \
git clone https://github.com/ZhenRobotics/openclaw-video-generator.git ~/openclaw-video-generator
cd ~/openclaw-video-generator && npm install

API Key Error

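# Verify .env (prints variable names only, not the secret values)
cut -d= -f1 ~/openclaw-video-generator/.env

# Create if missing (does not overwrite an existing file)
cd ~/openclaw-video-generator
[ -f .env ] || echo 'OPENAI_API_KEY="sk-..."' > .env
chmod 600 .env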

Provider Test

cd ~/openclaw-video-generator && ./scripts/test-providers.sh

🔒 Privacy

Local Processing:

Video rendering
Scene orchestration
File management

Cloud Processing (via configured provider):

Text-to-Speech (text sent to API)
Speech recognition (audio sent to API)

API keys are stored in the .env file (permissions 600, never committed to git).
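
The 600-permission claim is easy to verify locally. The sketch below reads the mode string from `ls -ld`, which avoids the differing `stat` options on Linux and macOS; the helper name `env_perms_ok` is hypothetical.

```shell
# Succeed when the file exists and is readable/writable by its owner only.
env_perms_ok() {
  [ -f "$1" ] || return 1
  mode="$(ls -ld "$1" | cut -c1-10)"   # e.g. "-rw-------"
  [ "$mode" = "-rw-------" ]
}
```

Usage: `env_perms_ok ~/openclaw-video-generator/.env || chmod 600 ~/openclaw-video-generator/.env`. Inside the repository, `git check-ignore -q .env` additionally exits with status 0 when git is ignoring the file.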

📚 Documentation

npm: https://www.npmjs.com/package/openclaw-video-generator
GitHub: https://github.com/ZhenRobotics/openclaw-video-generator
Issues: https://github.com/ZhenRobotics/openclaw-video-generator/issues

📊 Tech Stack

Remotion · OpenAI · Azure · Aliyun · Tencent · TypeScript · Node.js · FFmpeg

🆕 Version History

v1.6.2 (2026-03-25) - Current

Chinese TTS integration (Aliyun)
Dual subtitle styles
Medical content examples

v1.6.0 (2026-03-18)

Premium styles system
Poster generator
Design tokens

v1.2.0 (2026-03-07)

Background video support
Multi-provider architecture
Auto-fallback

v1.0.0 (2026-03-03)

Initial release

License: MIT | Author: @ZhenStaff | Support: GitHub Issues

Version History

5 versions in total

v1.0.42 (current): 2026-05-03 02:44, Safe
v1.0.4: 2026-03-29 20:54
v1.0.24: 2026-03-27 20:02
v1.0.10: 2026-03-14 01:01
v1.0.2: 2026-03-11 11:44

Security Check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)