← 返回
内容创作 Key

Video Generator | 视频生成器

Automated text-to-video pipeline with multi-provider TTS/ASR support - OpenAI, Azure, Aliyun, Tencent | 多厂商 TTS/ASR 支持的自动化文本转视频系统
自动化文本转视频流水线,支持多厂商 TTS/ASR(OpenAI、Azure、阿里云、腾讯)
zhenstaff
内容创作 clawhub v1.0.42 5 版本 99843.9 Key: 需要
★ 1
Stars
📥 2,539
下载
💾 494
安装
5
版本
#latest

概述

🎬 Video Generator Skill

Automated text-to-video generation system that transforms text scripts into professional short videos with AI-powered voiceover, precise timing, and cyber-wireframe visuals.

Cost: ~$0.003 per 15-second video | License: MIT | Package: openclaw-video-generator


📦 Package Information

PropertyValue
-----------------
npm Packageopenclaw-video-generator
Version1.6.2
Repositorygithub.com/ZhenRobotics/openclaw-video-generator
Commit Hash6279034
LicenseMIT

Verification:

npm info openclaw-video-generator version repository.url
# Expected: 1.6.2 and https://github.com/ZhenRobotics/openclaw-video-generator

🔐 Provider Setup (Choose ONE)

This tool supports 4 alternative TTS/ASR providers. You only need ONE configured:

Option 1: OpenAI (Recommended)

export OPENAI_API_KEY="sk-..."
  • Pros: Best quality, simple setup
  • Cost: ~$0.003 per 15s video

Option 2: Azure

export AZURE_SPEECH_KEY="..."
export AZURE_SPEECH_REGION="eastasia"
  • Pros: Enterprise reliability
  • Cost: Similar to OpenAI

Option 3: Aliyun (阿里云)

export ALIYUN_ACCESS_KEY_ID="..."
export ALIYUN_ACCESS_KEY_SECRET="..."
export ALIYUN_APP_KEY="..."
  • Pros: China connectivity, Chinese voices
  • Cost: ~¥0.02 per 15s video

Option 4: Tencent (腾讯云)

export TENCENT_SECRET_ID="..."
export TENCENT_SECRET_KEY="..."
export TENCENT_APP_ID="..."
  • Pros: China connectivity
  • Cost: ~¥0.02 per 15s video

Why multiple providers? Fallback support for network restrictions, regional preferences, and cost optimization.


🚀 Quick Start

Prerequisites

node --version  # Need >= 18
npm --version
ffmpeg -version

Installation

Option 1: npm Global Install

npm install -g openclaw-video-generator@1.6.2
export OPENAI_API_KEY="sk-..."  # Or add to ~/.bashrc
openclaw-video-generator --version

Option 2: From Source

git clone https://github.com/ZhenRobotics/openclaw-video-generator.git
cd openclaw-video-generator
npm install

# Configure provider
cp .env.example .env
nano .env  # Add your API key
chmod 600 .env

First Video

cd ~/openclaw-video-generator
cat > test.txt << 'EOF'
AI makes development easier
Saving time and boosting efficiency
EOF

./scripts/script-to-video.sh test.txt --voice nova --speed 1.15
# Output: out/test.mp4

💻 Agent Usage

When to Use

Auto-trigger when user mentions: video, generate video, create video, 生成视频

Standard Command

cd ~/openclaw-video-generator && \
./scripts/script-to-video.sh <script-file> \
  --voice nova \
  --speed 1.15

With Background Video

cd ~/openclaw-video-generator && \
./scripts/script-to-video.sh <script-file> \
  --voice nova \
  --bg-video "backgrounds/tech.mp4" \
  --bg-opacity 0.6

Example Flow

User: "Generate video: AI makes development easier"

Agent:

# 1. Check project
ls ~/openclaw-video-generator || echo "Not installed"

# 2. Create script
cat > ~/openclaw-video-generator/scripts/user-script.txt << 'EOF'
AI makes development easier
EOF

# 3. Generate
cd ~/openclaw-video-generator && \
./scripts/script-to-video.sh scripts/user-script.txt

# 4. Show result
echo "Video: ~/openclaw-video-generator/out/user-script.mp4"

Guidelines

Do:

  • Verify project exists before running
  • Check .env configuration
  • Show output file location

Don't:

  • Clone without user confirmation
  • Hardcode API keys in commands
  • Create new Remotion projects

🎯 Core Features

  • Multi-Provider TTS: OpenAI, Azure, Aliyun, Tencent with auto-fallback
  • Timestamp Extraction: Precise speech-to-text segmentation
  • Scene Detection: 6 intelligent scene types with auto-styling
  • Video Rendering: Remotion with cyber-wireframe aesthetics
  • Background Videos: Custom backgrounds with opacity control
  • Local Processing: Video rendering happens on your machine

⚙️ Configuration

TTS Voices

OpenAI:

  • nova (recommended), alloy, echo, shimmer

Azure:

  • zh-CN-XiaoxiaoNeural, zh-CN-YunxiNeural

Speech Speed

Range: 0.25 - 4.0 | Recommended: 1.15

Background Video

  • --bg-video - Video file
  • --bg-opacity <0-1> - Transparency
  • --bg-overlay - Text overlay

Recommended:

Use CaseOpacityOverlay
----------------------------
Text-focused0.3-0.4rgba(10,10,15,0.6)
Balanced0.5-0.6rgba(10,10,15,0.4)
Visual-focused0.7-1.0rgba(10,10,15,0.25)

📊 Video Specs

  • Resolution: 1080 x 1920 (vertical)
  • Frame Rate: 30 fps
  • Format: MP4 (H.264 + AAC)
  • Style: Cyber-wireframe with neon colors
  • Duration: Auto-calculated

🎨 Scene Types

TypeEffectTrigger
-----------------------
titleGlitch + scaleFirst segment
emphasisPop-up zoomNumbers/percentages
painShake + warningProblems mentioned
contentFade-inRegular text
circleRotating ringListed points
endSlide-upLast segment

💰 Cost

Per 15-second video: ~$0.003 (< 1 cent)

  • TTS: ~$0.001
  • Whisper: ~$0.0015
  • Rendering: Free (local)

🔧 Troubleshooting

Project Not Found

ls ~/openclaw-video-generator || \
git clone https://github.com/ZhenRobotics/openclaw-video-generator.git ~/openclaw-video-generator && \
cd ~/openclaw-video-generator && npm install

API Key Error

# Verify .env
cat ~/openclaw-video-generator/.env

# Create if missing
cd ~/openclaw-video-generator
echo 'OPENAI_API_KEY="sk-..."' > .env
chmod 600 .env

Provider Test

cd ~/openclaw-video-generator && ./scripts/test-providers.sh

🔒 Privacy

Local Processing:

  • Video rendering
  • Scene orchestration
  • File management

Cloud Processing (via configured provider):

  • Text-to-Speech (text sent to API)
  • Speech recognition (audio sent to API)

API keys are stored in .env file (600 permissions, never committed to git).


📚 Documentation

  • npm: https://www.npmjs.com/package/openclaw-video-generator
  • GitHub: https://github.com/ZhenRobotics/openclaw-video-generator
  • Issues: https://github.com/ZhenRobotics/openclaw-video-generator/issues

📊 Tech Stack

Remotion · OpenAI · Azure · Aliyun · Tencent · TypeScript · Node.js · FFmpeg


🆕 Version History

v1.6.2 (2026-03-25) - Current

  • Chinese TTS integration (Aliyun)
  • Dual subtitle styles
  • Medical content examples

v1.6.0 (2026-03-18)

  • Premium styles system
  • Poster generator
  • Design tokens

v1.2.0 (2026-03-07)

  • Background video support
  • Multi-provider architecture
  • Auto-fallback

v1.0.0 (2026-03-03)

  • Initial release

License: MIT | Author: @ZhenStaff | Support: GitHub Issues

版本历史

共 5 个版本

  • v1.0.42 当前
    2026-05-03 02:44 安全 安全
  • v1.0.4
    2026-03-29 20:54
  • v1.0.24
    2026-03-27 20:02
  • v1.0.10
    2026-03-14 01:01
  • v1.0.2
    2026-03-11 11:44

安全检测

腾讯云安全 (Keen)

安全,无风险
查看报告

腾讯云安全 (Sanbu)

安全,无风险
查看报告

🔗 相关推荐

content-creation

Humanizer

biostartechnology
消除AI写作痕迹,使文本更自然真实。基于维基百科"AI写作特征"指南,识别并修正夸张象征、宣传用语、肤浅-ing分析、模糊归因、破折号滥用、三项排比、AI词汇、负面平行结构及冗长连接词等模式。
★ 857 📥 199,148
content-creation

Baidu Wenku AIPPT

ide-rea
使用百度文库 AI 智能生成 PPT,自动根据内容选择模板。
★ 66 📥 46,112
content-creation

AdMapix

fly0pants
广告情报与应用数据分析助手,支持搜索广告素材、分析应用排名、下载量、收入及市场洞察,用于广告素材和竞品分析。
★ 294 📥 136,382