
AI UGC Video Pipeline

End-to-end AI UGC video pipeline. Product info → GPT-4o-mini script → ElevenLabs voiceover → Aurora talking head (fal-ai/creatify/aurora) → Kling 2.6 Pro pro...
zero2ai-hub

Overview

skill-ugc-pipeline v1.2.0

Build your own MakeUGC pipeline. Direct API access — no $300/mo enterprise tier.

Default model: Aurora (fal-ai/creatify/aurora) — locked after A/B test.

In that A/B test, Aurora produced noticeably more realistic lip sync and narration than the alternatives tried.

What's new in v1.2.0

  • B-roll splicing — Kling 2.6 Pro image-to-video generates a cinematic product shot, spliced into the avatar video at a configurable timecode
  • UGC filter — grain + handheld shake applied to avatar segments ONLY; product B-roll stays clean and cinematic
  • Continuous audio across splice points (no audio gap)
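
As a rough illustration of the "avatar only" UGC filter, the sketch below builds an ffmpeg filter graph string in which the avatar stream gets a drifting crop (shake) plus temporal noise (grain), while the B-roll stream passes through untouched. The helper name `ugcFilterGraph`, the stream labels, and all numeric values are assumptions for illustration; the filter chain actually used by broll.js is not shown in this document.

```javascript
// Hypothetical helper (not part of the skill): returns an ffmpeg -filter_complex string.
// Assumes input 0 is the avatar clip and input 1 is the B-roll clip.
function ugcFilterGraph({ grain = 12, shakePx = 6, margin = 8 } = {}) {
  if (shakePx > margin) {
    throw new RangeError("shakePx must not exceed margin, or the crop leaves the frame");
  }
  // Crop a slightly smaller window whose origin drifts with time (handheld feel).
  const shake =
    `crop=iw-${2 * margin}:ih-${2 * margin}:` +
    `${margin}+${shakePx}*sin(t*2.1):${margin}+${shakePx}*cos(t*1.7)`;
  // Temporal noise on all planes approximates film grain.
  const noise = `noise=alls=${grain}:allf=t`;
  return [
    `[0:v]${shake},${noise}[avatar]`, // filtered avatar stream
    `[1:v]null[broll]`,               // B-roll passes through unchanged
  ].join(";");
}

console.log(ugcFilterGraph());
```

Because the crop shrinks the avatar frame by `2 * margin` pixels per axis, a real graph would likely also need a `scale` step before concatenating with the B-roll.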

Architecture

product info → [GPT-4o-mini] → script
                             → [ElevenLabs] → audio.mp3
avatar image + audio         → [fal.ai Aurora] → avatar.mp4
product image                → [Kling 2.6 Pro] → broll.mp4
avatar + broll + ugc-filter  → [ffmpeg] → final.mp4
audio.mp3                    → [OpenAI Whisper] → captions overlay
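
The splice in the ffmpeg stage can be pictured as a three-segment timeline. The sketch below is a minimal plan under one stated assumption: the B-roll replaces avatar footage rather than extending the video, so the avatar resumes at `spliceAt + brollDuration` and stays aligned with the continuous voiceover. `planSplice` is a hypothetical name, not a function from the skill.

```javascript
// Hypothetical helper: plans the video segments laid over the continuous audio track.
// Assumption: the B-roll covers avatar footage instead of pushing it later in time.
function planSplice(avatarDuration, spliceAt = 4.5, brollDuration = 5) {
  const resumeAt = spliceAt + brollDuration;
  if (spliceAt <= 0 || resumeAt > avatarDuration) {
    throw new RangeError("B-roll window must fit inside the avatar video");
  }
  return [
    { source: "avatar", start: 0, end: spliceAt },               // hook
    { source: "broll", start: 0, end: brollDuration },           // product shot
    { source: "avatar", start: resumeAt, end: avatarDuration },  // resume
  ];
}

// Total video length equals the avatar (and audio) length, so there is no audio gap.
const plan = planSplice(20);
console.log(plan.reduce((sum, s) => sum + (s.end - s.start), 0)); // 20
```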

Full Pipeline (6 steps)

1. Script       GPT-4o-mini → spoken script
2. Voice        ElevenLabs → audio.mp3
3. Avatar       fal-ai/creatify/aurora → talking head MP4
4. B-roll       Kling 2.6 Pro image-to-video → product shot
5. Splice       ffmpeg: avatar(hook) + broll + avatar(resume) + continuous audio,
                plus optional UGC filter: grain + handheld shake on avatar ONLY (product shot stays clean)
6. Captions     Whisper word-level → overlay.py → final MP4
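
For the captions step, word-level timestamps have to be grouped into short on-screen chunks before they can be overlaid. The sketch below shows one simple grouping strategy. It assumes the input is an array of `{ word, start, end }` objects (times in seconds), which is the shape of Whisper's word-level output; the function name `groupWords` and the chunk size are illustrative, and the skill's own overlay.py may group differently.

```javascript
// Hypothetical helper: groups word timestamps into fixed-size caption chunks.
// Assumes words = [{ word, start, end }, ...] with times in seconds.
function groupWords(words, maxWords = 3) {
  const captions = [];
  for (let i = 0; i < words.length; i += maxWords) {
    const chunk = words.slice(i, i + maxWords);
    captions.push({
      text: chunk.map((w) => w.word.trim()).join(" "),
      start: chunk[0].start,              // caption appears with its first word
      end: chunk[chunk.length - 1].end,   // and disappears after its last word
    });
  }
  return captions;
}

const sample = [
  { word: "Meet", start: 0.0, end: 0.3 },
  { word: "the", start: 0.3, end: 0.4 },
  { word: "Rain", start: 0.4, end: 0.7 },
  { word: "Cloud", start: 0.7, end: 1.1 },
];
console.log(groupWords(sample));
```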

Quick Start — Full Pipeline

cd skill-ugc-pipeline
npm install

# Steps 1-3: Script + voice + avatar
node scripts/generate.js \
  --product "Rain Cloud Humidifier" \
  --product-desc "USB cool mist humidifier. 300ml tank, LED glow, silent mode." \
  --avatar avatars/my_avatar.png \
  --output output/ad_raincloud.mp4

# Steps 4-5: Add B-roll + UGC filter
node scripts/broll.js \
  --avatar-video output/ad_raincloud_aurora.mp4 \
  --audio output/ad_raincloud_audio.mp3 \
  --product-image https://example.com/product.jpg \
  --product-name "Rain Cloud Humidifier" \
  --splice-at 4.5 \
  --broll-duration 5 \
  --ugc-filter \
  --output output/final.mp4

# Step 6: Whisper captions
node scripts/transcribe_captions.js \
  --audio output/ad_raincloud_audio.mp3 \
  --video output/final.mp4 \
  --output output/final_captioned.mp4
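
The three commands above can also be assembled programmatically, for example to drive them from another Node script. The sketch below only builds the argument lists and does not run anything. It assumes, based solely on the paths in the Quick Start, that generate.js writes `<output>_aurora.mp4` and `<output>_audio.mp3` next to the requested output; `quickStartCommands` and the `slug` parameter are hypothetical.

```javascript
const path = require("node:path");

// Hypothetical helper: returns the three Quick Start commands as argv arrays.
// Assumption: intermediate files use the _aurora.mp4 / _audio.mp3 suffixes seen above.
function quickStartCommands({ slug, product, productDesc, avatar, productImage, outDir = "output" }) {
  const base = path.posix.join(outDir, `ad_${slug}`);
  const audio = `${base}_audio.mp3`;
  const spliced = path.posix.join(outDir, "final.mp4");
  return [
    ["node", "scripts/generate.js", "--product", product, "--product-desc", productDesc,
      "--avatar", avatar, "--output", `${base}.mp4`],
    ["node", "scripts/broll.js", "--avatar-video", `${base}_aurora.mp4`, "--audio", audio,
      "--product-image", productImage, "--product-name", product,
      "--splice-at", "4.5", "--broll-duration", "5", "--ugc-filter", "--output", spliced],
    ["node", "scripts/transcribe_captions.js", "--audio", audio, "--video", spliced,
      "--output", path.posix.join(outDir, "final_captioned.mp4")],
  ];
}

const commands = quickStartCommands({
  slug: "raincloud",
  product: "Rain Cloud Humidifier",
  productDesc: "USB cool mist humidifier. 300ml tank, LED glow, silent mode.",
  avatar: "avatars/my_avatar.png",
  productImage: "https://example.com/product.jpg",
});
commands.forEach((argv) => console.log(argv.join(" ")));
```

Each array can be passed to `child_process.spawnSync(argv[0], argv.slice(1), { stdio: "inherit" })` to run the steps in order, which avoids shell quoting issues with the product description.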

Scripts

Script                   Description
-----------------------  --------------------------------------------------------------
generate.js              Main pipeline: script → voice → Aurora talking head
broll.js                 B-roll splice + optional UGC filter (grain + shake on avatar)
transcribe_captions.js   Whisper word-level caption overlay
aurora_only.js           Generate Aurora talking head only (skip script/voice)
batch.js                 Run pipeline for multiple products
product_in_hand.js       Generate product-in-hand composite image

broll.js Options

Flag              Default    Description
----------------  ---------  ----------------------------------------------------------
--avatar-video    required   Path to Aurora talking head MP4
--audio           required   Original voiceover MP3 (plays continuously)
--product-image   required   URL or local path to product image
--product-name    required   Product name (used in Kling prompt)
--splice-at       4.5        Seconds into avatar video where B-roll inserts
--broll-duration  5          B-roll duration: 5 or 10 seconds
--ugc-filter      off        Add grain + handheld shake to avatar (product stays clean)
--output          required   Output MP4 path
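
To make the defaults and constraints in the options table concrete, here is a minimal sketch of how such flags could be parsed and validated with Node's built-in `util.parseArgs` (the `default` option needs Node 18.11 or later). This is an illustration of the documented interface, not the actual broll.js source; `parseBrollArgs` is a hypothetical name.

```javascript
const { parseArgs } = require("node:util");

// Hypothetical parser mirroring the documented broll.js flags and defaults.
function parseBrollArgs(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      "avatar-video": { type: "string" },
      audio: { type: "string" },
      "product-image": { type: "string" },
      "product-name": { type: "string" },
      "splice-at": { type: "string", default: "4.5" },
      "broll-duration": { type: "string", default: "5" },
      "ugc-filter": { type: "boolean", default: false },
      output: { type: "string" },
    },
  });
  for (const flag of ["avatar-video", "audio", "product-image", "product-name", "output"]) {
    if (!values[flag]) throw new Error(`Missing required flag --${flag}`);
  }
  const spliceAt = Number(values["splice-at"]);
  if (!Number.isFinite(spliceAt) || spliceAt < 0) {
    throw new Error("--splice-at must be a non-negative number of seconds");
  }
  const brollDuration = Number(values["broll-duration"]);
  if (![5, 10].includes(brollDuration)) {
    throw new Error("--broll-duration must be 5 or 10");
  }
  return { ...values, "splice-at": spliceAt, "broll-duration": brollDuration };
}

const required = ["--avatar-video", "a.mp4", "--audio", "a.mp3",
  "--product-image", "p.jpg", "--product-name", "Demo", "--output", "out.mp4"];
console.log(parseBrollArgs(required));
```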

Cost Estimate

Step       Model                    Cost/video
---------  -----------------------  -----------
Script     GPT-4o-mini              ~$0.01
Voice      ElevenLabs               ~$0.05
Avatar     Aurora (fal.ai)          ~$1.00
B-roll     Kling 2.6 Pro (fal.ai)   ~$0.40
Captions   Whisper API              ~$0.01
Total                               ~$1.47–1.75
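
The per-step figures in the cost table sum to the low end of the quoted total. The short sketch below reproduces that arithmetic and extends it to a batch and to a monthly budget. It works in integer cents to avoid floating-point rounding; the function names are illustrative, and the prices are the approximate ones listed in this document, not live rates.

```javascript
// Approximate per-video costs from the cost table, in US cents.
const COST_CENTS = { script: 1, voice: 5, avatar: 100, broll: 40, captions: 1 };

// Sum of all pipeline steps for one video, in cents.
function costPerVideoCents(costs = COST_CENTS) {
  return Object.values(costs).reduce((sum, c) => sum + c, 0);
}

// Estimated cost in USD for a batch of videos.
function batchCostUSD(videoCount, costs = COST_CENTS) {
  return (costPerVideoCents(costs) * videoCount) / 100;
}

// How many whole videos fit in a given budget (USD).
function videosForBudget(budgetUSD, costs = COST_CENTS) {
  return Math.floor((budgetUSD * 100) / costPerVideoCents(costs));
}

console.log(costPerVideoCents()); // 147 cents, i.e. ~$1.47
console.log(batchCostUSD(10));    // 14.7
console.log(videosForBudget(300)); // 204
```

At these estimates, a $300 monthly budget covers roughly 204 videos at the low end of the range.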

Requirements

  • FAL_KEY — fal.ai API key (Aurora + Kling)
  • ELEVENLABS_API_KEY — ElevenLabs API key
  • OPENAI_API_KEY — OpenAI API key (GPT-4o-mini + Whisper)
  • ffmpeg — for video splicing and UGC filter
  • uv — for Python caption overlay (via skill-tiktok-ads-video)
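
A quick preflight check can catch a missing key or binary before any paid API call is made. The sketch below is one way to do that, under the assumption that the three environment variables and the two command-line tools listed above are everything the pipeline needs; `missingEnv` and `hasBinary` are hypothetical helpers, not part of the skill.

```javascript
const { spawnSync } = require("node:child_process");

const REQUIRED_ENV = ["FAL_KEY", "ELEVENLABS_API_KEY", "OPENAI_API_KEY"];

// Returns the names of required environment variables that are unset or empty.
function missingEnv(env = process.env) {
  return REQUIRED_ENV.filter((name) => !env[name]);
}

// Returns true if the binary can be launched and exits with status 0.
function hasBinary(name, versionFlag) {
  const result = spawnSync(name, [versionFlag], { stdio: "ignore" });
  return !result.error && result.status === 0;
}

console.log("Missing env vars:", missingEnv());
console.log("ffmpeg available:", hasBinary("ffmpeg", "-version"));
console.log("uv available:", hasBinary("uv", "--version"));
```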

Avatar Requirements

  • Portrait photo with the face clearly visible (no heavy makeup or accessories that obscure the mouth)
  • Resolution: 512×512 minimum, 1024×1024 recommended
  • File format: PNG or JPG
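
The resolution requirement can be checked before uploading an avatar. The sketch below reads the dimensions from a PNG header (width and height are big-endian 32-bit integers at byte offsets 16 and 20, inside the IHDR chunk) and classifies them against the 512 and 1024 pixel thresholds above. It handles PNG only; JPG would need a separate parser. Both function names are illustrative.

```javascript
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Reads width and height from the IHDR chunk at the start of a PNG buffer.
function pngSize(buf) {
  if (buf.length < 24 || !buf.subarray(0, 8).equals(PNG_SIGNATURE)) {
    throw new Error("Not a PNG file");
  }
  return { width: buf.readUInt32BE(16), height: buf.readUInt32BE(20) };
}

// Classifies dimensions against the documented minimum and recommended sizes.
function checkAvatarSize({ width, height }) {
  if (width < 512 || height < 512) return "too-small";
  if (width < 1024 || height < 1024) return "ok";
  return "recommended";
}

// Build a minimal fake PNG header (1024x768) to demonstrate the check.
const header = Buffer.alloc(24);
PNG_SIGNATURE.copy(header, 0);
header.writeUInt32BE(13, 8);      // IHDR chunk length
header.write("IHDR", 12, "ascii");
header.writeUInt32BE(1024, 16);   // width
header.writeUInt32BE(768, 20);    // height
console.log(checkAvatarSize(pngSize(header))); // "ok"
```

With a real file, the buffer would come from `fs.readFileSync("avatars/my_avatar.png")`.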

Version History

1 version in total

  • v1.2.0 (current)
    2026-03-30 12:24, security status: safe

Security Scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
