
Xpilot Ad Maker

Generate a 30-second cinematic ad video with a consistent character, AI narration, brand overlays, and ambient music. Uses Vidu reference-to-video for character rendering.
jytech2023 (source)
Uncategorized clawhub v0.1.0 1 version 100000 Key: required

Overview

MedTravel Ad Maker

End-to-end pipeline that produces a polished 30-second medical-tourism ad video.

What it does

Given a destination (e.g. "Nanning, China"), a procedure (e.g. "dental implants"), and a brand name, this skill generates a complete 30-second ad with:

  1. Character continuity: one AI-generated protagonist appears in all 4 shots (uses Vidu's reference2video so the same person shows up in every scene without per-shot drift).
  2. Cinematic visuals: 4 storyboarded shots:
    • Pain point (high cost in patient's home country)
    • Modern destination clinic
    • Wellness recovery in scenic location
    • Triumphant outcome with brand CTA
  3. AI narration: Replicate Kokoro TTS (af_bella voice) generates per-shot voiceover, time-aligned to each scene.
  4. Background music: soft synthesized ambient pad (C-major triad, low-pass filtered, fade in/out).
  5. Brand overlays: top descriptive captions (so viewers understand the story instantly) plus bottom emerald-green brand text on each shot.
  6. Output: final MP4 uploaded to your Cloudflare R2 bucket, plus all intermediate clips for re-use.
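The background-music item above (C-major triad, low-pass filtered, fade in/out) can be approximated with ffmpeg's built-in sine source. The sketch below only builds an argument list for such a pad. The helper name buildPadArgs, the note frequencies (C4, E4, G4), the 800 Hz cutoff, the 2-second fades, and the volume level are assumptions chosen for illustration, not values taken from the skill's scripts.

```typescript
// Hypothetical helper: builds ffmpeg arguments that synthesize a C-major pad.
// Frequencies, cutoff, fade length and volume are illustrative assumptions.
function buildPadArgs(durationSec: number, outFile: string): string[] {
  const triadHz = [261.63, 329.63, 392.0]; // C4, E4, G4
  const fadeSec = 2;

  // One lavfi sine source per note of the triad.
  const inputs = triadHz.flatMap((hz) => [
    "-f", "lavfi",
    "-i", `sine=frequency=${hz}:duration=${durationSec}`,
  ]);

  // Mix the three tones, soften them with a low-pass, then fade in and out.
  const filter = [
    `[0:a][1:a][2:a]amix=inputs=${triadHz.length}`,
    "lowpass=f=800",
    `afade=t=in:st=0:d=${fadeSec}`,
    `afade=t=out:st=${durationSec - fadeSec}:d=${fadeSec}`,
    "volume=0.3",
  ].join(",");

  return ["-y", ...inputs, "-filter_complex", filter, outFile];
}

console.log(buildPadArgs(30, "pad.wav").join(" "));
```

The returned array is meant to be passed to a child-process call on the ffmpeg binary, so no shell quoting is needed around the filter string.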

How it works

Step 1: Wavespeed (Seedream 4.5) → 1 protagonist portrait → R2
Step 2: Vidu reference2video × 4 (parallel)  → 4 shot clips → R2
Step 3: Replicate Kokoro TTS × 4              → 4 narration clips
Step 4: ffmpeg concat                          → 30s silent video
Step 5: ffmpeg filter_complex                  → drawtext overlays + audio mix
Step 6: Upload final to R2
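Steps 2 and 3 use different concurrency: the Vidu shots are submitted in parallel, while the TTS calls are spaced out (the gotchas section mentions an 11-second gap). A minimal sketch of that orchestration follows. The function and type names (generateAssets, PipelineDeps, renderShot, synthesize) are hypothetical, since the real scripts' internals are not shown here, and the API calls are injected rather than implemented.

```typescript
// Hypothetical shapes: the real scripts' function names are not shown in this document.
type PipelineShot = { prompt: string; narration: string };
type PipelineDeps = {
  renderShot: (prompt: string) => Promise<string>; // resolves to a clip URL
  synthesize: (text: string) => Promise<string>;   // resolves to a narration file path
  sleep: (ms: number) => Promise<void>;
};

async function generateAssets(
  shots: PipelineShot[],
  deps: PipelineDeps,
  ttsGapMs = 11_000,
): Promise<{ clips: string[]; narrations: string[] }> {
  // Step 2: submit every shot at once; the wait is roughly the slowest shot.
  const clips = await Promise.all(shots.map((s) => deps.renderShot(s.prompt)));

  // Step 3: narration runs one call at a time, pausing between calls
  // to stay under a requests-per-minute limit.
  const narrations: string[] = [];
  for (let i = 0; i < shots.length; i++) {
    if (i > 0) await deps.sleep(ttsGapMs);
    narrations.push(await deps.synthesize(shots[i].narration));
  }

  return { clips, narrations };
}
```

Injecting the three dependencies keeps the sketch runnable without API keys; a real script would pass functions that call Vidu and Replicate and a timer-based sleep.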

Cost & timing

Per run (one full 30s ad):

| Item | Cost |
| --- | --- |
| Wavespeed Seedream 4.5 (1 portrait) | ~$0.04 |
| Vidu viduq2-pro reference2video × 4 | ~$2.50 (250 credits) |
| Replicate Kokoro TTS × 4 | ~$0.001 |
| Total | ~$2.55 |

End-to-end runtime: ~3 minutes (most time is Vidu video generation in parallel).
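The cost figures above can be checked with a few lines of arithmetic. This sketch copies the approximate values from the table; the variable names are illustrative, and the per-credit price is only what "$2.50 (250 credits)" implies, not a published rate.

```typescript
// Approximate per-run costs in USD, copied from the cost table.
const costs = {
  portrait: 0.04,   // Wavespeed Seedream 4.5, 1 image
  viduShots: 2.5,   // Vidu viduq2-pro reference2video, 4 shots (250 credits)
  narration: 0.001, // Replicate Kokoro TTS, 4 clips
};

const fullRun = costs.portrait + costs.viduShots + costs.narration;
const perShot = costs.viduShots / 4;
const usdPerCredit = costs.viduShots / 250;

console.log(fullRun.toFixed(3));      // 2.541 (the table lists ~$2.55)
console.log(perShot.toFixed(3));      // 0.625
console.log(usdPerCredit.toFixed(2)); // 0.01
```

Vidu accounts for roughly 98% of the total, which is why re-running only the post-production step is so much cheaper than a full run.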

Required environment variables

  • VIDU_API_KEY: Vidu Platform API key (https://platform.vidu.com)
  • WAVESPEED_API_KEY: Wavespeed.ai API key (for the protagonist image)
  • REPLICATE_API_KEY: Replicate token (for Kokoro TTS)
  • R2_ACCOUNT_ID, R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, R2_BUCKET_NAME, R2_PUBLIC_URL: Cloudflare R2 (S3-compatible) for storage
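Because a full run spends Vidu credits, it can help to fail fast when a variable is missing. The following is a minimal sketch of such a check; the variable names come from the list above, while the helper name missingEnv is hypothetical and not part of the skill.

```typescript
// Variable names taken from the list of required environment variables.
const REQUIRED_ENV = [
  "VIDU_API_KEY", "WAVESPEED_API_KEY", "REPLICATE_API_KEY",
  "R2_ACCOUNT_ID", "R2_ACCESS_KEY_ID", "R2_SECRET_ACCESS_KEY",
  "R2_BUCKET_NAME", "R2_PUBLIC_URL",
] as const;

// Hypothetical helper: returns the names that are unset or blank.
function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => !env[name]?.trim());
}

// Typical use at the top of a script:
//   const missing = missingEnv(process.env);
//   if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(", ")}`);
```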

Required system binaries

  • node (≥ 18)
  • ffmpeg is bundled via the ffmpeg-static npm package — no system install needed.

Usage

# Customize the SHOTS array in make-xpilot-ad.ts with your storyboard,
# then run:
npx tsx make-xpilot-ad.ts

The script prints the final R2 URL at the end. To iterate on post-production (captions, narration, music) without re-spending Vidu credits, run:

npx tsx xpilot-ad-finalize.ts

This pulls the existing 4 video clips from R2, regenerates narration, and re-composites the final video. It spends no Vidu credits (only the negligible Kokoro TTS cost) and takes about 45 seconds.

Example output

Final 30-second ad (8 MB MP4) — narration, ambient music, brand overlays:

https://pub-22e3d3e3f43e400493bbd71306cae6bb.r2.dev/demo/medical-tourism-ad/v2/medtravel-final.mp4

Behind-the-scenes assets (all publicly hosted on R2):

  • Protagonist reference image (Wavespeed Seedream 4.5):
    https://pub-22e3d3e3f43e400493bbd71306cae6bb.r2.dev/demo/medical-tourism-ad/v2/reference-protagonist.png
  • Shot 1, sticker shock:
    https://pub-22e3d3e3f43e400493bbd71306cae6bb.r2.dev/demo/medical-tourism-ad/v2/shot-1-sticker-shock.mp4
  • Shot 2, Nanning clinic:
    https://pub-22e3d3e3f43e400493bbd71306cae6bb.r2.dev/demo/medical-tourism-ad/v2/shot-2-nanning-clinic.mp4
  • Shot 3, Bama wellness:
    https://pub-22e3d3e3f43e400493bbd71306cae6bb.r2.dev/demo/medical-tourism-ad/v2/shot-3-bama-wellness.mp4
  • Shot 4, Detian triumph:
    https://pub-22e3d3e3f43e400493bbd71306cae6bb.r2.dev/demo/medical-tourism-ad/v2/shot-4-detian-triumph.mp4

Notice the same protagonist appears in all 4 shots: that's the power of Vidu's reference2video mode, which this skill encapsulates.

Customization

To make this skill work for a different brand/vertical (e.g., "Mexican dental tourism", "Thai cosmetic surgery", "Korean LASIK"), edit:

  • REFERENCE_PROMPT — describe your protagonist
  • SHOTS[*].prompt — describe each scene
  • SHOTS[*].narration — what the voiceover says
  • SHOTS[*].brandText — bottom brand caption
  • SHOTS[*].topCaption — top descriptive caption

The pipeline (parallel submission, polling, R2 mirroring, ffmpeg composition) stays the same.
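As a rough picture of what gets edited, the sketch below shows one possible shape for the configuration. The field names (prompt, narration, brandText, topCaption) and the names REFERENCE_PROMPT and SHOTS come from the list above; the ShotConfig type name, the lintShots helper, and every sample string are invented for illustration and may not match the actual script.

```typescript
// Field names come from the customization list; the type name and sample values are illustrative.
type ShotConfig = {
  prompt: string;     // scene description sent to the video model
  narration: string;  // voiceover text
  brandText: string;  // bottom brand caption
  topCaption: string; // top descriptive caption
};

const REFERENCE_PROMPT = "Portrait of a friendly traveler in her 40s, natural light";

const SHOTS: ShotConfig[] = [
  { prompt: "The traveler frowns at a dental bill at home", narration: "Implants at home cost too much.",
    brandText: "ExampleBrand", topCaption: "The price at home" },
  { prompt: "The traveler is greeted in a modern clinic", narration: "A modern clinic is waiting.",
    brandText: "ExampleBrand", topCaption: "A modern clinic" },
  { prompt: "The traveler relaxes by a river", narration: "Recover somewhere beautiful.",
    brandText: "ExampleBrand", topCaption: "Rest and recover" },
  { prompt: "The traveler smiles in front of a waterfall", narration: "Come home smiling.",
    brandText: "ExampleBrand", topCaption: "Book your trip" },
];

// Flags overlay text containing characters the gotchas section warns about (apostrophes and "%").
function lintShots(shots: ShotConfig[]): string[] {
  const problems: string[] = [];
  shots.forEach((shot, i) => {
    for (const field of ["brandText", "topCaption"] as const) {
      if (/['’%]/.test(shot[field])) problems.push(`SHOTS[${i}].${field}`);
    }
  });
  return problems;
}

console.log(REFERENCE_PROMPT.length > 0, lintShots(SHOTS)); // true []
```

Running a check like lintShots before a full run is one way to catch overlay text that would later fail in ffmpeg, after the Vidu credits have already been spent.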

Why Reference-to-Video?

Vidu has three video generation modes:

| Mode | Pros | Cons |
| --- | --- | --- |
| text2video | Simple | Each shot's character looks different |
| img2video | Visual continuity | Hard to change scenes (just continues motion) |
| reference2video | Same character across scenes | Slightly more setup |

For multi-shot ads with a recurring protagonist, reference2video is the only one of the three modes that keeps the character consistent while still allowing scene changes. This skill encapsulates that workflow.

Known gotchas (saved you the debugging time)

  1. Vidu CloudFront URLs contain an unencoded ; character. Do not URL-encode it, because that breaks the signature. Mirror each clip to R2 immediately.
  2. OpenAI / OpenRouter quotas run out fast. This skill uses Replicate Kokoro instead, which is very cheap.
  3. Replicate rate-limits accounts under $5 credit to 6 req/min, so the script adds 11s delays between TTS calls.
  4. ffmpeg drawtext apostrophe escaping is unreliable. Use full words instead ("should not" instead of "shouldn't").
  5. ffmpeg drawtext parses % as the start of a variable. Escape it or use words ("60 percent").
  6. Multiple drawtext filters whose text contains commas break when chained with the , separator. Use ; plus intermediate labels instead.
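The three drawtext gotchas above can be handled in code before the filter graph is built. The sketch below is one possible approach under stated assumptions: the names safeCaption and buildDrawtextGraph, the small contraction table, the font size, and the emerald hex color 0x10B981 are all hypothetical, and other drawtext-special characters (such as a colon inside the text) are not handled.

```typescript
// Illustrative sample of contractions; not an exhaustive list. Matching is case-sensitive.
const CONTRACTIONS: Record<string, string> = {
  "shouldn't": "should not",
  "don't": "do not",
  "can't": "cannot",
  "it's": "it is",
};

// Hypothetical helper: rewrites text to avoid apostrophes and "%" in drawtext.
function safeCaption(text: string): string {
  let out = text.replace(/’/g, "'");
  for (const [short, long] of Object.entries(CONTRACTIONS)) {
    out = out.replace(new RegExp(short, "g"), long);
  }
  out = out.replace(/\s*%/g, " percent");
  // Anything still containing an apostrophe needs a manual rewrite.
  if (out.includes("'")) throw new Error(`Apostrophe left in caption: ${text}`);
  return out;
}

type Caption = { text: string; y: string; color: string };

// Chains one drawtext filter per caption, separated by ";" with intermediate labels.
function buildDrawtextGraph(captions: Caption[]): string {
  return captions
    .map((c, i) => {
      const input = i === 0 ? "[0:v]" : `[v${i}]`;
      const output = `[v${i + 1}]`;
      const opts = `text='${safeCaption(c.text)}':fontsize=42:fontcolor=${c.color}:x=(w-text_w)/2:y=${c.y}`;
      return `${input}drawtext=${opts}${output}`;
    })
    .join(";");
}

const graph = buildDrawtextGraph([
  { text: "Save 60% on implants, with care", y: "60", color: "white" },
  { text: "ExampleBrand", y: "h-text_h-60", color: "0x10B981" },
]);
console.log(graph);
```

The last label in the chain ([v2] in this example) is what a later -map option would select. Depending on the ffmpeg build, drawtext may also need an explicit fontfile option.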

Version history

1 version in total

  • v0.1.0 (current)
    2026-05-07 19:57 · Safe
