Generate lip-sync videos by combining an image with a custom audio file using ComfyDeploy's UGC-MANUAL workflow.
UGC-Manual takes an image with a visible face and an audio file, and produces a video in which the person in the image lip-syncs to the audio.
Endpoint: https://api.comfydeploy.com/api/run/deployment/queue
Deployment ID: 075ce7d3-81a6-4e3e-ab0e-7a25edf601b5
| Input | Description | Formats |
|---|---|---|
| image | Image with a visible face | JPG, PNG |
| input_audio | Audio file to lip-sync | MP3, WAV, OGG |
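As a rough illustration of how the endpoint, deployment ID, and the two inputs above might fit together, the sketch below builds (but does not send) a queue request. The JSON body shape (`deployment_id` plus an `inputs` object), the bearer-token header, the `COMFYDEPLOY_API_KEY` variable name, and the helper name `build_queue_request` are assumptions rather than details taken from this document; the inputs are also assumed to be passed as URLs.

```python
import json
import os
import urllib.request

ENDPOINT = "https://api.comfydeploy.com/api/run/deployment/queue"
DEPLOYMENT_ID = "075ce7d3-81a6-4e3e-ab0e-7a25edf601b5"


def build_queue_request(image_url: str, audio_url: str, api_key: str) -> urllib.request.Request:
    """Build (without sending) a POST request that queues a UGC-Manual run.

    Assumption: the API accepts a JSON body with `deployment_id` and `inputs`,
    authenticated with a bearer token.
    """
    payload = {
        "deployment_id": DEPLOYMENT_ID,
        # Input names come from the inputs table: `image` and `input_audio`.
        "inputs": {"image": image_url, "input_audio": audio_url},
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical environment variable name for the API key.
    request = build_queue_request(
        "https://example.com/image.jpg",
        "https://example.com/audio.mp3",
        os.environ.get("COMFYDEPLOY_API_KEY", ""),
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call; passing the result to `urllib.request.urlopen` would actually submit it.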
uv run ~/.clawdbot/skills/ugc-manual/scripts/generate.py \
--image "path/to/image.jpg" \
--audio "path/to/audio.mp3" \
--output "output-video.mp4"
uv run ~/.clawdbot/skills/ugc-manual/scripts/generate.py \
--image "https://example.com/image.jpg" \
--audio "https://example.com/audio.mp3" \
--output "result.mp4"
# 1. Convert Telegram voice message to MP3 (if needed)
ffmpeg -i voice.ogg -acodec libmp3lame -q:a 2 voice.mp3
# 2. Generate lip-sync video
uv run ugc-manual... --image face.jpg --audio voice.mp3 --output video.mp4
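The conversion step above can also be scripted. The following is a minimal sketch, assuming ffmpeg is on `PATH`; the helper names `ffmpeg_mp3_command` and `convert_to_mp3` are hypothetical and not part of the skill. It uses the same ffmpeg flags as the command shown in step 1.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def ffmpeg_mp3_command(src: str, dst: Optional[str] = None) -> List[str]:
    """Return the ffmpeg argument list that converts `src` to MP3.

    If `dst` is omitted, the output path is `src` with its suffix replaced by .mp3.
    """
    out = dst or str(Path(src).with_suffix(".mp3"))
    return ["ffmpeg", "-i", src, "-acodec", "libmp3lame", "-q:a", "2", out]


def convert_to_mp3(src: str, dst: Optional[str] = None) -> str:
    """Run the conversion and return the output path; raises if ffmpeg is missing or fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    cmd = ffmpeg_mp3_command(src, dst)
    subprocess.run(cmd, check=True)
    return cmd[-1]


if __name__ == "__main__":
    # Show the command that would be run for a Telegram voice message.
    print(" ".join(ffmpeg_mp3_command("voice.ogg")))
```

Building the argument list in a separate function avoids shell quoting issues and lets the command be checked without invoking ffmpeg.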
| Feature | UGC-Manual | VEED-UGC |
|---|---|---|
| Audio source | User provides | Generated from brief |
| Script | N/A | Auto-generated |
| Voice | User's recording | ElevenLabs TTS |
| Use case | Custom audio | Automated content |
ffmpeg installed on the system