
Aliyun Wan Digital Human

Use when generating talking, singing, or presentation videos from a single character image and audio with Alibaba Cloud Model Studio digital-human model `wan...
Author: cinience
Source: clawhub
Version: v1.0.0 (latest, 1 version)
API key: required
Stars: 0 · Downloads: 340 · Installs: 2

Overview

Category: provider

Model Studio Digital Human

Validation

mkdir -p output/aliyun-wan-digital-human
python -m py_compile skills/ai/video/aliyun-wan-digital-human/scripts/prepare_digital_human_request.py && echo "py_compile_ok" > output/aliyun-wan-digital-human/validate.txt

Pass criteria: command exits 0 and output/aliyun-wan-digital-human/validate.txt is generated.

Output And Evidence

  • Save normalized request payloads, chosen resolution, and task polling snapshots under output/aliyun-wan-digital-human/.
  • Record image/audio URLs and whether the input image passed detection.

Use this skill to generate speaking, singing, or presenting characters driven by a single image plus an audio track.

Critical model names

Use these exact model strings:

  • wan2.2-s2v-detect
  • wan2.2-s2v

Selection guidance:

  • Run wan2.2-s2v-detect first to validate the image.
  • Use wan2.2-s2v for the actual video generation job.
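The detect-then-generate ordering above can be sketched as a small gate. This is a minimal sketch, not the skill's actual implementation: `run_pipeline`, `detect`, and `generate` are hypothetical names, and the two callables stand in for whatever client code really talks to Model Studio. Only the two model strings come from this document.

```python
from typing import Callable, Dict

# Exact model strings listed in this document.
DETECT_MODEL = "wan2.2-s2v-detect"
GENERATE_MODEL = "wan2.2-s2v"


def run_pipeline(
    image_url: str,
    audio_url: str,
    detect: Callable[[Dict[str, str]], bool],
    generate: Callable[[Dict[str, str]], Dict[str, str]],
) -> Dict[str, str]:
    """Run detection first; submit generation only if the image passes.

    `detect` and `generate` are hypothetical, caller-supplied callables.
    `detect` returns True when the image passed detection.
    """
    detect_request = {"model": DETECT_MODEL, "image_url": image_url}
    if not detect(detect_request):
        # Stop here rather than submitting a generation job.
        raise ValueError("image failed detection; generation not submitted")

    generate_request = {
        "model": GENERATE_MODEL,
        "image_url": image_url,
        "audio_url": audio_url,
    }
    return generate(generate_request)
```

Passing the network calls in as arguments keeps the ordering rule testable without any credentials or network access.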

Prerequisites

  • Available in the China mainland (Beijing) region only.
  • Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
  • Input audio should contain clear speech or singing, and input image should depict a clear subject.
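The two credential sources above could be resolved in order as follows. This is a sketch under stated assumptions: the document does not specify the format of ~/.alibabacloud/credentials, so the code assumes an INI-style file and accepts the key from any section. The helper name `load_api_key` is hypothetical.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def load_api_key(credentials_path: Optional[Path] = None) -> Optional[str]:
    """Return the API key from the environment, else from the credentials file."""
    key = os.environ.get("DASHSCOPE_API_KEY")
    if key:
        return key

    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    if not path.is_file():
        return None

    # Assumption: the file is INI-style. Interpolation is disabled so that
    # a '%' character inside the key is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(path, encoding="utf-8")
    except configparser.Error:
        return None

    # Assumption: the section name is unknown, so every section is checked.
    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback=None)
        if value:
            return value
    return None
```

The environment variable takes precedence here so that a per-shell override wins over the shared file; that ordering is a design choice of this sketch, not something the document states.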

Normalized interface (video.digital_human)

Detect Request

  • model (string, optional): default wan2.2-s2v-detect
  • image_url (string, required)

Generate Request

  • model (string, optional): default wan2.2-s2v
  • image_url (string, required)
  • audio_url (string, required)
  • resolution (string, optional): 480P or 720P
  • scenario (string, optional): talk, sing, or perform
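A normalized Generate Request with the fields listed above might be assembled and validated like this. The function name `build_generate_request` is hypothetical and is not taken from the skill's script; the allowed values and the default model are the ones listed in this section.

```python
from typing import Dict, Optional

# Allowed values as listed in the Generate Request fields.
ALLOWED_RESOLUTIONS = ("480P", "720P")
ALLOWED_SCENARIOS = ("talk", "sing", "perform")


def build_generate_request(
    image_url: str,
    audio_url: str,
    resolution: Optional[str] = None,
    scenario: Optional[str] = None,
    model: str = "wan2.2-s2v",
) -> Dict[str, str]:
    """Build a normalized request dict, rejecting unsupported option values."""
    if not image_url or not audio_url:
        raise ValueError("image_url and audio_url are required")
    if resolution is not None and resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    if scenario is not None and scenario not in ALLOWED_SCENARIOS:
        raise ValueError(f"scenario must be one of {ALLOWED_SCENARIOS}")

    request = {"model": model, "image_url": image_url, "audio_url": audio_url}
    # Optional fields are included only when the caller set them.
    if resolution is not None:
        request["resolution"] = resolution
    if scenario is not None:
        request["scenario"] = scenario
    return request
```

Omitting unset optional fields leaves any defaulting to the downstream service rather than guessing a default in the client.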

Response

  • task_id (string)
  • task_status (string)
  • video_url (string, when finished)
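The normalized response can be held in a small typed container. This sketch assumes the response arrives as a plain mapping with the three keys above; `DigitalHumanResponse` and `parse_response` are hypothetical names. The document does not list the possible `task_status` values, so the status is kept as an opaque string.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class DigitalHumanResponse:
    task_id: str
    task_status: str
    # Present only once the task has finished.
    video_url: Optional[str] = None


def parse_response(payload: Mapping[str, Any]) -> DigitalHumanResponse:
    """Validate the required keys and wrap the payload in a dataclass."""
    missing = [key for key in ("task_id", "task_status") if key not in payload]
    if missing:
        raise KeyError(f"response is missing required field(s): {missing}")
    return DigitalHumanResponse(
        task_id=str(payload["task_id"]),
        task_status=str(payload["task_status"]),
        video_url=payload.get("video_url"),
    )
```

A polling loop can call `parse_response` on each snapshot and treat a non-empty `video_url` as the signal that the result is ready.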

Quick start

python skills/ai/video/aliyun-wan-digital-human/scripts/prepare_digital_human_request.py \
  --image-url "https://example.com/anchor.png" \
  --audio-url "https://example.com/voice.mp3" \
  --resolution 720P \
  --scenario talk

Operational guidance

  • Use a portrait, half-body, or full-body image with a clear face and stable framing.
  • Match audio length to the desired output duration; the output follows the audio length up to the model limit.
  • Keep image and audio as public HTTP/HTTPS URLs.
  • If the image fails detection, do not proceed directly to video generation.
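The URL requirement above can be partly checked before submitting anything. This is a minimal sketch using only the standard library; `is_http_url` is a hypothetical helper. It verifies the scheme and the presence of a host, but it cannot confirm that the URL is publicly reachable.

```python
from urllib.parse import urlparse


def is_http_url(url: str) -> bool:
    """Return True when the URL uses http/https and names a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_inputs(image_url: str, audio_url: str) -> None:
    """Raise ValueError for inputs that are local paths or non-HTTP URLs."""
    for name, url in (("image_url", image_url), ("audio_url", audio_url)):
        if not is_http_url(url):
            raise ValueError(f"{name} must be an HTTP or HTTPS URL, got: {url!r}")
```

Running this check first catches local file paths early, before a detection request is spent on an unusable input.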

Output location

  • Default output: output/aliyun-wan-digital-human/request.json
  • Override base dir with OUTPUT_DIR.
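The output path resolution might look like the following. This is a sketch under one explicit assumption: OUTPUT_DIR is taken to replace the leading `output` directory, with the `aliyun-wan-digital-human/request.json` suffix kept. The document does not say exactly which part of the path the variable overrides, and the helper names are hypothetical.

```python
import json
import os
from pathlib import Path
from typing import Any, Mapping, Optional


def resolve_request_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the request.json path, honoring OUTPUT_DIR when it is set."""
    env = os.environ if env is None else env
    # Assumption: OUTPUT_DIR replaces only the base "output" directory.
    base = Path(env.get("OUTPUT_DIR") or "output")
    return base / "aliyun-wan-digital-human" / "request.json"


def save_request(
    payload: Mapping[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Write the normalized request payload as JSON and return its path."""
    path = resolve_request_path(env)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps(dict(payload), indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return path
```

Accepting `env` as a parameter lets the override be exercised in tests without mutating the real process environment.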

References

  • references/sources.md

Version history

1 version in total

  • v1.0.0 (current)
    2026-05-07 05:46, security status: safe

Security checks

  • Tencent Cloud Security (Keen): safe, no risk
  • Tencent Cloud Security (Sanbu): safe, no risk
