Wan 2.6 & 2.5 — AI Video & Image Generation by Alibaba

Author: xixihhhh
Category: Content creation · Source: clawhub · Version: v1.0.2 (2 versions) · API key: required

Overview

Generate AI videos and images using Alibaba's Wan 2.6 and Wan 2.5 — featuring text-to-video, image-to-video, video-to-video, text-to-image, and image editing with up to 1080p resolution, 15-second duration, multi/single camera modes, audio-guided generation, and built-in prompt expansion.

Wan 2.6 is the latest flagship model with cinematic motion quality, multi-camera shot types, audio URL input for guided generation, and video-to-video transfer with character-level prompt control. Wan 2.5 offers a cost-effective alternative with 480p support and Flash variants for rapid prototyping.

> Data usage note: This skill sends text prompts, image URLs, audio URLs, and video files to the Atlas Cloud API (api.atlascloud.ai) for generation. No data is stored locally beyond the downloaded output files. API usage incurs charges based on resolution, duration, and model selected.


Key Capabilities

  • Text-to-Video — Generate video clips from text descriptions with optional audio (2.6 / 2.5)
  • Image-to-Video — Animate still images into dynamic video (2.6 / 2.5)
  • Video-to-Video — Transfer style or replace characters in existing videos using characterX prompt notation (2.6)
  • Text-to-Image — Generate images from text descriptions, 27 preset sizes (2.6)
  • Image Editing — Edit images with prompt-based instructions, up to 4 reference images (2.6)
  • Audio-Guided Generation — Provide an audio URL to guide video generation with synchronized sound (2.6)
  • Multi/Single Camera — multi_camera for dynamic shots, single_camera for stable framing (2.6)
  • Prompt Expansion — Built-in prompt optimization for better results
  • Up to 1080p — Resolutions: 480p, 720p, 1080p
  • Up to 15s — Duration: 5/10/15 seconds (2.6), 5/10 seconds (2.5)
  • Flash Variants — Fast, budget-friendly generation for drafts (2.6 I2V Flash, 2.5 Fast)

Setup

  1. Sign up at https://www.atlascloud.ai
  2. Console → API Keys → Create new key
  3. Set env: export ATLASCLOUD_API_KEY="your-key"

The API key is tied to your Atlas Cloud account and its pay-as-you-go balance. All usage is billed to this account. Atlas Cloud does not currently support scoped keys — the key grants access to all models available on your account.
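
A minimal sketch of how a script might pick up the key exported in step 3. The helper name `auth_headers` is hypothetical; the header shape follows the curl examples in this document (a Bearer token plus a JSON content type).

```python
import os


def auth_headers(env=os.environ):
    """Build request headers from ATLASCLOUD_API_KEY (hypothetical helper)."""
    key = env.get("ATLASCLOUD_API_KEY", "").strip()
    if not key:
        # Fail early with a pointer to the Setup steps instead of sending
        # an unauthenticated request.
        raise RuntimeError("ATLASCLOUD_API_KEY is not set; see the Setup steps")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Because the key grants access to every model on the account, reading it from the environment rather than hard-coding it keeps it out of source control.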


Script Usage

This skill includes Python scripts for both video and image generation. Zero external dependencies required.

List available models

python scripts/generate_video.py list-models
python scripts/generate_image.py list-models

Generate a video

python scripts/generate_video.py generate \
  --model "alibaba/wan-2.6/text-to-video" \
  --prompt "Your prompt here" \
  --output ./output \
  duration=5

Generate an image

python scripts/generate_image.py generate \
  --model "alibaba/wan-2.6/text-to-image" \
  --prompt "Your prompt here" \
  --output ./output

Image-to-video

python scripts/generate_video.py generate \
  --model "alibaba/wan-2.6/image-to-video" \
  --image "https://example.com/photo.jpg" \
  --prompt "Animate this scene" \
  --output ./output \
  resolution=1080p duration=5

Upload a local file

python scripts/generate_video.py upload ./local-file.jpg

Run python scripts/generate_video.py generate --help or python scripts/generate_image.py generate --help for all options. Extra model params can be passed as key=value.
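
The bundled scripts' own parsing is not shown here, so the following is only a sketch of one way trailing key=value arguments could be turned into model parameters. The function name `parse_extra_params` and the JSON-based type coercion are assumptions, not the scripts' documented behaviour.

```python
import json


def parse_extra_params(pairs):
    """Turn ["duration=5", "resolution=1080p"] into a parameter dict.

    Values that are valid JSON (numbers, true/false, null) are converted;
    anything else, such as 1080p or 1280*720, stays a string.
    """
    params = {}
    for pair in pairs:
        key, sep, raw = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        try:
            params[key] = json.loads(raw)
        except json.JSONDecodeError:
            params[key] = raw
    return params
```

With this approach `duration=5` becomes the integer 5 and `generate_audio=true` becomes a boolean, which matches the types listed in the parameter tables.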


Pricing

Wan 2.6 — Video Models (per second, by resolution)

All video prices are per second of video generated. Atlas Cloud pricing varies by resolution.

Text-to-Video & Video-to-Video

| Resolution | fal.ai | Atlas Cloud | Savings |
|:----------:|:------:|:-----------:|:-------:|
| 480p | - | $0.04/s | - |
| 720p | $0.10/s | $0.08/s | 20% off |
| 1080p | $0.15/s | $0.12/s | 20% off |

Image-to-Video

| Resolution | fal.ai | Atlas Cloud | Savings |
|:----------:|:------:|:-----------:|:-------:|
| 720p | $0.10/s | $0.10/s | - |
| 1080p | $0.15/s | $0.15/s | - |

Image-to-Video Flash

| Resolution | Atlas Cloud |
|:----------:|:-----------:|
| All | $0.018/s |

Wan 2.6 — Image Models (per image)

| Model | Original | Atlas Cloud | Savings |
|-------|:--------:|:-----------:|:-------:|
| Text-to-Image | ~~$0.03~~ | $0.021 | 30% off |
| Image Edit | ~~$0.035~~ | $0.021 | 40% off |

Wan 2.5 — Video Models (per second, flat rate)

| Model | Atlas Cloud | Duration |
|-------|:-----------:|----------|
| Text-to-Video | $0.035/s | 5/10 seconds |
| Image-to-Video | $0.035/s | 5/10 seconds |

> fal.ai pricing sourced from fal.ai/models/wan.
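
Since video is billed per second, the cost of a clip is the per-second rate times its duration. The sketch below works that out from the Atlas Cloud rates in the pricing tables of this section; the table name `PER_SECOND_USD` and the function `estimate_video_cost` are illustrative, and the rates are a snapshot that may change.

```python
from decimal import Decimal

# Atlas Cloud per-second rates copied from this section's pricing tables.
# "*" marks a flat rate that applies to every resolution.
WAN26_T2V_RATES = {"480p": "0.04", "720p": "0.08", "1080p": "0.12"}
PER_SECOND_USD = {
    "alibaba/wan-2.6/text-to-video": WAN26_T2V_RATES,
    "alibaba/wan-2.6/video-to-video": WAN26_T2V_RATES,
    "alibaba/wan-2.6/image-to-video": {"720p": "0.10", "1080p": "0.15"},
    "alibaba/wan-2.6/image-to-video-flash": {"*": "0.018"},
    "alibaba/wan-2.5/text-to-video": {"*": "0.035"},
    "alibaba/wan-2.5/image-to-video": {"*": "0.035"},
}


def estimate_video_cost(model, resolution, seconds):
    """Return the estimated charge in USD as a Decimal."""
    rates = PER_SECOND_USD[model]
    rate = rates.get(resolution, rates.get("*"))
    if rate is None:
        raise ValueError(f"no listed price for {model} at {resolution}")
    # Decimal avoids binary floating-point rounding on currency amounts.
    return Decimal(rate) * seconds
```

For example, a 10-second 1080p Wan 2.6 text-to-video clip comes to 10 × $0.12 = $1.20, while a 5-second Flash draft costs 5 × $0.018 = $0.09.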


Available Models

Wan 2.6 Video

| Model ID | Type | Resolution | Duration |
|----------|------|:----------:|:--------:|
| alibaba/wan-2.6/text-to-video | Text-to-Video | 480p–1080p | 5/10/15s |
| alibaba/wan-2.6/image-to-video | Image-to-Video | 720p–1080p | 5/10/15s |
| alibaba/wan-2.6/image-to-video-flash | Image-to-Video (Fast) | 720p–1080p | 5/10/15s |
| alibaba/wan-2.6/video-to-video | Video-to-Video | 480p–1080p | 5/10s |

Wan 2.6 Image

| Model ID | Type | Max Size |
|----------|------|:--------:|
| alibaba/wan-2.6/text-to-image | Text-to-Image | 2184×936 |
| alibaba/wan-2.6/image-edit | Image Editing | 24 presets |

Wan 2.5 Video

| Model ID | Type | Resolution | Duration |
|----------|------|:----------:|:--------:|
| alibaba/wan-2.5/text-to-video | Text-to-Video | 480p–1080p | 5/10s |
| alibaba/wan-2.5/image-to-video | Image-to-Video | 480p–1080p | 5/10s |

Parameters

Wan 2.6 — Text-to-Video

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| prompt | string | Yes | - | Video description |
| negative_prompt | string | No | - | What to exclude from the video |
| size | string | No | 1280*720 | Output size (see Size Options below) |
| duration | integer | No | 5 | 5, 10, or 15 seconds |
| shot_type | string | No | - | multi_camera for dynamic shots, single_camera for stable framing |
| audio | string | No | - | Audio URL to guide generation with synchronized sound |
| generate_audio | boolean | No | false | Generate synchronized audio |
| enable_prompt_expansion | boolean | No | false | Expand prompt for better results |
| seed | integer | No | random | For reproducible results |
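
The parameters above can be assembled into a request body with a little client-side checking. This is a sketch only: `build_t2v_payload` is a hypothetical helper, and the allowed values are taken from the table rather than from any published schema.

```python
ALLOWED_DURATIONS = (5, 10, 15)
ALLOWED_SHOT_TYPES = ("multi_camera", "single_camera")


def build_t2v_payload(prompt, *, size="1280*720", duration=5, shot_type=None,
                      negative_prompt=None, audio=None, generate_audio=False,
                      enable_prompt_expansion=False, seed=None):
    """Assemble a Wan 2.6 text-to-video request body (hypothetical helper)."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}")
    if shot_type is not None and shot_type not in ALLOWED_SHOT_TYPES:
        raise ValueError(f"shot_type must be one of {ALLOWED_SHOT_TYPES}")
    payload = {
        "model": "alibaba/wan-2.6/text-to-video",
        "prompt": prompt,
        "size": size,
        "duration": duration,
        "generate_audio": generate_audio,
        "enable_prompt_expansion": enable_prompt_expansion,
    }
    # Unset optional fields are omitted so the service applies its own defaults.
    optional = {"negative_prompt": negative_prompt, "shot_type": shot_type,
                "audio": audio, "seed": seed}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Rejecting an unsupported duration locally avoids paying for a round trip that the API would presumably refuse anyway.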

Wan 2.6 — Image-to-Video

Same as text-to-video (without size), plus:

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| image | string | Yes | - | Source image URL |
| resolution | string | No | 720p | 720p, 1080p |

Wan 2.6 — Video-to-Video

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| prompt | string | Yes | - | Video description (use character1, character2 to reference characters in video) |
| negative_prompt | string | No | - | What to exclude |
| videos | array | Yes | - | Source video URLs (max 100MB each, 2-30s duration) |
| size | string | No | 1280*720 | Output size |
| duration | integer | No | 5 | 5 or 10 seconds |
| shot_type | string | No | - | multi_camera or single_camera |
| enable_prompt_expansion | boolean | No | false | Expand prompt for better results |
| seed | integer | No | random | For reproducible results |

Wan 2.6 — Text-to-Image

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| prompt | string | Yes | - | Image description |
| negative_prompt | string | No | - | What to exclude |
| size | string | No | 1024*1024 | Output size (27 presets, see below) |
| enable_prompt_expansion | boolean | No | false | Expand prompt |
| enable_sync_mode | boolean | No | false | Wait for result synchronously |
| enable_base64_output | boolean | No | false | Return Base64 instead of URL |
| seed | integer | No | random | For reproducible results |

Wan 2.6 — Image Edit

Same as text-to-image, plus:

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| images | array | Yes | - | Images to edit (max 4, 384-5000px per side) |

Wan 2.5 — Text-to-Video

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| prompt | string | Yes | - | Video description |
| negative_prompt | string | No | - | What to exclude |
| size | string | No | 1280*720 | Output size (13 presets, including 480p) |
| duration | integer | No | 5 | 5 or 10 seconds |
| audio | string | No | - | Audio URL for guided generation |
| generate_audio | boolean | No | false | Generate synchronized audio |
| enable_prompt_expansion | boolean | No | false | Expand prompt |
| seed | integer | No | random | For reproducible results |

Wan 2.5 — Image-to-Video

Same as Wan 2.5 text-to-video (without size), plus:

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| image | string | Yes | - | Source image URL |
| resolution | string | No | 720p | 480p, 720p, 1080p |

Video Size Options (Wan 2.6)

T2V / V2V (10 presets):

`1280*720`, `720*1280`, `960*960`, `1920*1080`, `1080*1920`, `1280*960`, `960*1280`, `1920*816`, `816*1920`, `1280*544`
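
A small sketch of validating a requested video size against the ten Wan 2.6 T2V / V2V presets before submitting. It assumes each preset is written as width*height, matching the 1280*720 default in the parameter tables; the names `WAN26_VIDEO_SIZES` and `parse_video_size` are illustrative.

```python
# Wan 2.6 T2V / V2V presets, written as width*height like the size parameter.
WAN26_VIDEO_SIZES = frozenset({
    "1280*720", "720*1280", "960*960", "1920*1080", "1080*1920",
    "1280*960", "960*1280", "1920*816", "816*1920", "1280*544",
})


def parse_video_size(size):
    """Return (width, height) for a supported preset, else raise ValueError."""
    if size not in WAN26_VIDEO_SIZES:
        raise ValueError(
            f"unsupported size {size!r}; expected one of {sorted(WAN26_VIDEO_SIZES)}"
        )
    width, height = size.split("*")
    return int(width), int(height)
```

Note that the separator is an asterisk, so a value such as 1920x1080 is rejected by this check.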

Image Size Options (Wan 2.6 — 27 presets):

`1024*1024`, `1280*720`, `720*1280`, `1280*960`, `960*1280`, `1536*1024`, `1024*1536`, `1280*1280`, `1536*1536`, `2048*1024`, `1024*2048`, `1536*1280`, `1280*1536`, `1680*720`, `720*1680`, `2016*864`, `864*2016`, `1536*864`, `864*1536`, `2184*936`, `936*2184`, `1400*1050`, `1050*1400`, `1680*1050`, `1050*1680`, `1176*1176`, `1560*1560`


Workflow: Submit → Poll → Download

Text-to-Video Example (Wan 2.6)

# Step 1: Submit
curl -s -X POST "https://api.atlascloud.ai/api/v1/model/generateVideo" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "alibaba/wan-2.6/text-to-video",
    "prompt": "A drone shot flying over ancient ruins at golden hour, camera slowly descending toward a central courtyard",
    "size": "1920*1080",
    "duration": 10,
    "shot_type": "multi_camera",
    "generate_audio": true,
    "enable_prompt_expansion": true
  }'
# Returns: { "code": 200, "data": { "id": "prediction-id" } }

# Step 2: Poll (every 5 seconds until completed)
curl -s "https://api.atlascloud.ai/api/v1/model/prediction/{prediction-id}" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY"
# Returns: { "code": 200, "data": { "status": "completed", "outputs": ["https://...video-url..."] } }

# Step 3: Download
curl -o output.mp4 "VIDEO_URL_FROM_OUTPUTS"
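
The same three steps can be sketched in Python with only the standard library, in keeping with the zero-dependency scripts. The URLs and header names mirror the curl commands above; the function names are hypothetical and this is not the bundled scripts' implementation.

```python
import json
import shutil
import urllib.request

API_BASE = "https://api.atlascloud.ai/api/v1/model"


def submit_request(payload, api_key, endpoint="generateVideo"):
    """Step 1: build (but do not send) the POST request."""
    return urllib.request.Request(
        f"{API_BASE}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def poll_request(prediction_id, api_key):
    """Step 2: build the GET request for a prediction's status."""
    return urllib.request.Request(
        f"{API_BASE}/prediction/{prediction_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def download(url, path, timeout=60):
    """Step 3: stream the finished file to disk."""
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(path, "wb") as out:
        shutil.copyfileobj(resp, out)
```

A run would call `send(submit_request(payload, key))`, read `data.id` from the result, call `send(poll_request(...))` until the status is completed, and then pass the first entry of `data.outputs` to `download`. Building requests separately from sending them keeps the request shape easy to inspect without touching the network.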

Image-to-Video Example (Wan 2.6)

curl -s -X POST "https://api.atlascloud.ai/api/v1/model/generateVideo" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "alibaba/wan-2.6/image-to-video",
    "image": "https://example.com/landscape.jpg",
    "prompt": "The camera slowly zooms in as clouds drift across the sky and leaves rustle in the wind",
    "resolution": "1080p",
    "duration": 5,
    "generate_audio": true
  }'

Video-to-Video Example (Wan 2.6)

curl -s -X POST "https://api.atlascloud.ai/api/v1/model/generateVideo" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "alibaba/wan-2.6/video-to-video",
    "videos": ["https://example.com/original-video.mp4"],
    "prompt": "Transform character1 into a cartoon anime character, keep the background unchanged",
    "size": "1280*720",
    "duration": 5,
    "shot_type": "single_camera"
  }'

Audio-Guided Generation Example (Wan 2.6)

curl -s -X POST "https://api.atlascloud.ai/api/v1/model/generateVideo" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "alibaba/wan-2.6/text-to-video",
    "prompt": "A jazz band performing on stage, musicians playing saxophone and piano",
    "audio": "https://example.com/jazz-music.mp3",
    "size": "1920*1080",
    "duration": 10
  }'

Text-to-Image Example (Wan 2.6)

curl -s -X POST "https://api.atlascloud.ai/api/v1/model/generateImage" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "alibaba/wan-2.6/text-to-image",
    "prompt": "A cyberpunk cityscape at night, neon signs reflected in rain puddles, photorealistic",
    "size": "1680*720",
    "enable_prompt_expansion": true
  }'
# Returns: { "code": 200, "data": { "id": "prediction-id" } }

Image Editing Example (Wan 2.6)

curl -s -X POST "https://api.atlascloud.ai/api/v1/model/generateImage" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "alibaba/wan-2.6/image-edit",
    "prompt": "Change the background to a sunset beach scene, keep the person unchanged",
    "images": ["https://example.com/photo.jpg"],
    "size": "1280*720"
  }'

Image-to-Video Flash Example (Wan 2.6)

curl -s -X POST "https://api.atlascloud.ai/api/v1/model/generateVideo" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "alibaba/wan-2.6/image-to-video-flash",
    "image": "https://example.com/portrait.jpg",
    "prompt": "The person slowly turns and smiles",
    "resolution": "720p",
    "duration": 5
  }'

Polling Logic

  • processing / starting / running → wait 5s, retry (typically takes ~30-120s for video, ~5-10s for image)
  • completed / succeeded → done, get URL from data.outputs[]
  • failed → error, read data.error
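
The rules above map onto a short polling loop. In this sketch `fetch_status` is any zero-argument callable that returns the decoded JSON body of the prediction endpoint; that callable, the function name `wait_for_outputs`, and the overall timeout are assumptions added for illustration, since the document specifies only the 5-second interval.

```python
import time

DONE = {"completed", "succeeded"}


def wait_for_outputs(fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until the prediction finishes, then return data.outputs."""
    waited = 0.0
    while True:
        data = fetch_status()["data"]
        status = data.get("status")
        if status in DONE:
            return data["outputs"]
        if status == "failed":
            raise RuntimeError(data.get("error") or "generation failed")
        # processing / starting / running (or any unrecognised state) means
        # the job is not finished yet, so wait unless the time budget is spent.
        if waited >= timeout:
            raise TimeoutError(f"prediction still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

Treating unknown states as "still running" is a deliberate choice here: the timeout bounds the wait, so an unexpected status name cannot cause an endless loop.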

Atlas Cloud MCP Tools (if available)

If the Atlas Cloud MCP server is configured, use built-in tools:

atlas_generate_video(model="alibaba/wan-2.6/text-to-video", params={...})
atlas_generate_image(model="alibaba/wan-2.6/text-to-image", params={...})
atlas_get_prediction(prediction_id="...")

Implementation Guide

  1. Determine task type:
    • Text-to-video: user describes a scene/action in text → 2.6 T2V or 2.5 T2V
    • Image-to-video: user provides an image to animate → 2.6 I2V or 2.6 I2V Flash (budget)
    • Video-to-video: user wants to transform an existing video → 2.6 V2V
    • Text-to-image: user wants to generate an image → 2.6 T2I
    • Image editing: user wants to modify existing images → 2.6 Image Edit
  2. Choose model version:
    • Wan 2.6 (recommended): Latest generation, best quality, multi-camera, audio-guided, V2V
    • Wan 2.6 Flash: Budget I2V at $0.018/s — ideal for drafts
    • Wan 2.5: Cost-effective at $0.035/s flat rate, supports 480p
  3. Extract parameters:
    • Prompt: describe scene, action, camera movement
    • Negative prompt: exclude undesired elements ("blurry, distorted, watermark")
    • Resolution: 480p for drafts, 720p default, 1080p for final output
    • Duration: 5s default, up to 15s for 2.6, up to 10s for 2.5
    • Shot type (2.6 only): multi_camera for dynamic shots, single_camera for stable framing
    • Audio: provide URL for audio-guided generation, or set generate_audio: true
    • Prompt expansion: set enable_prompt_expansion: true for auto-optimized prompts
  4. Execute: POST to generateVideo/generateImage API → poll result → download
  5. Present result: show file path, offer to play/open
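
The first two steps of the guide amount to a lookup from task type and version to a model ID. The sketch below uses the model IDs listed under Available Models; the task labels, the `draft` flag, and the function name `choose_model` are illustrative choices, not part of the skill.

```python
MODELS = {
    ("text-to-video", "2.6"): "alibaba/wan-2.6/text-to-video",
    ("text-to-video", "2.5"): "alibaba/wan-2.5/text-to-video",
    ("image-to-video", "2.6"): "alibaba/wan-2.6/image-to-video",
    ("image-to-video", "2.5"): "alibaba/wan-2.5/image-to-video",
    ("video-to-video", "2.6"): "alibaba/wan-2.6/video-to-video",
    ("text-to-image", "2.6"): "alibaba/wan-2.6/text-to-image",
    ("image-edit", "2.6"): "alibaba/wan-2.6/image-edit",
}


def choose_model(task, version="2.6", draft=False):
    """Map a task type to a model ID, preferring Flash for 2.6 I2V drafts."""
    if task == "image-to-video" and version == "2.6" and draft:
        return "alibaba/wan-2.6/image-to-video-flash"
    try:
        return MODELS[(task, version)]
    except KeyError:
        raise ValueError(f"Wan {version} has no {task} model") from None
```

Defaulting to 2.6 follows the guide's recommendation, and requesting a task that only 2.6 supports with version 2.5 (for example video-to-video) raises an error rather than silently switching versions.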

Prompt Tips

  • Scene + Action: "A samurai draws his sword in a bamboo forest at dawn, mist rising from the ground"
  • Camera direction: "Camera slowly dollies forward...", "Aerial tracking shot of...", "First-person POV walking through..."
  • Multi-camera: Use shot_type: "multi_camera" with prompts like "Cut between close-up and wide shot..."
  • V2V character control: Reference characters as character1, character2 — e.g., "Transform character1 into an anime character"
  • Audio-guided: Provide an audio URL to sync video with music, dialogue, or sound effects
  • Negative prompts: "blurry, low quality, distorted faces, watermark, text overlay"

Model Comparison

| Feature | Wan 2.6 | Wan 2.5 |
|---------|:-------:|:-------:|
| Video T2V Price (720p) | $0.08/s | $0.035/s |
| Video I2V Price (720p) | $0.10/s | $0.035/s |
| Max Resolution | 1080p | 1080p |
| Max Duration | 15s | 10s |
| Shot Type Control | Yes | No |
| Audio-Guided | Yes | Yes |
| Video-to-Video | Yes | No |
| Text-to-Image | Yes ($0.021) | No |
| Image Editing | Yes ($0.021) | No |
| Flash/Fast Variants | I2V Flash ($0.018/s) | Yes |
| Prompt Expansion | Yes | Yes |

Version History

2 versions in total

  • v1.0.2 (current)
    2026-03-29 08:57, security scans: safe
  • v1.0.0
    2026-03-26 21:40

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
