概述

IMA Video AI — Video Generator

For complete API documentation, security details, all parameters, and Python examples, read SKILL-DETAIL.md.

⚠️ MANDATORY: You MUST read("SKILL-DETAIL.md") (full file — do NOT pass limit parameter) before your first video generation call. It contains the full API payload structure, error handling tables, and UX protocol that this summary omits. Skipping it or reading only part of it causes parameter errors and poor user experience.

Model ID Reference (CRITICAL)

Use exact model_id from this table. Do NOT infer from friendly names.

Friendly Name	model_id (t2v)	model_id (i2v)	Notes
---------------	---------------	----------------	-------
Wan 2.6	`wan2.6-t2v`	`wan2.6-i2v`	⚠️ -t2v/-i2v suffix
Kling O1	`kling-video-o1`	`kling-video-o1`	⚠️ video- prefix
Kling 2.6	`kling-v2-6`	`kling-v2-6`	⚠️ v prefix
Hailuo 2.3	`MiniMax-Hailuo-2.3`	`MiniMax-Hailuo-2.3`	⚠️ MiniMax- prefix
Hailuo 2.0	`MiniMax-Hailuo-02`	`MiniMax-Hailuo-02`	⚠️ 02 not 2.0
Vidu Q2	`viduq2`	`viduq2-pro`	⚠️ Different for t2v/i2v
Google Veo 3.1	`veo-3.1-generate-preview`	`veo-3.1-generate-preview`	⚠️ -generate-preview suffix
Sora 2 Pro	`sora-2-pro`	`sora-2-pro`	✅ Straightforward
Pixverse V5.5	`pixverse`	`pixverse`	✅ Same as friendly name
SeeDance 1.5 Pro	`doubao-seedance-1.5-pro`	`doubao-seedance-1.5-pro`	⚠️ doubao- prefix

User input aliases: 万/Wan → wan2.6-* · 可灵/Kling → kling-video-o1 · 海螺/Hailuo → MiniMax-Hailuo-2.3 · Veo/Google Veo → veo-3.1-generate-preview

Video Generation Modes

User intent	task_type	When to use
-------------	-----------	-------------
Text only, no image	`text_to_video`	"生成一段…视频" / "text to video"
Image as first frame	`image_to_video`	"把这张图动起来" / "图生视频"
Image as visual reference (not first frame)	`reference_image_to_video`	"参考这张图生成" / "像这张风格"
Two images (start + end)	`first_last_frame_to_video`	"首帧+尾帧" / "从A过渡到B"

Visual Consistency (IMPORTANT)

If user mentions "same character", "series", "multi-shot", or continues from a previous generation:

Do NOT use text_to_video (will produce different-looking results)
Use image_to_video or reference_image_to_video with previous result as reference
Read ima-knowledge-ai/references/visual-consistency.md if available

Pre-Check: Knowledge Base

If ima-knowledge-ai is installed, read before generating:

ima-knowledge-ai/references/video-modes.md — understand mode differences
ima-knowledge-ai/references/visual-consistency.md — if multi-shot or character continuity needed

If not installed: use this SKILL's model table and defaults.

Model Selection Priority

User preference (if explicitly stated) → highest priority
ima-knowledge-ai recommendation (if installed)
Fallback defaults:

Task	Default Model	model_id	Cost
------	--------------	----------	------
text_to_video	Wan 2.6	`wan2.6-t2v`	25 pts
text_to_video (premium)	Hailuo 2.3	`MiniMax-Hailuo-2.3`	38 pts
text_to_video (budget)	Vidu Q2	`viduq2`	5 pts
image_to_video	Wan 2.6	`wan2.6-i2v`	25 pts
first_last_frame	Kling O1	`kling-video-o1`	48 pts
reference_image	Kling O1	`kling-video-o1`	48 pts

User Input Parsing

Duration: 5秒→5 · 10秒→10 · 15秒→15 · 1分钟→use max (tell user "当前最长15秒")

Aspect ratio: 横屏/16:9→16:9 · 竖屏/9:16→9:16 · 方形/1:1→1:1

Resolution: 720P/1080P/4K (if model supports)

Budget: 最便宜→Vidu Q2 (5pts) · 最好→Kling O1 or Veo 3.1

Script Usage

# Text to video
python3 {baseDir}/scripts/ima_video_create.py \
  --api-key $IMA_API_KEY \
  --task-type text_to_video \
  --model-id wan2.6-t2v \
  --prompt "a puppy runs across a sunny meadow" \
  --user-id {user_id} \
  --output-json

# Image to video (accepts URLs and local file paths)
python3 {baseDir}/scripts/ima_video_create.py \
  --api-key $IMA_API_KEY \
  --task-type image_to_video \
  --model-id wan2.6-i2v \
  --prompt "camera slowly zooms in" \
  --input-images https://example.com/photo.jpg \
  --user-id {user_id} \
  --output-json

# First-last frame (exactly 2 images required)
python3 {baseDir}/scripts/ima_video_create.py \
  --api-key $IMA_API_KEY \
  --task-type first_last_frame_to_video \
  --model-id kling-video-o1 \
  --prompt "smooth transition" \
  --input-images first.jpg last.jpg \
  --user-id {user_id} \
  --output-json

Sending Results to User

# ✅ CORRECT: Use remote URL directly for inline video playback
message(action="send", media=video_url, caption="✅ 视频生成成功！\n• 模型：[Name]\n• 耗时：[X]s\n• 积分：[N pts]")

# Then send link for copying
message(action="send", message=f"🔗 视频链接：\n{video_url}")

# ❌ WRONG: Never download to local file (shows as attachment, not playable)

UX Protocol (Brief)

Pre-generation: "🎬 开始生成视频… 模型：[Name]，预计[X~Y]秒，消耗[N]积分"
Progress: Every 30-60s: "⏳ 视频生成中… [P]%" (cap at 95% until API returns success)
Success: Send video via media=video_url + send link text for sharing
Failure: Natural language error + suggest alternative models. Never show technical errors to users. See SKILL-DETAIL.md for full error translation table.

Never say to users: script names, API endpoints, attribute_id, technical parameter names. Only: model name · time · credits · result · status.

Sora 2 Pro Content Policy

Sora has strict OpenAI content safety: ❌ people, celebrities, IP assets. ✅ landscapes, abstract, animals, nature. If rejected, suggest Wan 2.6 or Kling O1.

Environment

Base URL: https://api.imastudio.com

Headers: Authorization: Bearer $IMA_API_KEY · x-app-source: ima_skills · x_app_language: en

Core Flow

GET /open/v1/product/list?app=ima&platform=web&category= → get attribute_id, credit, model_version, form_config
[Image tasks only] Upload images or pass local paths to script
POST /open/v1/tasks/create → get task_id
POST /open/v1/tasks/detail → poll every 8s until resource_status==1

MANDATORY: Always query product list first. attribute_id is required — if 0 or missing, task fails.

User Preference Memory

Storage: ~/.openclaw/memory/ima_prefs.json

Save when user explicitly says "用XXX" / "默认用XXX" / "always use XXX"
Clear when user says "用最好的" / "推荐一个" / "自动选择"
Never save auto-selected or fallback models as preferences

Pixverse Special Case (v1.0.7+)

Pixverse V5.5/V5/V4 lack model in form_config. Script auto-infers from model_name (e.g. "Pixverse V5.5" → model: "v5.5"). No manual action needed.

⚠️ REMINDER: read("SKILL-DETAIL.md") (full file, no limit) is required before generating video. This file is a summary — SKILL-DETAIL.md has the complete API reference, error translation table, and UX protocol you need for correct execution.

版本历史

共 1 个版本

v1.0.6 当前

2026-05-03 02:44 安全安全

安全检测

腾讯云安全 (Keen)

安全，无风险

查看报告

腾讯云安全 (Sanbu)