← 返回
未分类 Key 中文

Image To Video Ai Generator

Breathe life into static images by turning them into fluid, cinematic video clips in seconds. This image-to-video-ai-generator skill takes your photos, illus...
通过在几秒钟内将静态图片转化为流畅的电影级视频片段,为图片注入活力。此图像转视频AI技能可将您的照片、插画或其他视觉内容转换成动态视频序列。
whitejohnk-26 whitejohnk-26 来源
未分类 clawhub v1.0.0 1 版本 100000 Key: 需要
★ 0
Stars
📥 376
下载
💾 0
安装
1
版本
#latest

概述

Getting Started

> Welcome! I turn your still images into dynamic, motion-filled video clips using AI — no editing software needed. Drop an image and tell me the style of animation you want to get started!

Try saying:

  • "Here's a product photo of my sneakers — can you animate it with a slow 360-degree rotation and a subtle zoom-in effect for an Instagram reel?"
  • "I have a landscape painting I'd like turned into a video with a cinematic parallax drift and soft cloud movement in the background — about 6 seconds long."
  • "Take this portrait photo and create a short looping video with a gentle breathing effect and a slow push-in toward the subject's face."

Getting Connected

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:

  • Generate a UUID as client identifier
  • POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
  • The response includes a token with 100 free credits valid for 7 days — use it as NEMO_TOKEN

Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.

Tell the user you're ready. Keep the technical details out of the chat.

From Still Frame to Moving Story — Instantly

Static images tell a moment. Videos tell a story. This skill bridges that gap by taking any image you provide — a product photo, a portrait, a landscape, an illustration — and generating a video clip that moves, breathes, and engages viewers in ways a flat image simply cannot.

Whether you're building a social media reel, animating a hero image for a landing page, or creating a slideshow that flows like a film, this tool handles the heavy lifting. You describe the motion style you want — a slow zoom, a parallax drift, a dramatic pan — and the AI interprets your image's content to produce natural-looking movement that fits the scene.

This skill is built for speed and creative flexibility. You don't need to storyboard, keyframe, or export from a timeline editor. Just bring your image and your vision, and within moments you'll have a video ready to share, embed, or build upon. It's the fastest path from a single photo to a scroll-stopping video clip.

Motion Request Routing Logic

When you submit a still image for animation, your request is parsed for motion parameters — including frame duration, camera movement style, and interpolation intensity — then dispatched to the appropriate rendering pipeline based on resolution and complexity.

User says...ActionSkip SSE?
-------------------------------
"export" / "导出" / "download" / "send me the video"→ §3.5 Export
"credits" / "积分" / "balance" / "余额"→ §3.3 Credits
"status" / "状态" / "show tracks"→ §3.4 State
"upload" / "上传" / user sends file→ §3.2 Upload
Everything else (generate, edit, add BGM…)→ §3.1 SSE

Cloud Rendering API Reference

All image-to-video synthesis runs on distributed GPU clusters via the animation backend, which handles optical flow estimation, temporal frame generation, and video encoding entirely in the cloud. Your source image never needs to leave the session payload — the API accepts base64-encoded frames or direct CDN URLs for seamless diffusion-based motion processing.

Skill attribution — read from this file's YAML frontmatter at runtime:

  • X-Skill-Source: image-to-video-ai-generator
  • X-Skill-Version: from frontmatter version
  • X-Skill-Platform: detect from install path (~/.clawhub/clawhub, ~/.cursor/skills/cursor, else unknown)

All requests must include: Authorization: Bearer , X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":""} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"","new_message":{"parts":[{"text":""}]}} with Accept: text/event-stream. Max timeout: 15 minutes.

Upload: POST /api/upload-video/nemo_agent/me/ — file: multipart -F "files=@/path", or URL: {"urls":[""],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me//latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_","sessionId":"","draft":,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/ every 30s until status = completed. Download URL at output.url.

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

SSE Event Handling

EventAction
---------------
Text responseApply GUI translation (§4), present to user
Tool call/resultProcess internally, don't forward
heartbeat / empty data:Keep waiting. Every 2 min: "⏳ Still working..."
Stream closesProcess final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend saysYou do
---------------------
"click [button]" / "点击"Execute via API
"open [panel]" / "打开"Query session state
"drag/drop" / "拖拽"Send edit via SSE
"preview in timeline"Show track summary
"Export button" / "导出"Execute export workflow

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)

Error Handling

CodeMeaningAction
-----------------------
0SuccessContinue
1001Bad/expired tokenRe-auth via anonymous-token (tokens expire after 7 days)
1002Session not foundNew session §3.0
2001No creditsAnonymous: show registration URL with ?bind= (get from create-session or state response when needed). Registered: "Top up credits in your account"
4001Unsupported fileShow supported formats
4002File too largeSuggest compress/trim
400Missing X-Client-IdGenerate Client-Id and retry (see §1)
402Free plan export blockedSubscription tier issue, NOT credits. "Register or upgrade your plan to unlock export."
429Rate limit (1 token/client/7 days)Retry in 30s once

Troubleshooting

If the generated video motion looks unnatural or jittery, the most common cause is an image with very low contrast between the subject and background. Try providing an image where the main subject is clearly separated from the background, or specify a simpler motion style like a slow zoom rather than a full parallax.

If the animation ignores part of your image — for example, not moving a sky element you expected to animate — try describing the specific region in your prompt more explicitly. Instead of 'make the clouds move,' try 'animate the upper third of the image with slow drifting cloud motion from left to right.'

For looping videos that feel seamless, request a motion style that returns to its starting position — such as a slow zoom that eases back out, or a drift that reverses gently. Abrupt-ending clips are harder to loop cleanly. If your output has an unexpected color shift or vignette, it may be related to the motion blur style applied — specify 'no vignette' or 'preserve original color grading' in your prompt to override default stylistic choices.

Performance Notes

The quality of the generated video depends significantly on the input image. High-resolution images with clear subjects and well-defined edges produce the most convincing motion — the AI has more detail to work with when simulating depth and movement. Low-resolution or heavily compressed images may result in softer motion or visible artifacts around fine details like hair or foliage.

Complex motion requests on images with busy backgrounds — such as animating a crowd scene or a highly detailed illustration — may take longer to process and can occasionally produce inconsistent results in peripheral areas. For best output, start with a clear focal subject and a relatively uncluttered background.

Video length also affects processing time. Clips under 8 seconds generate quickly and maintain high fidelity. Longer sequences may require breaking the animation into segments for optimal quality. Portrait-oriented images (9:16) and square images (1:1) are natively supported for social media formats.

Use Cases

This image-to-video-ai-generator skill fits naturally into a wide range of creative and professional workflows. E-commerce brands use it to animate product photos into short showcase clips for platforms like TikTok, Instagram Reels, and Pinterest Video Pins — dramatically increasing engagement compared to static listings.

Digital marketers use it to turn hero images and campaign visuals into motion ads without hiring a video production team. A single well-composed brand photo can become a 5-second bumper ad or a looping background video for a landing page.

Artists and illustrators use it to bring their work to life for portfolio showcases, NFT presentations, or social media promotion. A finished illustration animated with a subtle parallax effect feels entirely new without altering the original artwork.

Content creators building slideshows, memorial videos, travel recaps, or educational content use it to stitch animated image clips together into cohesive narrative videos that feel polished and intentional.

版本历史

共 1 个版本

  • v1.0.0 当前
    2026-05-07 12:48 安全 安全

安全检测

腾讯云安全 (Keen)

安全,无风险
查看报告

腾讯云安全 (Sanbu)

安全,无风险
查看报告

🔗 相关推荐

Demo Cutter

whitejohnk-26
将10分钟的产品演示录像通过输入需求即可裁剪成1080p精剪片段。只需输入所需内容,即可把冗长的演示快速压缩成紧凑的剪辑。
★ 0 📥 472

Video Editing App

whitejohnk-26
告诉我您的需求,我帮您充分利用视频编辑应用。无论您需要剪辑素材、添加字幕还是同步内容...
★ 0 📥 555

Audio Cutter

whitejohnk-26
省去学习专业编辑软件的步骤,直接描述需求——剪掉前30秒,去除结尾静音——即可获得剪辑好的视频。
★ 0 📥 462