0. First Contact

When the user opens this skill or sends their first message, greet them immediately:

> 📹 Welcome! I can generate and edit videos for you. Share a video file or tell me your idea!

Try saying:

"edit my video"
"help me create a short video"
"add effects to this clip"

IMPORTANT: Always greet the user proactively on first contact, and let them know you're setting up while connecting.

Auto-Setup

When the user first interacts, set up the connection:

Check token: If NEMO_TOKEN env var is set, use it. Otherwise:
Read or generate Client-ID:

Read ~/.config/nemovideo/client_id if it exists
Otherwise generate a UUID, save it to ~/.config/nemovideo/client_id

Acquire anonymous token:

```bash
curl -s -X POST "$API/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID"
```

Store the returned token as NEMO_TOKEN for this session. You get 100 free credits.

Create a session (§3.0) so you're ready to work immediately.

Let the user know briefly: "Setting things up… ready!" then proceed with their request.

Converse Your Way to Polished Video Output

Most video tools demand you learn their interface before you can produce anything useful. The grok-ai-video-generator-free skill flips that relationship entirely — you describe what you want in everyday language, and the skill handles the technical execution. Whether you are assembling a product walkthrough, cutting a highlight reel, or generating a scene from a written concept, the interaction feels less like operating software and more like briefing a skilled editor.

Behind each request, the OpenClaw agent interprets your intent, breaks it into discrete video operations, and coordinates the generation or editing pipeline accordingly. It understands context across a conversation, so follow-up instructions like 'make the intro shorter' or 'swap the background music' apply cleanly to prior work without starting over from scratch.

This skill is particularly well-suited for creators who iterate rapidly — draft a version, critique it in chat, refine it, and export. Because the agent retains session context, multi-step projects stay coherent from first prompt to final render. Output can be downloaded in your preferred container format including mp4, mov, avi, webm, and mkv.

Environment Variables

| Variable | Required | Default |
|----------|----------|---------|
| `NEMO_TOKEN` | No | Auto-generated (100 free credits, expires in 7 days, revocable via Settings → API Tokens) |
| `NEMO_API_URL` | No | `https://mega-api-prod.nemovideo.ai` |
| `NEMO_WEB_URL` | No | `https://nemovideo.com` |
| `NEMO_CLIENT_ID` | No | Auto-generated UUID, persisted to `~/.config/nemovideo/client_id` (UUID only, no secrets) |
| `SKILL_SOURCE` | No | Auto-detected from install path, fallback `unknown` |

If NEMO_TOKEN is not set, get one (requires X-Client-Id header):

# Generate or read persisted Client-Id
CLIENT_ID="${NEMO_CLIENT_ID:-$(cat ~/.config/nemovideo/client_id 2>/dev/null)}"
if [ -z "$CLIENT_ID" ]; then
  CLIENT_ID=$(uuidgen 2>/dev/null || echo "client-$(date +%s)-$RANDOM")
  mkdir -p ~/.config/nemovideo && echo "$CLIENT_ID" > ~/.config/nemovideo/client_id
fi
curl -s -X POST "$API/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID"
# → {"code":0,"data":{"token":"nmv_usr_xxx","credits":100,...}}

Save token as NEMO_TOKEN, CLIENT_ID as NEMO_CLIENT_ID. Anonymous: 1 token per client per 7 days; token expires in 7 days and can be revoked at any time via Settings → API Tokens on nemovideo.com. If your token expires, request a new one with the same X-Client-Id.
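The token can be pulled out of the response without extra tooling. The sketch below is one possible approach, assuming the response has the shape shown in the comment above; `extract_token` is a hypothetical helper name, not part of the API, and the `sed` pattern assumes the body contains a single `"token"` key.

```bash
# Hypothetical helper: print the value of the "token" field from a JSON body.
# Assumes exactly one "token" key, as in the sample response shown earlier.
extract_token() {
  printf '%s\n' "$1" |
    sed -n 's/.*"token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (performs a network call, so it is left commented out here):
# RESPONSE=$(curl -s -X POST "$API/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID")
# NEMO_TOKEN=$(extract_token "$RESPONSE")
```

If `jq` is available, a real JSON parser is more robust than a regular expression; the `sed` form is used here only to avoid an extra dependency.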

Local persistence: This skill writes ~/.config/nemovideo/client_id to persist the Client-Id across sessions. This avoids generating a new ID on every request, which would hit the per-IP rate limit quickly (default 10 tokens per 7 days per IP). The file contains only a UUID — no credentials are stored locally.

2. Routing Requests to the Correct Endpoint

Use the table below to determine which endpoint handles each type of user request.

| User says... | Action | Skip SSE |
|--------------|--------|----------|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
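The routing table can be approximated with a shell `case` statement. This is a minimal sketch, assuming simple case-insensitive substring matching on the keywords listed above; `route_request` is a hypothetical name, and the "user sends file" condition has to be detected separately because it is not visible in the message text.

```bash
# Hypothetical router: print the section number that should handle a message.
# Matching is by substring, which is an assumption; the table does not say
# how strictly the keywords should be matched.
route_request() {
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *export*|*导出*|*download*|*"send me the video"*) echo "3.5" ;;
    *credits*|*积分*|*balance*|*余额*)                 echo "3.3" ;;
    *status*|*状态*|*"show tracks"*)                   echo "3.4" ;;
    *upload*|*上传*)                                   echo "3.2" ;;
    *)                                                 echo "3.1" ;;
  esac
}
```

Substring matching is deliberately naive: a prompt such as "make a video about export markets" would be routed to §3.5, so a real implementation would probably need to judge the intent of the whole message rather than rely on keywords alone.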

3. Primary Workflow Procedures

$API = ${NEMO_API_URL:-https://mega-api-prod.nemovideo.ai}, $TOKEN = ${NEMO_TOKEN}, $WEB = ${NEMO_WEB_URL:-https://nemovideo.com}.

Skill attribution variables — do NOT hardcode:

$SKILL_NAME = this file's YAML frontmatter name
$SKILL_VERSION = this file's YAML frontmatter version
$SKILL_SOURCE = detect at runtime using the first match:

Env SKILL_SOURCE if set
If this file's path contains a known platform directory (e.g. ~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor)
Fallback: unknown
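The detection order above can be sketched as a small function. `detect_skill_source` is a hypothetical name, and the two path patterns are only the examples given above; other platform directories would need their own entries.

```bash
# Hypothetical helper: resolve $SKILL_SOURCE using the first matching rule.
# $1 is the path of this skill file.
detect_skill_source() {
  # Rule 1: an explicit environment variable wins.
  if [ -n "${SKILL_SOURCE:-}" ]; then
    printf '%s\n' "$SKILL_SOURCE"
    return
  fi
  # Rule 2: known platform directories in the install path.
  case "$1" in
    */.clawhub/*)       echo "clawhub" ;;
    */.cursor/skills/*) echo "cursor" ;;
    # Rule 3: fallback.
    *)                  echo "unknown" ;;
  esac
}
```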

CRITICAL: ALL API requests (including render/export/upload/state/credits) MUST include these headers. Missing them will cause export to fail with 402.

X-Skill-Source: $SKILL_NAME
X-Skill-Version: $SKILL_VERSION
X-Skill-Platform: $SKILL_SOURCE
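Because every authenticated request needs the same headers, one option is to wrap `curl` once and reuse the wrapper. The function name `nemo_curl` is hypothetical; it assumes `$TOKEN`, `$SKILL_NAME`, `$SKILL_VERSION`, and `$SKILL_SOURCE` are already set as described above.

```bash
# Hypothetical wrapper: adds the auth and attribution headers to a request.
# Any extra curl arguments (URL, -X, -d, -F, ...) are passed through unchanged.
nemo_curl() {
  curl -s \
    -H "Authorization: Bearer $TOKEN" \
    -H "X-Skill-Source: $SKILL_NAME" \
    -H "X-Skill-Version: $SKILL_VERSION" \
    -H "X-Skill-Platform: $SKILL_SOURCE" \
    "$@"
}

# Usage (network call, left commented out):
# nemo_curl "$API/api/credits/balance/simple"
```

Centralizing the headers makes it harder to forget them on a single call, which is the failure mode the 402 warning above describes.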

3.0 Establishing a Session

curl -s -X POST "$API/api/tasks/me/with-session/nemo_agent" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" \
  -d '{"task_name":"project","language":"<lang>"}'
# → {"code":0,"data":{"task_id":"...","session_id":"..."}}

A session must be initialized before any other operations can proceed. Each session provides the context required for subsequent API interactions.

Open in browser: After creating a session, give the user a link to view/edit the task in NemoVideo:

$WEB/workspace/claim?task={task_id}&session={session_id}&skill_name=$SKILL_NAME&skill_version=$SKILL_VERSION&skill_source=$SKILL_SOURCE
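The link can be assembled from the two IDs returned by the session call. A minimal sketch, assuming all values are already URL-safe (no encoding is applied); `claim_url` is a hypothetical helper name.

```bash
# Hypothetical helper: build the workspace link for a task and session.
# $1 = task_id, $2 = session_id. Values are inserted as-is, without URL-encoding.
claim_url() {
  web=${NEMO_WEB_URL:-https://nemovideo.com}
  printf '%s/workspace/claim?task=%s&session=%s&skill_name=%s&skill_version=%s&skill_source=%s\n' \
    "$web" "$1" "$2" "$SKILL_NAME" "$SKILL_VERSION" "$SKILL_SOURCE"
}
```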

3.1 Delivering Messages Through SSE

curl -s -X POST "$API/run_sse" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" --max-time 900 \
  -d '{"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}}'

All conversational messages are transmitted to the backend via a Server-Sent Events stream.

SSE Handling

| Event | Action |
|-------|--------|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Wait silently, don't forward |
| `heartbeat` / empty `data:` | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |

Typical durations: text 5-15s, video generation 100-300s, editing 10-30s.

Timeout: if only heartbeats arrive for 10 minutes, treat the request as timed out. Never re-send a message during generation; doing so creates duplicates and double-charges the user.

Ignore trailing "I encountered a temporary issue" if prior responses were normal.
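The heartbeat rule can be illustrated with a small stream filter. This is only a sketch: the exact framing of heartbeat events is not specified here, so the code assumes they arrive either as non-`data:` lines or as a `data:` line whose payload is the word `heartbeat`. `sse_payloads` is a hypothetical name.

```bash
# Hypothetical filter: read an SSE stream on stdin and print only the
# payloads of non-empty data lines. Heartbeats, empty data lines, comments,
# and blank separator lines are dropped (assumed framing, see above).
sse_payloads() {
  cr=$(printf '\r')
  while IFS= read -r line; do
    line=${line%"$cr"}                      # tolerate CRLF line endings
    case "$line" in
      "data:"|"data: ")                  continue ;;  # empty data: keep waiting
      "data: heartbeat"|"data:heartbeat") continue ;;  # heartbeat: keep waiting
      "data: "*) printf '%s\n' "${line#data: }" ;;
      "data:"*)  printf '%s\n' "${line#data:}" ;;
      *)         continue ;;                           # event:, comments, blanks
    esac
  done
}

# Usage (network call, left commented out):
# curl -s -N -X POST "$API/run_sse" ... | sse_payloads
```

The filter does not distinguish text responses from tool calls; that still has to be decided from the payload content as described in the table above.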

Silent Response Fallback (CRITICAL)

Approximately 30% of edit operations return no text in the response. When this occurs: poll the task state endpoint, wait for a completed status, retrieve the output asset URL, and present the result to the user without implying an error occurred.

Two-stage generation: After a raw video is produced, the backend automatically triggers a second processing stage that appends background music and a title overlay. Both stages must reach completion before the final asset URL is surfaced to the user.

3.2 Handling File Uploads

File upload: curl -s -X POST "$API/api/upload-video/nemo_agent/me/" -H "Authorization: Bearer $TOKEN" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" -F "files=@/path/to/file"

URL upload: curl -s -X POST "$API/api/upload-video/nemo_agent/me/" -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" -d '{"urls":["<url>"],"source_type":"url"}'

Use the literal string "me" as the user segment of the path; the backend resolves the actual user from the token.

Supported: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Reference media such as images or clips can be uploaded directly and attached to generation requests.

3.3 Checking Available Credits

curl -s "$API/api/credits/balance/simple" -H "Authorization: Bearer $TOKEN" \
  -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
# → {"code":0,"data":{"available":XXX,"frozen":XX,"total":XXX}}

Query the credits endpoint before initiating generation to confirm the user has a sufficient balance.
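A pre-flight check could look like the sketch below. It assumes the response shape shown above with a non-negative integer `available` field; `available_credits` and `has_credits` are hypothetical helper names, and the required amount is a caller-supplied guess because per-operation costs are not listed here.

```bash
# Hypothetical helper: print the "available" value from a balance response.
available_credits() {
  printf '%s\n' "$1" |
    sed -n 's/.*"available"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: succeed if at least $2 credits are available in body $1.
has_credits() {
  avail=$(available_credits "$1")
  [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# Usage (network call, left commented out):
# BALANCE=$(curl -s "$API/api/credits/balance/simple" -H "Authorization: Bearer $TOKEN" ...)
# has_credits "$BALANCE" 10 || echo "Not enough credits"
```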

3.4 Polling Task Status

curl -s "$API/api/state/nemo_agent/me/<sid>/latest" -H "Authorization: Bearer $TOKEN" \
  -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"

Use the literal string "me" as the user segment of the path; the backend resolves the actual user from the token.

Key fields: data.state.draft, data.state.video_infos, data.state.canvas_config, data.state.generated_media.

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

The draft is ready for export when draft.t exists and at least one track has a non-empty sg.
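The readiness rule can be checked mechanically. The sketch below assumes `jq` is installed and that the function receives the draft object itself (the value of data.state.draft); `draft_ready` is a hypothetical name.

```bash
# Hypothetical check: succeed if the draft has at least one track (t)
# whose segment list (sg) is non-empty. Requires jq.
# $1 = the draft object as a JSON string.
draft_ready() {
  printf '%s\n' "$1" |
    jq -e '(.t // []) | any((.sg // []) | length > 0)' >/dev/null
}

# Usage:
# draft_ready "$DRAFT_JSON" && echo "ready for export"
```

`jq -e` sets the exit status from the last output value, so a `false` result makes the function fail without any extra parsing.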

Track summary format:

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)

3.5 Exporting and Delivering the Final Asset

Export does NOT cost credits. Only generation/editing consumes credits.

To deliver the asset: (a) confirm task completion, (b) call the export endpoint, (c) poll the export job status, (d) retrieve the download URL from the response, and (e) present the link to the user.

b) Submit: curl -s -X POST "$API/api/render/proxy/lambda" -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" -d '{"id":"render_","sessionId":"<sid>","draft":<draft>,"output":{"format":"mp4","quality":"high"}}'

Note: sessionId is camelCase (exception). On failure → new id, retry once.

c) Poll (every 30s, max 10 polls): curl -s "$API/api/render/proxy/lambda/<id>" -H "Authorization: Bearer $TOKEN" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"

Status at top-level status: pending → processing → completed / failed. Download URL at output.url.

d) Download from output.url → send to user. Fallback: $API/api/render/proxy/<id>/download.

e) When delivering the video, always also give the task detail link: $WEB/workspace/claim?task={task_id}&session={session_id}&skill_name=$SKILL_NAME&skill_version=$SKILL_VERSION&skill_source=$SKILL_SOURCE

Progress messages: start "⏳ Rendering ~30s" → "⏳ 50%" → "✅ Video ready!" + file + task detail link.
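The polling step (every 30s, at most 10 polls) can be sketched as a loop that stops on a terminal status. To keep the example free of network calls, the fetch step is passed in as a command name; `poll_render` and its arguments are hypothetical, and the `sed` pattern assumes the body contains a single `"status"` key.

```bash
# Hypothetical poller.
# $1 = command that prints the render job JSON
# $2 = seconds between polls (default 30)
# $3 = maximum number of polls (default 10)
# Returns 0 and prints the final body on "completed", 1 on "failed",
# and 2 if the poll budget is exhausted.
poll_render() {
  fetch=$1
  interval=${2:-30}
  max=${3:-10}
  i=0
  while [ "$i" -lt "$max" ]; do
    body=$("$fetch")
    status=$(printf '%s\n' "$body" |
      sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1)
    case "$status" in
      completed) printf '%s\n' "$body"; return 0 ;;
      failed)    return 1 ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  return 2
}

# Usage: wrap the poll request from step c) in a function, then
# poll_render fetch_render_status 30 10
```

Returning distinct codes for failure and exhaustion lets the caller apply the "new id, retry once" rule only to real failures.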

3.6 Recovering from an SSE Disconnection

If the SSE stream drops unexpectedly, follow these steps: (1) capture the last known task ID before the connection was lost; (2) wait a minimum of three seconds before attempting reconnection; (3) re-establish the SSE stream using the same session token; (4) resume polling the task state endpoint with the saved task ID; (5) once a terminal status is confirmed, deliver the result to the user as normal.

4. Mapping GUI Elements to API Behavior

The backend operates under the assumption that a graphical interface is present, so GUI-specific instructions must never be forwarded verbatim in API payloads.

| Backend says | You do |
|--------------|--------|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Show state via §3.4 |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute §3.5 |
| "check account/billing" | Check §3.3 |

Keep content descriptions. Strip GUI actions.

5. Recommended Interaction Patterns

Acknowledge the user's request immediately, then begin the API call in the background so they are never left waiting in silence.
Surface progress updates at meaningful intervals by polling task status and relaying state changes conversationally.
When a silent response occurs, present the completed output naturally rather than flagging an error.
Always confirm credit availability before launching a generation task, and alert the user proactively if the balance is insufficient.
After export, deliver the final download URL with a brief description of the output so the user understands exactly what was produced.

6. Known Constraints and Limitations

Video generation is asynchronous; real-time streaming of rendered frames is not supported.
A single session token cannot be shared across concurrent requests simultaneously.
Maximum input prompt length and supported media formats are defined by the backend and cannot be overridden at the API layer.
Credit balances are read-only through this integration; top-ups must be completed through the platform's billing interface.
SSE connections may time out during long-running tasks, requiring the disconnection recovery procedure outlined in section 3.6.

7. Error Handling Reference

The table below maps common HTTP status codes and backend error identifiers to the appropriate recovery action.

| Code | Meaning | Action |
|------|---------|--------|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with `?bind=` (get from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see Environment Variables) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |

Common: no video → generate first; render fail → retry new id; SSE timeout → §3.6; silent edit → §3.1 fallback.
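The error table translates naturally into a lookup. In the sketch below the codes come from the table, while the action labels and the name `recovery_action` are hypothetical shorthand chosen for this example.

```bash
# Hypothetical lookup: map a backend or HTTP code to a short action label.
recovery_action() {
  case "$1" in
    0)    echo "continue" ;;
    1001) echo "reauth" ;;
    1002) echo "new-session" ;;
    2001) echo "no-credits" ;;
    4001) echo "show-supported-formats" ;;
    4002) echo "suggest-compress-or-trim" ;;
    400)  echo "generate-client-id-and-retry" ;;
    402)  echo "prompt-registration" ;;
    429)  echo "retry-once-after-30s" ;;
    *)    echo "unknown" ;;
  esac
}
```

Keeping 402 and 2001 as separate labels mirrors the table's point that a blocked export is a subscription issue, not a credit shortage.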

8. API Version and Permission Scopes

Before making any calls, verify that the API version targeted matches the version this skill was authored against. Token scopes must include permissions for session creation, messaging, upload, export, and credit inquiry; requests made with a token missing any required scope will be rejected with a 403 response.

Grok Ai Video Generator Free

Overview