Zyka AI

Generate AI videos, images, voice, and use AI apps using the Zyka CLI. Use when users want to create AI-generated media — videos (Sora, Veo, Kling, WAN, Seed...
Author: varshneyhars
Published on clawhub as v0.4.12 (latest, 1 version). API key: required.

Overview

Zyka AI Media Generation

Generate AI videos, images, and voice, and use AI-powered apps, directly from the terminal with the zyka CLI. One CLI covers 40+ AI models, and no code is required.

Setup

The zyka CLI is published on npm: https://www.npmjs.com/package/zyka

Source: https://github.com/kshitijdixit/zyka-sdk

Required: install the CLI once before using this skill. All examples below assume zyka is on your PATH.

npm install -g zyka            # one-time install (latest)
zyka --help                    # verify install

Set your API key (required — get one at https://zyka.ai/settings/api-keys):

export ZYKA_API_KEY=zk_live_...
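
The two setup requirements above (the binary on PATH and the API key in the environment) can be checked before any generation call. The following is a minimal sketch; zyka_preflight is a hypothetical helper name and is not part of the CLI.

```shell
# Preflight check: confirms the zyka binary is on PATH and ZYKA_API_KEY is set.
# zyka_preflight is a hypothetical helper for this sketch, not a zyka command.
zyka_preflight() {
  if ! command -v zyka >/dev/null 2>&1; then
    echo "zyka not found on PATH; run: npm install -g zyka" >&2
    return 1
  fi
  if [ -z "${ZYKA_API_KEY:-}" ]; then
    echo "ZYKA_API_KEY is not set; create one at https://zyka.ai/settings/api-keys" >&2
    return 2
  fi
  return 0
}
```

Calling zyka_preflight before a generate command lets a script stop early with a readable message instead of failing inside the CLI.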

Security & Privacy

Before using this skill, understand what runs and where data flows:

  • Local install required: the skill calls the locally installed zyka binary. No remote code is fetched at runtime. Running npm install -g zyka pulls the latest published version from npm (https://www.npmjs.com/package/zyka), so the installed version is not pinned; inspect the package source at https://github.com/kshitijdixit/zyka-sdk before installing if you want stronger assurance.
  • Outbound data flow: any local files you reference (images, audio, video) are uploaded to Zyka's API. Zyka proxies generation requests to third-party model providers (OpenAI, Google, ByteDance, ElevenLabs, Kling, etc.). Do not pass private personal data, proprietary content, credentials, or files you don't want third parties to see. Review https://zyka.ai privacy/terms for retention and downstream-provider policies.
  • API key handling: treat ZYKA_API_KEY as a service secret. Use a least-privilege key, never commit it, and rotate or delete it when no longer needed. Monitor usage in your Zyka dashboard.
  • Autonomous use: this skill is designed for agent-driven invocation. The agent can call Zyka endpoints (and spend credits) without an extra prompt once the skill is enabled. If you want explicit user confirmation per call, run commands interactively rather than relying on autonomous invocation.
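
For scripted use where each call should still be confirmed by a person, one option is a small wrapper that prints the command and waits for a yes/no answer. This is a sketch only: confirm_then_run is a hypothetical helper, and the source does not describe any built-in confirmation flag in the CLI.

```shell
# confirm_then_run CMD [ARGS...]
# Prints the command, asks for confirmation on stdin, and runs it only on "y"/"yes".
# Hypothetical helper for this sketch; not part of the zyka CLI.
confirm_then_run() {
  printf 'About to run: %s\n' "$*" >&2
  printf 'Proceed? [y/N] ' >&2
  read -r reply || reply=""
  case "$reply" in
    y|Y|yes|YES) "$@" ;;
    *) echo "Skipped." >&2; return 1 ;;
  esac
}
```

Example: confirm_then_run zyka generate image -m gpt_image_1 -p "A neon cyberpunk cityscape" -o ./city.png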

CRITICAL: Always use CLI commands, never write code scripts

When a user asks to generate media, run the zyka generate command directly — do NOT write JavaScript files. The CLI handles file uploads, waiting, and downloading.


Generate Videos

zyka generate video -m MODEL -p "prompt" [options]

Video Models

| Model | -m value | Key options |
|---|---|---|
| OpenAI Sora | sora | -s sora-2, -d 4/8/12, --size 1280x720 |
| Google Veo | veo | -s veo-3.1-generate-001, -d 4-8, -a 16:9/9:16, --size 720p/1080p/4k |
| Kling AI | kling | -s kling-v2-master, -d 5/10, -a 16:9, --mode std/pro |
| Kling V3 | kling | -s kling-v3 or kling-v3-pro, -d 3-15, --first-frame ./img.jpg |
| Kling O3 | kling | -s kling-o3 or kling-o3-pro, -d 3-15 |
| Kling Omni | kling | -s kling-video-o1, -d 3-10 |
| Kling Multi-Image | kling | -s multi-image-to-video (pass image_list via SDK) |
| ByteDance Seedance V1.5 Pro | bytedance | -s "Seedance V1.5 Pro", -d 4-12, --resolution 720p |
| ByteDance Seedance 2.0 | bytedance | -s "Seedance 2.0", -d 4-12, --resolution 720p |
| ByteDance Seedance 2.0 Fast | bytedance | -s "Seedance 2.0 Fast", -d 4-12, --resolution 720p |
| ByteDance OmniHuman | bytedance | -s "OmniHuman", --image ./face.jpg --audio ./speech.mp3 |
| ByteDance OmniHuman v1.5 | bytedance | -s "OmniHuman v1.5", --image ./face.jpg --audio ./speech.mp3 |
| Alibaba WAN T2V | wan | -s wan-2-6-t2v, -d 5/10/15, --size 1280*720 |
| Alibaba WAN 2.7 | wan | -s wan-2-7, -d 5/10/15, --size 1280*720 |
| Alibaba WAN I2V | wan | -s wan-2-6-i2v, --image ./img.jpg, -d 5/10/15 |
| Alibaba WAN I2V 2.5 | wan | -s wan-2-5-i2v, --image ./img.jpg, -d 5/10/15 |
| WAN Animate Replace | wan | -s wan-v2-2-animate-replace, --video ./vid.mp4 --image ./char.png |
| WAN Animate Move | wan | -s wan-v2-2-animate-move, --video ./vid.mp4 --image ./char.png |
| Talking Head | infinite_talk | --image ./face.jpg --audio ./speech.mp3 |
| Aurora (Lip Sync) | aurora | --video ./face.mp4 --audio ./speech.mp3, --audio-guidance-scale 7 |
| LTX Video T2V | ltx | -s ltx-2.3-text-to-video |
| LTX Video I2V | ltx | -s ltx-2.3-image-to-video, --image ./img.jpg |
| Grok Video | grok | -s grok-imagine-video, -d 1-15, --resolution 720p |
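
The -d values in the table mix fixed choices (such as 4/8/12) and ranges (such as 4-8), so a local check before spending credits may be useful. The sketch below encodes only the durations listed above. It assumes the ranges are inclusive whole seconds, which the source does not state; duration_ok and in_range are hypothetical helper names.

```shell
# in_range VALUE MIN MAX: true when MIN <= VALUE <= MAX (integers only).
in_range() { [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]; }

# duration_ok SUB_MODEL SECONDS
# Returns 0 if SECONDS is listed for SUB_MODEL in the table, 1 if it is not.
# Sub-models with no listed duration return 0 and are left to the API.
# Assumption: ranges such as 4-8 are inclusive whole seconds.
duration_ok() {
  sub=$1
  d=$2
  case "$d" in ''|*[!0-9]*) return 1 ;; esac   # reject empty or non-integer input
  case "$sub" in
    sora-2)
      case "$d" in 4|8|12) return 0 ;; *) return 1 ;; esac ;;
    kling-v2-master)
      case "$d" in 5|10) return 0 ;; *) return 1 ;; esac ;;
    wan-2-6-t2v|wan-2-7|wan-2-6-i2v|wan-2-5-i2v)
      case "$d" in 5|10|15) return 0 ;; *) return 1 ;; esac ;;
    veo-3.1-generate-001) in_range "$d" 4 8 ;;
    kling-v3|kling-v3-pro|kling-o3|kling-o3-pro) in_range "$d" 3 15 ;;
    kling-video-o1) in_range "$d" 3 10 ;;
    "Seedance V1.5 Pro"|"Seedance 2.0"|"Seedance 2.0 Fast") in_range "$d" 4 12 ;;
    grok-imagine-video) in_range "$d" 1 15 ;;
    *) return 0 ;;
  esac
}
```

Example: duration_ok sora-2 8 && zyka generate video -m sora -s sora-2 -p "prompt" -d 8 -o ./out.mp4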

Video Examples

Text to video:

zyka generate video -m wan -p "A cinematic sunset over mountains" -d 5 -o ./sunset.mp4

Image to video (animate a photo):

zyka generate video -m kling -s kling-v2-master -p "gentle zoom in with wind" --image ./photo.jpg -d 5 -a 16:9 --mode pro -o ./animated.mp4

Talking head (lip sync):

zyka generate video -m infinite_talk -p "lip sync" --image ./face.jpg --audio ./speech.mp3 -o ./talking.mp4

Aurora lip sync (video + audio):

zyka generate video -m aurora --video ./face.mp4 --audio ./speech.mp3 -o ./lipsync.mp4

First/last frame interpolation:

zyka generate video -m veo -s veo-3.1-generate-001 -p "smooth transition" --first-frame ./start.jpg --last-frame ./end.jpg -d 8 -a 16:9 -o ./transition.mp4

LTX video:

zyka generate video -m ltx -s ltx-2.3-text-to-video -p "A flowing river through autumn forest" -o ./ltx.mp4

Grok video:

zyka generate video -m grok -p "Medieval knight in mystical forest" -d 6 --resolution 720p -o ./knight.mp4

WAN animate replace (swap character in video):

zyka generate video -m wan -s wan-v2-2-animate-replace --video ./original.mp4 --image ./new-character.png -o ./swapped.mp4
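
The image-to-video example above can be repeated over a folder of photos with a short loop. This is a sketch under stated assumptions: animate_dir is a hypothetical helper, and ZYKA_BIN is an override variable defined by the sketch itself (not a CLI feature) so the commands can be printed with ZYKA_BIN=echo before any credits are spent.

```shell
# animate_dir IN_DIR OUT_DIR
# Runs the image-to-video example once per .jpg file in IN_DIR.
# ZYKA_BIN is a hypothetical override defined by this sketch, not a CLI feature;
# set ZYKA_BIN=echo to print the commands instead of running them.
animate_dir() {
  in_dir=$1
  out_dir=$2
  mkdir -p "$out_dir" || return 1
  for img in "$in_dir"/*.jpg; do
    [ -e "$img" ] || continue            # no matches: the glob stays literal
    name=$(basename "$img" .jpg)
    "${ZYKA_BIN:-zyka}" generate video -m kling -s kling-v2-master \
      -p "gentle zoom in with wind" --image "$img" \
      -d 5 -a 16:9 --mode pro -o "$out_dir/$name.mp4" || return 1
  done
}
```

Because the loop stops at the first failing call, a partial run leaves earlier outputs in OUT_DIR and can be resumed by removing the already-processed inputs.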

Generate Images

zyka generate image -m MODEL -p "prompt" [options]

Image Models

| Model | -m value | Notes |
|---|---|---|
| Nano Banana | nano_banana | -s nano-banana-1 (default), nano-banana-pro (4K), nano-banana-2 (fast 4K) |
| DALL·E 2 | dall_e_2 | --size 256x256/512x512/1024x1024 |
| DALL·E 3 | dall_e_3 | --quality standard/hd, --style vivid/natural |
| GPT Image 1 | gpt_image_1 | --background transparent, --quality auto/low/medium/high |
| GPT Image 1 Mini | gpt_image_1_mini | Cheaper variant |
| GPT Image 1.5 | gpt_image_1_5 | Latest OpenAI |
| Flux Schnell | flux_1_schnell | Fast |
| Flux 2 Dev | flux_2_dev | High quality |
| Flux 2 Klein 9B | flux_2_klein_9b | Compact high quality |
| Kling Image | kling | -s kling-v1, kling-v2, kling-image-v3, kling-image-v3-text-to-image, omni-image, kling-image-o1, multi-image-to-image |
| SD XL | stable_diffusion_xl_base_1_0 | --negative-prompt "blurry" |
| SD img2img | stable_diffusion_v1_5_img2img | Needs --image |
| Lucid Origin | lucid_origin | Leonardo AI |
| Phoenix 1.0 | phoenix_1_0 | Leonardo AI |
| Zyka Helion | zyka_helion | Fast Zyka-native |
| Grok Imagine | grok_imagine | xAI Grok |
| Qwen Image 2 Pro | qwen_image_2_pro | Chinese/English support |
| Z Image Turbo | z_image_turbo | Fast |

Image Examples

Generate from text:

zyka generate image -m gpt_image_1 -p "A neon cyberpunk cityscape" -o ./city.png

Edit an existing image:

zyka generate image -m nano_banana -s nano-banana-pro -p "make the hair straight" --image ./me.png -o ./result.png

4K high-res image:

zyka generate image -m nano_banana -s nano-banana-2 -p "cinematic portrait" --resolution 4K --size 5504x3072 -o ./portrait.png

Transparent background:

zyka generate image -m gpt_image_1 -p "product photo of sneakers" --background transparent -o ./sneakers.png

Grok image:

zyka generate image -m grok_imagine -p "Abstract golden particles, data visualization style" -o ./abstract.png

Qwen image:

zyka generate image -m qwen_image_2_pro -p "A serene mountain landscape at sunset" -o ./landscape.png

Zyka Helion (fast):

zyka generate image -m zyka_helion -p "cyberpunk cat in neon city" -o ./cat.png

Generate Text-to-Speech

zyka generate tts --script "text" [options]

TTS Providers

| Provider | --provider value | Notes |
|---|---|---|
| ElevenLabs | elevenlabs | Needs --voice-id |
| Qwen3 | qwen3 | Voice design/clone/custom |
| Chatterbox | chatterbox | Clone + emotion tags [happy], [sad], [angry], [calm] |
| VoxCPM | voxcpm | Voice cloning only (requires reference audio) |
| MiniMax | minimax | 17 preset voices (Wise_Woman, Friendly_Person, Deep_Voice_Man, etc.) |
| MOSS-TTS | moss-tts | RunPod-based |
| Fish Audio | fish-audio | Instant voice cloning |
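
Chatterbox emotion tags are written inline in the script text. A small helper can guard against typos by accepting only the four tags listed in the table. This is a sketch: tag_script is a hypothetical name, and placing the tag at the start of the text is an assumption that follows the style of the skill's own Chatterbox example.

```shell
# tag_script EMOTION TEXT
# Prints TEXT prefixed with a Chatterbox emotion tag, e.g. "[happy] Hello".
# Only the four tags listed in the provider table are accepted.
# Hypothetical helper for this sketch; not part of the zyka CLI.
tag_script() {
  case "$1" in
    happy|sad|angry|calm) printf '[%s] %s\n' "$1" "$2" ;;
    *) echo "unsupported emotion tag: $1" >&2; return 1 ;;
  esac
}
```

Example: zyka generate tts --provider chatterbox --voice ./my-voice.mp3 --script "$(tag_script happy 'This sounds like me!')" -o ./cloned.mp3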

TTS Examples

Generate speech:

zyka generate tts --provider elevenlabs --voice-id VOICE_ID --script "Welcome to Zyka" -o ./speech.mp3

Clone a voice:

zyka generate tts --provider chatterbox --voice ./my-voice.mp3 --script "[happy] This sounds like me!" -o ./cloned.mp3

Fish Audio:

zyka generate tts --provider fish-audio --voice ./sample.wav --script "Hello world" -o ./fish.mp3

MiniMax preset voice:

zyka generate tts --provider minimax --script "Hello! Welcome to MiniMax TTS." -o ./minimax.mp3

MiniMax with emotion and stereo:

zyka generate tts --provider minimax --script "Hello!" --emotion happy --channel 2 -o ./minimax.mp3

AI Apps (via SDK)

These apps are available through the ZykaClient SDK:

const { ZykaClient } = require('zyka-sdk');
const client = new ZykaClient();

Image Apps

| App | Method | Params |
|---|---|---|
| Upscale | createUpscale() | { image, resolution: '1k'/'2k'/'4k' } |
| Face Swap | createFaceSwap() | { type: 'image'/'video', url, face_image } |
| Virtual Try-On | createVirtualTryOn() | { human_image, cloth_image } |
| Outfit Swap | createOutfitSwap() | { user_image, character_image } |
| Skin Enhancer | createSkinEnhancer() | { image, type: 'perfect_skin'/'realistic_skin'/'imperfect_skin' } |
| Behind the Scene | createBehindTheScene() | { image, type: 'image'/'video' } |
| Camera Angles | createAngles() | { image, angle: { azimuth, elevation } } |
| 9 Shorts | createNineShorts() | { image } |
| Zooms | createZooms() | { image } |
| Story Generator | createStoryGenerator() | { image } |
| Holi Special | createHoliSpecial() | { image } |
| Simple App | createSimpleApp() | { image, app_id?, prompt? } |

Video Apps

| App | Method | Params |
|---|---|---|
| Caption Generator | createCaptionGenerator() | { url, language?, caption_style? } |
| Video to Script | createVideoToScript() | { url, script_style?: 'general'/'screenplay'/'blog_post'/'social_media'/'documentary' } |
| Video Cleaner | createVideoCleaner() | { url, language? } |
| Video Upscaler | createVideoUpscaler() | { video_url, target_resolution: '1080p'/'2k'/'4k', target_fps: '30fps'/'60fps' } |
| Video Dubbing (HeyGen) | createVideoDubbing() | { video_url, model: 'heygen', output_language: 'Hindi (India)', mode?, enable_caption?, enable_speech_enhancement?, translate_audio_only? } |
| Video Dubbing (ElevenLabs) | createVideoDubbing() | { video_url, model: 'elevenlabs', output_language: 'hi', source_lang?, num_speakers?, highest_resolution?, drop_background_audio?, use_profanity_filter? } |
| Video Dubbing (Sarvam) | createVideoDubbing() | { video_url, model: 'sarvam', output_language: 'Hindi' (or comma-separated), source_lang?, num_speakers?, genre?: 'general'/'news'/'entertainment'/'education'/'sports'/'religious' } |
| Dubbing Languages | getVideoDubbingLanguages(model) | model: 'heygen'/'elevenlabs'/'sarvam'; returns supported languages |
| Short Video Creator | createShortVideoCreator() | { url, clip_duration_sec: 'auto'/5/15/30/45 } |
| B-roll | createBroll() | { url, broll_duration_sec?: 'auto'/2-10 } |
| YouTube Downloader | createYouTubeDownloader() | { url, quality?: '720p', format?: 'mp4' } |
| Voice Changer | createVoiceChanger() | { source_audio_url, target_voice_url?, voice_strength? } |
| Image to SVG | createImageToSvg() | { image_url }

CLI App Commands

zyka generate upscale --image ./photo.jpg --resolution 4k -o ./upscaled.jpg
zyka generate face-swap --type image --url ./target.jpg --face ./face.jpg -o ./result.jpg
zyka generate skin-enhancer --image ./photo.jpg --type perfect_skin -o ./enhanced.jpg
zyka generate virtual-try-on --human ./me.jpg --cloth ./dress.jpg -o ./tryon.jpg
zyka generate outfit-swap --user-image ./me.jpg --character-image ./celeb.jpg -o ./outfit.jpg
zyka generate behind-the-scene --image ./photo.jpg --type image -o ./scene.jpg
zyka generate nine-shorts --image ./photo.jpg -o ./angles.jpg
zyka generate zooms --image ./photo.jpg -o ./zooms.jpg
zyka generate story-generator --image ./photo.jpg -o ./story.jpg
zyka generate holi-special --image ./photo.jpg -o ./holi.jpg
zyka generate simple-app --image ./photo.jpg --app-id my-app -o ./result.jpg
zyka generate caption --url ./video.mp4 --language en -o ./captioned.mp4
zyka generate video-to-script --url ./video.mp4 --script-style screenplay -o ./script.txt
zyka generate video-cleaner --url ./video.mp4 -o ./cleaned.mp4
zyka generate video-upscaler --url ./video.mp4 --resolution 4k -o ./upscaled.mp4
zyka generate video-dubbing --url ./video.mp4 --model heygen --language "Hindi (India)" --mode precision -o ./dubbed.mp4
zyka generate video-dubbing --url ./video.mp4 --model elevenlabs --language hi --source-lang en --num-speakers 2 --highest-resolution -o ./dubbed.mp4
zyka generate video-dubbing --url ./video.mp4 --model sarvam --language "Hindi,Tamil" --genre education -o ./dubbed.mp4
zyka generate short-video --url ./long.mp4 --duration auto -o ./clips/
zyka generate broll --url ./video.mp4 -o ./with-broll.mp4
zyka generate youtube-download --url "https://youtube.com/watch?v=..." --quality 720p -o ./video.mp4
zyka generate voice-changer --audio ./input.mp3 --voice ./reference.mp3 -o ./output.mp3
zyka generate image-to-svg --image ./photo.png -o ./result.svg
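
The app commands above do not share one input flag: upscale takes --image, while video-upscaler takes --url. A wrapper that picks the command from the file extension can hide that difference. The sketch below is illustrative only: upscale_any and ZYKA_BIN are hypothetical names defined here, and the accepted extensions are an assumption, since the source shows only .jpg for upscale and .mp4 for video-upscaler.

```shell
# upscale_any FILE
# Picks the image or video upscaler by file extension and writes the result
# next to the input as "<name>-4k.<ext>".
# ZYKA_BIN is a hypothetical dry-run override defined by this sketch, not a CLI
# feature. Assumption: the set of accepted extensions is a guess.
upscale_any() {
  f=$1
  ext=${f##*.}
  out="${f%.*}-4k.$ext"
  case "$ext" in
    jpg|jpeg|png)
      "${ZYKA_BIN:-zyka}" generate upscale --image "$f" --resolution 4k -o "$out" ;;
    mp4)
      "${ZYKA_BIN:-zyka}" generate video-upscaler --url "$f" --resolution 4k -o "$out" ;;
    *)
      echo "unsupported file type: $f" >&2; return 1 ;;
  esac
}
```

Setting ZYKA_BIN=echo prints the command that would run, which is a cheap way to check the flag choice before uploading anything.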

Guidelines

  • ALWAYS use zyka generate CLI commands — never write JavaScript files
  • Set ZYKA_API_KEY env var before running commands
  • Use -o ./filename to save results to disk (auto-downloads)
  • Use --image ./path to pass local image files (auto-uploads)
  • Use --audio ./path to pass local audio files (auto-uploads)
  • Use --video ./path to pass local video files (auto-uploads)
  • All commands auto-wait for completion — no polling needed
  • When editing images, default to -m nano_banana -s nano-banana-pro
  • When generating videos from text, default to -m wan
  • When animating images to video, default to -m kling -s kling-v2-master
  • When generating 4K images, use -m nano_banana -s nano-banana-2 --resolution 4K
  • For fast image generation, use -m zyka_helion or -m flux_1_schnell
  • For transparent backgrounds, use -m gpt_image_1 --background transparent
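
The default model choices in the guidelines above can be kept in one place so that scripts do not hard-code them repeatedly. This is a minimal sketch; default_flags and its task labels (edit-image, text-to-video, and so on) are hypothetical names chosen for illustration, while the flags themselves are copied from the guidelines.

```shell
# default_flags TASK
# Prints the default model flags from the guidelines for a given task.
# The task labels are hypothetical names chosen for this sketch.
default_flags() {
  case "$1" in
    edit-image)    printf '%s\n' "-m nano_banana -s nano-banana-pro" ;;
    text-to-video) printf '%s\n' "-m wan" ;;
    animate-image) printf '%s\n' "-m kling -s kling-v2-master" ;;
    image-4k)      printf '%s\n' "-m nano_banana -s nano-banana-2 --resolution 4K" ;;
    fast-image)    printf '%s\n' "-m zyka_helion" ;;
    transparent)   printf '%s\n' "-m gpt_image_1 --background transparent" ;;
    *) echo "unknown task: $1" >&2; return 1 ;;
  esac
}
```

Example (the substitution is left unquoted on purpose so the flags split into separate arguments): zyka generate image $(default_flags edit-image) -p "make the hair straight" --image ./me.png -o ./result.png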

CLI Options Reference

| Flag | Description |
|---|---|
| -m, --model | Model name (required for video/image) |
| -p, --prompt | Text prompt (required) |
| -s, --sub-model | Model variant |
| -d, --duration | Video duration in seconds |
| -a, --aspect-ratio | Aspect ratio (16:9, 9:16, 1:1) |
| --size | Output size (e.g. 1024x1024, 720p) |
| --image | Input image path (for editing/animating) |
| --audio | Input audio path (for talking heads) |
| --video | Input video path (for V2V) |
| -o, --output | Save result to this file path |
| --no-wait | Don't wait for completion |
| --negative-prompt | What to avoid in generation |
| --mode | Kling mode: std or pro |
| --resolution | Resolution: 480p, 720p, 1080p, 1K, 2K, 4K |
| --first-frame | First frame image (Kling, Veo 3.1, Bytedance) |
| --last-frame | Last frame image (Kling, Veo 3.1, Bytedance) |
| --quality | Quality: standard, hd, auto, low, medium, high |
| --background | GPT Image background: transparent, opaque, auto |
| --style | DALL-E 3 style: vivid, natural |
| -n, --count | Number of images to generate |
| --title | Title for the generation job |
| --audio-guidance-scale | Aurora lip sync guidance (0-10) |
| --emotion | MiniMax TTS emotion (neutral/happy/sad/angry/fearful/disgusted/surprised) |
| --channel | MiniMax TTS audio channel (1=mono, 2=stereo) |

Version history

1 version in total

  • v0.4.12 (current)
    2026-05-07 20:29, scan status: safe

Security checks

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
