Handle image generation, image editing, and short video generation through one workflow: choose the right modality, pass caller intent through to the provider, save outputs under tmp/images/ or tmp/videos/, and prefer the bundled helpers over ad-hoc one-off API calls.
- scripts/generate_image.py for still-image generation.
- scripts/edit_image.py for direct image edits.
- scripts/mask_inpaint.py for localized edits with masks or generated regions.
- scripts/outpaint_image.py for canvas expansion / outpainting.
- scripts/reference_media.py when reference images need to be passed through.
- scripts/generate_video.py for video generation, especially when the provider may return async job payloads.
- scripts/generate_batch_media.py for repeatable batch jobs, templated variations, or auditable manifests.
- scripts/object_select_edit.py for simple object-vs-background edits on transparent assets or clean backdrops.
- When the provider returns a media reference such as a data: URL or b64_json, use scripts/fetch_generated_media.py.
- Image outputs: tmp/images/
- Video outputs: tmp/videos/

Default to prompt pass-through.
Use the scripts mainly as functional helpers:
- save outputs under tmp/images/ or tmp/videos/
- images go to tmp/images/.
- videos go to tmp/videos/.

Use the smallest helper that matches the request:
- scripts/generate_image.py → direct still-image generation
- scripts/edit_image.py → direct full-image edits
- scripts/mask_inpaint.py → localized edits with an explicit or generated mask
- scripts/outpaint_image.py → canvas expansion before an edit call
- scripts/reference_media.py → reference-image transport and delegation
- scripts/generate_consistent_media.py → backward-compatible wrapper only
- scripts/generate_batch_media.py → repeatable manifest-driven batches
- scripts/object_select_edit.py → simple object-vs-background edits on transparent or clean-backdrop assets
- scripts/generate_video.py → direct video generation and async polling
- scripts/fetch_generated_media.py → normalize returned media refs into local files

Use references/model-capabilities.md when deciding which helper fits the modality, transport, or return shape.
Use references/reference-image-workflow.md for reference-image transport details.
Use references/batch-workflows.md for manifest structure and batch execution behavior.
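The helper mapping above can be sketched as a plain lookup table. This is only an illustration: the task labels on the left and the `pick_helper` function are hypothetical names, not part of the skill. Only the script paths come from the list above.

```python
# Hypothetical task labels mapped to the bundled helper scripts.
# The labels are illustrative; the paths are the ones listed in this document.
HELPERS = {
    "generate_image": "scripts/generate_image.py",
    "edit_image": "scripts/edit_image.py",
    "inpaint": "scripts/mask_inpaint.py",
    "outpaint": "scripts/outpaint_image.py",
    "reference": "scripts/reference_media.py",
    "batch": "scripts/generate_batch_media.py",
    "object_edit": "scripts/object_select_edit.py",
    "generate_video": "scripts/generate_video.py",
    "fetch": "scripts/fetch_generated_media.py",
}


def pick_helper(task: str) -> str:
    """Return the smallest helper registered for a task label."""
    try:
        return HELPERS[task]
    except KeyError:
        known = ", ".join(sorted(HELPERS))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None


print(pick_helper("inpaint"))
```

scripts/generate_consistent_media.py is left out of the table on purpose, since it is described as a backward-compatible wrapper only.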
Minimal examples:
python3 skills/media-generation/scripts/generate_image.py \
--prompt 'person' \
--size '1024x1024' \
--out-dir 'tmp/images' \
--prefix 'generated'

python3 skills/media-generation/scripts/edit_image.py \
--image 'tmp/images/source.jpg' \
--prompt 'replace the background' \
--out-dir 'tmp/images' \
--prefix 'edited'

python3 skills/media-generation/scripts/mask_inpaint.py \
--image 'tmp/images/source.jpg' \
--x 120 --y 80 --width 220 --height 180 \
--prompt 'replace the masked area' \
--out-dir 'tmp/images' \
--prefix 'mask-result'

python3 skills/media-generation/scripts/outpaint_image.py \
--image 'tmp/images/source.jpg' \
--left 512 --right 512 --top 128 --bottom 128 \
--mode blur \
--prompt 'extend outward' \
--out-dir 'tmp/images' \
--prefix 'outpaint-result'

python3 skills/media-generation/scripts/reference_media.py \
--mode image \
--reference-image 'tmp/images/reference.png' \
--prompt 'character' \
--size '1024x1024' \
--out-dir 'tmp/images' \
--prefix 'reference-output'

python3 skills/media-generation/scripts/generate_batch_media.py \
--manifest 'tmp/images/media-batch.jsonl' \
--vars-json '{"subject":"item"}' \
--summary-out 'tmp/images/media-batch-summary.json' \
--continue-on-error \
--print-json

python3 skills/media-generation/scripts/object_select_edit.py \
--image 'tmp/images/product.png' \
--selection-mode alpha \
--edit-target background \
--prompt 'replace the background' \
--out-dir 'tmp/images' \
--prefix 'product-bg-edit'

python3 skills/media-generation/scripts/generate_video.py \
--prompt 'motion clip' \
--size '720x1280' \
--seconds 6 \
--out-dir 'tmp/videos' \
--prefix 'generated-video'
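When the helpers are driven from Python rather than a shell, the same commands can be assembled as argument lists. The sketch below is a minimal illustration: `build_command` is a hypothetical wrapper, not something the skill ships, and it only uses flags that appear in the examples above.

```python
import shlex

SCRIPT_DIR = "skills/media-generation/scripts"


def build_command(script: str, **flags) -> list[str]:
    """Build an argv list; keyword out_dir='x' becomes '--out-dir x'.

    A value of True emits a bare switch; None or False skips the flag.
    """
    argv = ["python3", f"{SCRIPT_DIR}/{script}"]
    for name, value in flags.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            argv.append(flag)
        elif value is not None and value is not False:
            argv.extend([flag, str(value)])
    return argv


cmd = build_command(
    "generate_image.py",
    prompt="person",
    size="1024x1024",
    out_dir="tmp/images",
    prefix="generated",
)
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems when a prompt contains quotes or spaces.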
Before blaming the skill, check these first:
- the provider entry under config.models.providers exists
- that provider has baseUrl and apiKey set
- any fields passed via --extra-json or --extra-json-file match that provider's schema

Defaults used by the bundled scripts:
- config file: ~/.openclaw/openclaw.json or $OPENCLAW_CONFIG
- provider: $OPENCLAW_MEDIA_PROVIDER, otherwise the first provider found in config
- model: --model
- image model: $OPENCLAW_MEDIA_IMAGE_MODEL or image-model
- edit model: $OPENCLAW_MEDIA_EDIT_MODEL or image-edit-model
- video model: $OPENCLAW_MEDIA_VIDEO_MODEL or video-model
- output root: tmp/ or $MEDIA_GENERATION_OUTPUT_ROOT
- output directory: --out-dir

Common failure patterns:
- provider not found → pass --provider explicitly or set $OPENCLAW_MEDIA_PROVIDER
- no default model (image-model / image-edit-model / video-model) → pass --model explicitly or set the matching $OPENCLAW_MEDIA_*_MODEL env var
- config not found / invalid JSON → pass --config explicitly or fix the OpenClaw config file
- check --endpoint and video polling paths
- check --extra-json / --extra-json-file
- check apiKey

Use --print-json when debugging so the response body, resolved endpoint, and failure hints stay visible.
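The pre-flight checks and provider defaults described above could be approximated in a few lines of Python. This is a sketch under stated assumptions, not the bundled scripts' actual logic: it assumes `$OPENCLAW_CONFIG` takes precedence over the default path, and that `models.providers` in the config is a JSON object keyed by provider name. The function names are hypothetical.

```python
import json
import os
from pathlib import Path


def load_config(path=None) -> dict:
    # Assumption: an explicit path wins, then $OPENCLAW_CONFIG, then the default file.
    default = Path.home() / ".openclaw" / "openclaw.json"
    config_path = Path(path or os.environ.get("OPENCLAW_CONFIG") or default)
    with config_path.open(encoding="utf-8") as fh:
        return json.load(fh)


def resolve_provider(config: dict, name=None):
    """Pick the provider: explicit name, then env var, then first in config."""
    # Assumption: providers is a mapping of provider name -> settings.
    providers = config.get("models", {}).get("providers", {})
    name = name or os.environ.get("OPENCLAW_MEDIA_PROVIDER")
    if name is None:
        if not providers:
            raise LookupError("no providers configured")
        name = next(iter(providers))
    if name not in providers:
        raise LookupError(f"provider {name!r} not found in config")
    return name, providers[name]


def preflight(config: dict, name=None) -> list[str]:
    """Return a list of human-readable problems; empty means the basics look fine."""
    try:
        name, entry = resolve_provider(config, name)
    except LookupError as exc:
        return [str(exc)]
    return [
        f"provider {name!r} is missing {key}"
        for key in ("baseUrl", "apiKey")
        if not entry.get(key)
    ]
```

Calling `preflight(load_config())` before a generation run surfaces a missing provider or credential earlier than a failed API call would.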
Read these selectively:
- references/model-capabilities.md
- references/reference-image-workflow.md
- references/batch-workflows.md

Primary helpers:
- scripts/generate_image.py
- scripts/edit_image.py
- scripts/mask_inpaint.py
- scripts/outpaint_image.py
- scripts/reference_media.py
- scripts/generate_consistent_media.py
- scripts/generate_video.py
- scripts/generate_batch_media.py
- scripts/object_select_edit.py
- scripts/prepare_object_mask.py
- scripts/media_request_common.py
- scripts/smoke_test.py
- scripts/fetch_generated_media.py
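For readers curious what normalizing a returned media reference into a local file can involve, here is a minimal standalone sketch that decodes a data: URL or a raw b64_json string and writes it to disk. It is an illustration only and does not claim to mirror scripts/fetch_generated_media.py; the function names are hypothetical.

```python
import base64
from pathlib import Path
from urllib.parse import unquote_to_bytes


def decode_data_url(data_url: str) -> bytes:
    """Decode a data: URL (base64 or percent-encoded) into raw bytes."""
    header, sep, payload = data_url.partition(",")
    if not header.startswith("data:") or not sep:
        raise ValueError("not a data: URL")
    if header.endswith(";base64"):
        return base64.b64decode(payload)
    return unquote_to_bytes(payload)


def save_media(ref: str, dest: Path) -> Path:
    """Write a data: URL or a bare base64 string (b64_json style) to dest."""
    data = decode_data_url(ref) if ref.startswith("data:") else base64.b64decode(ref)
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(data)
    return dest
```

A call such as `save_media(b64_string, Path("tmp/images/generated.png"))` keeps the output under the directories this document uses.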