Overview

Corespeed NanoBanana — Gemini Image & Text Generation

Auth: Set CS_AI_GATEWAY_BASE_URL and CS_AI_GATEWAY_API_TOKEN environment variables.
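
Before invoking the script, it can help to confirm that both variables are actually set. The following is a minimal sketch of such a pre-flight check; the helper name `missing_auth_vars` is hypothetical and not part of the skill, and how the script itself reports missing credentials is not documented here.

```python
import os
from typing import Mapping, Optional

# The two variable names come from the Auth line above.
REQUIRED_VARS = ("CS_AI_GATEWAY_BASE_URL", "CS_AI_GATEWAY_API_TOKEN")


def missing_auth_vars(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_auth_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```

Accepting a mapping as an argument keeps the check easy to exercise without touching the real environment.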

Workflow

Pick a model from the table below (default: gemini-2.5-flash-image for image generation)
Run the script with your prompt

Usage

uv run {baseDir}/scripts/gemini.py --prompt "your prompt" -f output.ext [-i input.ext] [--model MODEL]

--prompt, -p — Text prompt (required)
--filename, -f — Output filename (required)
--input, -i — Input image file(s), repeat for multiple
--model, -m — Model name (default: gemini-2.5-flash-image)
--modalities — Response type: auto, image, text, image+text (default: auto)
--json — Output structured JSON (recommended for agent consumption)
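
When the script is driven from another program rather than typed by hand, the flags above can be assembled into an argument list. This is a rough sketch only: `build_command` and its parameter names are hypothetical, `base_dir` stands in for the `{baseDir}` placeholder, and only the flags listed above are used.

```python
from pathlib import Path
from typing import Optional, Sequence


def build_command(
    base_dir: str,
    prompt: str,
    filename: str,
    inputs: Sequence[str] = (),
    model: Optional[str] = None,
    modalities: Optional[str] = None,
    json_output: bool = False,
) -> list[str]:
    """Assemble the argv list for gemini.py from the documented flags."""
    script = Path(base_dir) / "scripts" / "gemini.py"
    cmd = ["uv", "run", str(script), "--prompt", prompt, "--filename", filename]
    # --input is repeated once per input image.
    for image in inputs:
        cmd += ["--input", image]
    # Optional flags are left out so the script's own defaults apply.
    if model is not None:
        cmd += ["--model", model]
    if modalities is not None:
        cmd += ["--modalities", modalities]
    if json_output:
        cmd.append("--json")
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a shell string avoids quoting problems in prompts that contain spaces or quotes.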

Output format is determined by file extension: .png/.jpg → image generation, .txt/.md → text output.
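
The extension rule can be pictured as a small lookup. The sketch below is illustrative and is not the script's actual implementation: `output_kind` is a hypothetical name, and both the case-insensitive comparison and the error for unlisted extensions are assumptions.

```python
from pathlib import Path

# Extensions taken from the rule stated above.
IMAGE_EXTENSIONS = {".png", ".jpg"}
TEXT_EXTENSIONS = {".txt", ".md"}


def output_kind(filename: str) -> str:
    """Return "image" or "text" based on the output file extension."""
    suffix = Path(filename).suffix.lower()  # assumption: case is ignored
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix in TEXT_EXTENSIONS:
        return "text"
    # Assumption: anything else is rejected; the real script may behave differently.
    raise ValueError(f"unrecognised output extension: {suffix!r}")
```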

Image Generation

# Text-to-image
uv run {baseDir}/scripts/gemini.py -p "a watercolor fox in autumn forest" -f fox.png

# Image editing
uv run {baseDir}/scripts/gemini.py -p "Remove background, add beach sunset" -f edited.png -i photo.jpg

# Multi-image compositing
uv run {baseDir}/scripts/gemini.py -p "Blend these two scenes together" -f blend.png -i scene1.png -i scene2.png

Image Analysis

# Describe an image
uv run {baseDir}/scripts/gemini.py -p "Describe this image" -f desc.txt -i photo.jpg --model gemini-2.5-flash

# Compare images
uv run {baseDir}/scripts/gemini.py -p "What are the differences?" -f diff.txt -i before.jpg -i after.jpg --model gemini-2.5-flash

Text Generation

# Use the most capable model for complex tasks
uv run {baseDir}/scripts/gemini.py -p "Write a haiku about coding" -f haiku.txt --model gemini-2.5-pro

Models

| Model | Type | Best For |
|-------|------|----------|
| gemini-2.5-flash-image | Image + Text | Image generation & editing (default) |
| gemini-2.5-flash | Text | Fast analysis, vision, general tasks |
| gemini-2.5-pro | Text | Complex reasoning, highest quality |
| gemini-2.5-flash-lite | Text | Fastest, simple tasks |
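
A caller can encode the table above to catch a mismatched request before it is sent. This is a sketch under the assumption that the four listed models are the only ones of interest; `can_generate_images` and `check_request` are hypothetical helpers, and the gateway may accept model names that are not listed here.

```python
# Model types copied from the table above.
MODEL_TYPES = {
    "gemini-2.5-flash-image": "image+text",
    "gemini-2.5-flash": "text",
    "gemini-2.5-pro": "text",
    "gemini-2.5-flash-lite": "text",
}
DEFAULT_MODEL = "gemini-2.5-flash-image"


def can_generate_images(model: str) -> bool:
    """True only for models the table lists as producing images."""
    return MODEL_TYPES.get(model) == "image+text"


def check_request(model: str, wants_image: bool) -> None:
    """Raise ValueError when an image is requested from a text-only model."""
    if wants_image and not can_generate_images(model):
        raise ValueError(
            f"{model} is not listed as an image model; use {DEFAULT_MODEL} instead"
        )
```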

Notes

No manual Python setup required. The script uses PEP 723 inline metadata. uv run automatically creates an isolated virtual environment and installs the google-genai dependency on first run.
Image output is returned inline as base64 from the Gemini API — no separate download step.
Use timestamps in filenames: yyyy-mm-dd-hh-mm-ss-name.ext.
The script prints a MEDIA: line so that OpenClaw can auto-attach generated images.
Do not read generated media back; report the saved path only.
Only gemini-2.5-flash-image can generate images. Other models are text-only.
Use --json for structured output: {"ok": true, "files": [...], "text": "...", "model": "...", "tokens": {...}}
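
Two of the notes above, the timestamped filename convention and the --json result shape, lend themselves to a short sketch. The helper names are hypothetical. It is also an assumption that `files` holds path strings and that, in --json mode, the text passed to `saved_files` is the JSON object alone.

```python
import json
from datetime import datetime
from typing import Optional


def timestamped_name(name: str, ext: str, now: Optional[datetime] = None) -> str:
    """Return yyyy-mm-dd-hh-mm-ss-name.ext for the given (or current) local time."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d-%H-%M-%S")
    return f"{stamp}-{name}.{ext.lstrip('.')}"


def saved_files(json_text: str) -> list[str]:
    """Parse a --json result and return the saved paths; raise if "ok" is not true."""
    result = json.loads(json_text)
    if not result.get("ok"):
        raise RuntimeError(f"generation failed: {result}")
    # Assumption: each entry in "files" is a path string.
    return list(result.get("files", []))
```

Returning only the paths fits the note about reporting the saved path rather than reading the generated media back.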

Support

Built by Corespeed. If you need help or run into issues:

💬 Discord: discord.gg/mAfhakVRnJ
🐦 X/Twitter: @CoreSpeed_io
🐙 GitHub: github.com/corespeed-io/skills

Version History

1 version in total

v0.0.2 (current)

2026-03-30 22:13, marked safe

Security Check

Tencent Cloud Security (Keen)

Safe, no risk


Tencent Cloud Security (Sanbu)