
Nano Banana

Generate or edit images using Google's "Nano Banana" image models (Gemini 2.5 / 3.x image previews). Use this when the user explicitly asks for Gemini or Nano Banana.
dyagil
Uncategorized · clawhub · v1.0.0 · 1 version · 99497.5 · Key: required

Overview

Nano Banana — Gemini Image Generation

A CLI skill for generating and editing images using Google's Gemini image models, marketed as "Nano Banana" and "Nano Banana Pro".

When to Use

Use this skill when the user explicitly asks for:

  • "Nano Banana" / "Nano Banana Pro"
  • "Generate the image with Gemini"
  • Image edits / composition using one or more reference images
  • Iterative edits in conversation ("now make her smile", "swap the background")

For generic "make me an image" requests with no provider hint, prefer your platform's built-in image generation unless the user has set Nano Banana as the default.

Model Map (Nano Banana names → Gemini model IDs)

| Nickname | Model ID | Notes |
| --- | --- | --- |
| Nano Banana Pro (default) | gemini-3-pro-image-preview | Highest quality, slower, multi-image edits |
| Nano Banana 3.1 Flash | gemini-3.1-flash-image-preview | Faster, newer, good for iteration |
| Nano Banana (classic) | gemini-2.5-flash-image | Original, cheapest, still solid |

If unspecified, default to Pro.
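
The mapping above can be written as a small lookup. This is a minimal sketch, not the CLI's actual source: the helper name `resolveModel` is hypothetical, and the short keys are assumed to match the values accepted by the `--model` flag.

```javascript
// Hypothetical lookup from short model names to Gemini model IDs.
// The IDs come from the Model Map; the helper itself is illustrative only.
const MODEL_IDS = new Map([
  ["pro", "gemini-3-pro-image-preview"],
  ["flash", "gemini-3.1-flash-image-preview"],
  ["classic", "gemini-2.5-flash-image"],
]);

function resolveModel(name) {
  // Fall back to Pro when no model is specified.
  if (name === undefined || name === null || name === "") {
    return MODEL_IDS.get("pro");
  }
  const id = MODEL_IDS.get(String(name).toLowerCase());
  if (!id) throw new Error(`Unknown model nickname: ${name}`);
  return id;
}

console.log(resolveModel());        // gemini-3-pro-image-preview
console.log(resolveModel("flash")); // gemini-3.1-flash-image-preview
```

A `Map` is used rather than a plain object so that inherited property names such as `constructor` are rejected instead of resolving to something unexpected.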

Auth

Store the Gemini API key at ~/.openclaw/credentials/google/gemini_api_key (chmod 600).

Load it before invoking the CLI:

export GEMINI_API_KEY=$(cat ~/.openclaw/credentials/google/gemini_api_key)

⚠️ Never log or echo the key. Never paste it into chat. If rotated, replace the file contents.

Get a key at: https://aistudio.google.com/apikey
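
For a Node.js caller, the same key-loading step might look like the sketch below. The helper name `loadApiKey`, its parameters, and the permission warning are assumptions for illustration; nothing here is confirmed to be how the CLI itself reads the key.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Default location taken from the Auth section.
const DEFAULT_KEY_FILE = path.join(
  os.homedir(), ".openclaw", "credentials", "google", "gemini_api_key"
);

// Hypothetical helper: prefer GEMINI_API_KEY, otherwise read the credentials file.
function loadApiKey(keyFile = DEFAULT_KEY_FILE, env = process.env) {
  if (env.GEMINI_API_KEY) return env.GEMINI_API_KEY.trim();

  // Warn (without printing the key) if group or others can access the file.
  const mode = fs.statSync(keyFile).mode & 0o777;
  if (mode & 0o077) {
    console.warn(`Warning: ${keyFile} is accessible to other users; run chmod 600.`);
  }

  const key = fs.readFileSync(keyFile, "utf8").trim();
  if (!key) throw new Error(`API key file is empty: ${keyFile}`);
  return key;
}
```

The returned value should only be passed to the request code; it should never be written to logs or chat output.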

CLI

Binary: ~/bin/nano-banana (Node.js), a symlink to the source at <your-tools-dir>/nano-banana.js.

Generate from prompt

~/bin/nano-banana "a person eating a red apple, photorealistic, warm light"
# Writes: ~/.../nano-banana-output/YYYY-MM-DD_HHMMSS.png
# Prints the absolute path on success.
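
The `YYYY-MM-DD_HHMMSS.png` naming pattern could be produced along these lines. This is only a sketch: the helper name `outputFileName` is hypothetical, and the use of local time rather than UTC is an assumption.

```javascript
// Hypothetical helper that formats a date as YYYY-MM-DD_HHMMSS.png (local time assumed).
function outputFileName(date = new Date()) {
  const pad = (n) => String(n).padStart(2, "0");
  const day = [date.getFullYear(), pad(date.getMonth() + 1), pad(date.getDate())].join("-");
  const time = [date.getHours(), date.getMinutes(), date.getSeconds()].map(pad).join("");
  return `${day}_${time}.png`;
}

// Months are zero-based in the Date constructor, so 4 means May.
console.log(outputFileName(new Date(2026, 4, 11, 14, 18, 15))); // 2026-05-11_141815.png
```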

Choose model / aspect / count

~/bin/nano-banana --model pro "logo for a modern insurance agency, olive green"
~/bin/nano-banana --model flash --aspect 16:9 "morning marathon in a coastal city"
~/bin/nano-banana --model classic --count 4 "red pepper on a wooden table"
~/bin/nano-banana --aspect 1:1 --out /tmp/x.png "..."

Flags:

  • --model pro|flash|classic| (default: pro)
  • --aspect 1:1|4:3|3:4|16:9|9:16|3:2|2:3 (best-effort prompt hint)
  • --count N (1–4, default 1)
  • --out <path> (only valid with --count 1)
  • --out-dir <dir> (default: <your-tools-dir>/nano-banana-output/)
  • --quiet (print only resulting paths, one per line)
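
The constraints in the flag list can be checked before a request is sent. The validator below is a sketch under stated assumptions: the function name `validateOptions` and the shape of the options object are hypothetical, and the rules are taken only from the list above.

```javascript
// Aspect ratios listed for --aspect.
const ASPECTS = new Set(["1:1", "4:3", "3:4", "16:9", "9:16", "3:2", "2:3"]);

// Hypothetical validator for already-parsed options; returns a list of problems.
function validateOptions({ aspect, count = 1, out } = {}) {
  const problems = [];
  if (aspect !== undefined && !ASPECTS.has(aspect)) {
    problems.push(`unsupported --aspect: ${aspect}`);
  }
  if (!Number.isInteger(count) || count < 1 || count > 4) {
    problems.push("--count must be an integer from 1 to 4");
  }
  if (out !== undefined && count !== 1) {
    problems.push("--out is only valid with --count 1");
  }
  return problems;
}

console.log(validateOptions({ aspect: "16:9", count: 1, out: "/tmp/x.png" })); // []
console.log(validateOptions({ count: 4, out: "/tmp/x.png" }));
// [ '--out is only valid with --count 1' ]
```

Returning all problems at once, rather than throwing on the first, lets a caller report every invalid flag in a single message.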

Edit with reference images

~/bin/nano-banana --ref photo.jpg "put a red baseball cap on the person"
~/bin/nano-banana --ref a.png --ref b.png "merge the two images, library background"

How Your Agent Should Use This

  1. Read the request and pick a model. "Pro" → pro. "Fast" / "Flash" → flash. Otherwise → pro.
  2. Write the prompt in English. Gemini image models work best in English even when the chat is in another language. Translate any non-English description into a clear, descriptive English prompt.
  3. Run the CLI via exec. Capture the output path.
  4. Deliver the image by adding MEDIA:<path> on its own line in the reply (or use your platform's attachment convention).
  5. Reply with a short caption + which model was used.
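
Steps 1, 3, 4, and 5 above can be sketched in Node.js as follows. All helper names (`pickModel`, `generate`, `buildReply`) are hypothetical, and the keyword matching is one possible reading of the model-selection rule. Step 2, translating the prompt into English, is left to the agent.

```javascript
const { execFileSync } = require("node:child_process");
const os = require("node:os");
const path = require("node:path");

// Step 1: map wording in the request to a --model value.
function pickModel(request) {
  if (/\bpro\b/i.test(request)) return "pro";
  if (/\b(fast|flash)\b/i.test(request)) return "flash";
  return "pro"; // default when nothing is specified
}

// Step 3: run the CLI; --quiet prints only the resulting paths, one per line.
function generate(prompt, model) {
  const bin = path.join(os.homedir(), "bin", "nano-banana");
  const stdout = execFileSync(bin, ["--quiet", "--model", model, prompt], {
    encoding: "utf8",
  });
  return stdout.trim().split("\n");
}

// Steps 4 and 5: short caption naming the model, then MEDIA: on its own line.
const LABELS = {
  pro: "Nano Banana Pro",
  flash: "Nano Banana 3.1 Flash",
  classic: "Nano Banana (classic)",
};

function buildReply(caption, model, imagePath) {
  return `${caption} (${LABELS[model]})\nMEDIA:${imagePath}`;
}
```

`execFileSync` passes the prompt as a single argument without going through a shell, so quotes or special characters in the prompt need no escaping.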

Example agent flow

User: "Draw a person eating a red apple."

~/bin/nano-banana --aspect 4:3 \
  "A realistic portrait of a person taking a bite from a bright red apple, natural daylight, soft shadows, sharp focus, casual modern clothing, slight motion blur on the hand, warm color grading"
# → /home/<user>/.../nano-banana-output/2026-05-11_141815.png

Reply:

> Here's the apple shot — Nano Banana Pro

> MEDIA:/home/<user>/.../nano-banana-output/2026-05-11_141815.png

API Contract (what the CLI does)

Under the hood, the CLI POSTs to:

POST https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent?key={KEY}

Body:

{
  "contents": [{"role": "user", "parts": [{"text": "<prompt>"}]}],
  "generationConfig": {
    "responseModalities": ["IMAGE", "TEXT"],
    "candidateCount": <N>
  }
}

Response candidates contain parts[].inlineData.{mimeType, data} (base64). The CLI decodes and writes to disk.

For reference images, additional parts with inlineData are prepended before the text prompt.
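
The request and response handling described above might be sketched like this. The function names are hypothetical and this is not the CLI's actual source. It assumes Node.js 18+ for the global `fetch`, and it reads parts from `candidates[].content.parts`, which is where the Gemini REST response nests them.

```javascript
// Build the generateContent body: reference images first, then the text prompt.
// Each ref is { mimeType, data }, where data holds the raw image bytes.
function buildRequestBody(prompt, refs = [], count = 1) {
  const imageParts = refs.map(({ mimeType, data }) => ({
    inlineData: { mimeType, data: Buffer.from(data).toString("base64") },
  }));
  return {
    contents: [{ role: "user", parts: [...imageParts, { text: prompt }] }],
    generationConfig: {
      responseModalities: ["IMAGE", "TEXT"],
      candidateCount: count,
    },
  };
}

// POST the body to the endpoint shown above (defined here, not invoked).
async function generateContent(model, apiKey, body) {
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    `${model}:generateContent?key=${encodeURIComponent(apiKey)}`;
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  return res.json();
}

// Collect decoded image buffers from every candidate in a parsed response.
function extractImages(response) {
  const images = [];
  for (const candidate of response.candidates ?? []) {
    for (const part of candidate.content?.parts ?? []) {
      if (part.inlineData?.data) {
        images.push({
          mimeType: part.inlineData.mimeType,
          bytes: Buffer.from(part.inlineData.data, "base64"),
        });
      }
    }
  }
  return images;
}
```

Each entry returned by `extractImages` can be written to disk with `fs.writeFileSync(outputPath, image.bytes)`.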

Errors & Gotchas

  • PERMISSION_DENIED / API_KEY_INVALID → key rotated. Update the credentials file.
  • RESOURCE_EXHAUSTED → free-tier quota hit. Wait, or switch to flash / classic.
  • SAFETY block → response has promptFeedback.blockReason and no image. Rewrite the prompt and retry.
  • No image in response → model returned only text. Usually means the prompt was ambiguous or refused. Add "generate an image of..." explicitly.
  • Aspect ratio is best-effort — Gemini doesn't expose a strict aspectRatio field on image models, so the CLI appends a textual hint.
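
The cases above can be folded into one classifier over the parsed JSON response. This is a sketch under assumptions: the function name `diagnose` and the returned action strings are hypothetical, and errors are assumed to arrive in the standard Google API envelope (`error.status`, with optional `error.details[].reason`).

```javascript
// Hypothetical classifier: maps a parsed API response to a suggested action.
function diagnose(response) {
  const err = response.error;
  if (err) {
    // Gather the status plus any detail reasons (e.g. API_KEY_INVALID).
    const codes = [err.status, ...(err.details ?? []).map((d) => d.reason)];
    if (codes.includes("PERMISSION_DENIED") || codes.includes("API_KEY_INVALID")) {
      return "update-key";
    }
    if (codes.includes("RESOURCE_EXHAUSTED")) return "wait-or-switch-model";
    return "unknown-error";
  }
  // Safety block: blockReason is set and no image is returned.
  if (response.promptFeedback?.blockReason) return "rewrite-prompt";

  // Text-only answer: no part carries inline image data.
  const parts = (response.candidates ?? []).flatMap((c) => c.content?.parts ?? []);
  if (!parts.some((p) => p.inlineData?.data)) return "ask-for-image-explicitly";
  return "ok";
}

console.log(diagnose({ error: { status: "RESOURCE_EXHAUSTED" } })); // wait-or-switch-model
console.log(diagnose({ promptFeedback: { blockReason: "SAFETY" } })); // rewrite-prompt
```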

File Layout

<your-skills-dir>/nano-banana/
  SKILL.md                              ← this file
<your-tools-dir>/
  nano-banana.js                        ← the CLI implementation
  nano-banana-output/                   ← generated images
~/bin/nano-banana                       ← symlink to nano-banana.js
~/.openclaw/credentials/google/
  gemini_api_key                        ← the secret (chmod 600)

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-21 15:47 Safe Safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
