
GPT Image 2 API

Generate and edit images via OpenAI gpt-image-2 model. Agent-agnostic CLI — works with any AI agent (Hermes, Claude Code, Codex, OpenClaw, etc.). Supports co...
jancong
Uncategorized · clawhub · v2.0.0 · 1 version · 100000 · Key: required

Overview

gpt-image-2

Generate and edit images via OpenAI's gpt-image-2 model. Agent-agnostic — designed to work with any AI agent or standalone from the command line.

Quick Start

# 1. Initialize config (one-time)
python3 gpt_image2.py config --init

# 2. Edit the config to set your API key
#    ~/.config/gpt-image-2/config.json

# 3. Generate
python3 gpt_image2.py generate "A cute cat on a windowsill" -o ~/cat.png --quality low

# 4. Edit
python3 gpt_image2.py edit input.png "Change the sofa to green" -o ~/output.png

Configuration

Config priority: --config flag > --base-url/--api-key flags > config file > environment variables > defaults.
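
One way to read this priority order is as a layered lookup where the first layer that has a value wins. The sketch below is illustrative only and is not taken from the script: `resolve_setting` and `resolve_base_url` are hypothetical names, and it assumes `--config` merely selects which file is read.

```python
DEFAULT_BASE_URL = "https://api.openai.com/v1"


def resolve_setting(flag_value, file_value, env_value, default=None):
    """Return the first value that is set, checking layers from highest to lowest priority."""
    for candidate in (flag_value, file_value, env_value, default):
        if candidate:  # None and "" both count as "not set"
            return candidate
    return None


def resolve_base_url(args, file_config, environ):
    """Hypothetical helper: args and file_config are plain dicts, environ is os.environ-like."""
    return resolve_setting(
        args.get("base_url"),                 # --base-url flag
        file_config.get("base_url"),          # config file field
        environ.get("GPT_IMAGE2_BASE_URL"),   # environment fallback
        DEFAULT_BASE_URL,                     # built-in default
    )
```

With empty inputs the function falls through to the default URL; a value in any higher layer shadows everything below it.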

Config File Locations (in priority order)

| Priority | Path | Notes |
|----------|------|-------|
| 1 | $XDG_CONFIG_HOME/gpt-image-2/config.json | XDG standard (recommended) |
| 2 | ~/.config/gpt-image-2/config.json | Default XDG fallback |
| 3 | ~/.gpt-image-2-config.json | Single-file fallback |
| 4 | ~/.hermes/gpt-image-2-config.json | Legacy Hermes compat |

Use python3 gpt_image2.py config --show to see which config is active.
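
The lookup order above can be sketched as a first-match search over the four documented paths. This is an assumption about how such a search might look, not the script's own code; `candidate_config_paths` and `find_active_config` are hypothetical names.

```python
import os
from pathlib import Path


def candidate_config_paths(environ, home):
    """List the documented config locations, highest priority first."""
    home = Path(home)
    paths = []
    xdg = environ.get("XDG_CONFIG_HOME")
    if xdg:  # priority 1 only applies when the variable is set
        paths.append(Path(xdg) / "gpt-image-2" / "config.json")
    paths.append(home / ".config" / "gpt-image-2" / "config.json")
    paths.append(home / ".gpt-image-2-config.json")
    paths.append(home / ".hermes" / "gpt-image-2-config.json")
    return paths


def find_active_config(environ=None, home=None):
    """Return the first config file that exists, or None if none is present."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else home
    for path in candidate_config_paths(environ, home):
        if path.is_file():
            return path
    return None
```

Passing `environ` and `home` explicitly keeps the search easy to test against a temporary directory.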

Config File Format

{
  "base_url": "https://api.openai.com/v1",
  "api_key_env": "OPENAI_API_KEY"
}

| Field | Type | Description |
|-------|------|-------------|
| base_url | string | API base URL. Default: https://api.openai.com/v1 |
| api_key | string | Plaintext API key (not recommended: visible in file) |
| api_key_env | string | Environment variable name holding the key (recommended) |
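
To illustrate how `api_key_env` indirection works, here is a minimal sketch that reads the key from the named environment variable and otherwise falls back to a plaintext `api_key`. The precedence between the two fields is an assumption, and `load_api_key` is a hypothetical name, not a function from the script.

```python
import json
import os


def load_api_key(config_text, environ=None):
    """Resolve the API key from config JSON text.

    Assumption: api_key_env is tried before a plaintext api_key.
    """
    environ = os.environ if environ is None else environ
    config = json.loads(config_text)
    env_name = config.get("api_key_env")
    if env_name and environ.get(env_name):
        return environ[env_name]  # key stays out of the config file
    return config.get("api_key")  # may be None when nothing is configured
```

With the example config above and `OPENAI_API_KEY` exported in the shell, the key never has to be written to disk.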

Environment Variables (fallback when no config file)

| Variable | Purpose |
|----------|---------|
| GPT_IMAGE2_API_KEY | API key |
| GPT_IMAGE2_BASE_URL | API base URL |

Config Management Commands

# Create template config
python3 gpt_image2.py config --init

# Show active config (keys are masked)
python3 gpt_image2.py config --show

# Overwrite config
python3 gpt_image2.py config --init --force
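
The exact masking format used by `config --show` is not documented here. As a rough illustration of what masking a key can look like, the sketch below keeps only the last few characters; `mask_key` is a hypothetical helper.

```python
def mask_key(key, visible=4):
    """Hide all but the last few characters of a secret."""
    if not key:
        return "(not set)"
    if len(key) <= visible:
        return "*" * len(key)  # too short to reveal anything safely
    return "*" * (len(key) - visible) + key[-visible:]
```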

CLI Reference

generate — Text-to-Image

python3 gpt_image2.py generate "prompt" [options]

| Option | Default | Description |
|--------|---------|-------------|
| -o, --output | ~/gpt-image2-output.png | Output file path |
| --quality | auto | low (~70s), medium (~120s), high (~276s) |
| --size | auto | 1024x1024, 1536x1024, 1024x1536 |
| --format | png | png, jpeg, webp |
| --n | 1 | Number of images (1-10) |
| --timeout | 600 | curl timeout in seconds |
| --config | auto-detect | Explicit config file path |
| --base-url | from config | Override API base URL |
| --api-key | from config | Override API key (visible in ps!) |
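
An agent that builds the `generate` command programmatically may want to validate options before spawning the process. The following is a sketch under the assumption that the values listed in the table (plus `auto`) are the only accepted ones; `build_generate_command` is a hypothetical helper.

```python
QUALITIES = ("auto", "low", "medium", "high")
SIZES = ("auto", "1024x1024", "1536x1024", "1024x1536")
FORMATS = ("png", "jpeg", "webp")


def build_generate_command(script_path, prompt, output,
                           quality=None, size=None, fmt=None, n=None):
    """Build the argv list for `generate`, adding only the options that were given."""
    if quality is not None and quality not in QUALITIES:
        raise ValueError("unsupported quality: {}".format(quality))
    if size is not None and size not in SIZES:
        raise ValueError("unsupported size: {}".format(size))
    if fmt is not None and fmt not in FORMATS:
        raise ValueError("unsupported format: {}".format(fmt))
    if n is not None and not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")

    cmd = ["python3", script_path, "generate", prompt, "-o", output]
    for flag, value in (("--quality", quality), ("--size", size),
                        ("--format", fmt), ("--n", n)):
        if value is not None:
            cmd += [flag, str(value)]
    return cmd
```

Returning a list rather than a single string lets the caller pass it straight to `subprocess.run` without any shell quoting.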

edit — Image-to-Image

python3 gpt_image2.py edit <image_path> "edit prompt" [options]

| Option | Default | Description |
|--------|---------|-------------|
| --mask | none | PNG mask (transparent=edit area) |
| --moderation | auto | low or auto |

(all generate options also apply)

config — Manage Configuration

python3 gpt_image2.py config [--init] [--show] [--force] [--config PATH]

Script Location

The script is at scripts/gpt_image2.py relative to this skill directory.

To find it programmatically from any agent:

# If installed as a Hermes skill:
SCRIPT="$(dirname "$(readlink -f "$0")")/../skills/creative/gpt-image-2/scripts/gpt_image2.py"

# Or copy/symlink it anywhere — it's self-contained with zero dependencies beyond stdlib + curl
cp scripts/gpt_image2.py /usr/local/bin/gpt-image2

The script has zero pip dependencies — only Python 3.8+ stdlib and curl.

API Reference

Generations (Text-to-Image)

| Item | Value |
|------|-------|
| Endpoint | POST {base_url}/images/generations |
| Auth | Authorization: Bearer {api_key} |
| Content-Type | application/json |

Edits (Image-to-Image)

| Item | Value |
|------|-------|
| Endpoint | POST {base_url}/images/edits |
| Auth | Authorization: Bearer {api_key} |
| Content-Type | multipart/form-data |

Parameters

Generations (JSON body):

| Param | Type | Required | Description |
|-------|------|----------|-------------|
| model | string | yes | gpt-image-2 |
| prompt | string | yes | Text description |
| n | int | no | Number of images (default 1) |
| size | string | no | 1024x1024, 1536x1024, 1024x1536 |
| quality | string | no | low, medium, high (default auto) |
| format | string | no | png, jpeg, webp (default png) |
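
To make the request shape concrete, the sketch below assembles (but does not send) a generations request from the endpoint, headers, and parameters listed above. The script itself uses curl; `urllib.request` appears here only to illustrate the wire format, and `build_generation_request` is a hypothetical name.

```python
import json
import urllib.request


def build_generation_request(base_url, api_key, prompt, n=1, size=None, quality=None):
    """Assemble a POST {base_url}/images/generations request without sending it."""
    body = {"model": "gpt-image-2", "prompt": prompt, "n": n}
    if size:
        body["size"] = size
    if quality:
        body["quality"] = quality
    return urllib.request.Request(
        base_url.rstrip("/") + "/images/generations",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Optional parameters are left out of the body when unset, so the server-side defaults from the table apply.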

Edits (form-data):

| Param | Type | Required | Description |
|-------|------|----------|-------------|
| model | string | yes | gpt-image-2 |
| prompt | string | yes | Edit instruction |
| image | file | yes | Source image (PNG, max 4 images) |
| n | int | no | Number of outputs (default 1) |
| size | string | no | 1024x1024, 1536x1024, 1024x1536, or auto |
| quality | string | no | low, medium, high (default auto) |
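
For readers unfamiliar with multipart/form-data, here is a minimal stdlib-only sketch of how the text fields and a single image file could be encoded for the edits endpoint. It is an illustration of the format rather than the script's implementation (which delegates to curl), and `encode_multipart` is a hypothetical helper that handles one file only.

```python
import uuid


def encode_multipart(fields, file_field, filename, file_bytes, content_type="image/png"):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        # Each text field is its own part: boundary, header, blank line, value
        chunks.append(
            '--{}\r\nContent-Disposition: form-data; name="{}"\r\n\r\n{}\r\n'
            .format(boundary, name, value).encode("utf-8")
        )
    # The file part carries a filename and its own Content-Type
    chunks.append(
        '--{}\r\nContent-Disposition: form-data; name="{}"; filename="{}"\r\n'
        'Content-Type: {}\r\n\r\n'
        .format(boundary, file_field, filename, content_type).encode("utf-8")
    )
    chunks.append(file_bytes)
    chunks.append("\r\n--{}--\r\n".format(boundary).encode("utf-8"))  # closing boundary
    return b"".join(chunks), "multipart/form-data; boundary=" + boundary
```

The returned header value has to be sent as the request's Content-Type so the server knows which boundary separates the parts.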

Agent Integration Guide

This skill is designed to be agent-agnostic. Any AI agent can use it by:

  1. Locate the script: Find gpt_image2.py in the skill's scripts/ directory
  2. Call via shell: python3 <path>/gpt_image2.py generate "prompt" -o output.png
  3. Parse stdout: The script prints Saved: <path> (<size> KB) on success
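
Step 3 can be sketched as a small parser over the captured stdout. The exact line layout is an assumption (a `Saved:` prefix, a path, and an optional size in KB in parentheses), so the pattern may need adjusting against real output; `parse_saved` is a hypothetical name.

```python
import re

# Assumed layout: "Saved: <path> (<size> KB)", with the size part optional
SAVED_LINE = re.compile(
    r"^Saved:[ \t]+(?P<path>.+?)(?:[ \t]+\((?P<size>\d+(?:\.\d+)?)[ \t]*KB\))?[ \t]*$",
    re.MULTILINE,
)


def parse_saved(stdout):
    """Return a list of (path, size_kb or None) for every 'Saved:' line in stdout."""
    results = []
    for match in SAVED_LINE.finditer(stdout):
        size = match.group("size")
        results.append((match.group("path"), float(size) if size else None))
    return results
```

An empty list means no file was reported, which an agent can treat as a failed run even when the process exit code is zero.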

Integration Examples

Hermes / Claude Code / Codex / OpenClaw:

python3 /path/to/gpt-image-2/scripts/gpt_image2.py generate "prompt" -o output.png --quality low

From Python (any agent):

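import subprocess

# Placeholder values; replace with the real script location and prompt
script_path = "/path/to/gpt-image-2/scripts/gpt_image2.py"
prompt = "A cute cat on a windowsill"
output_path = "output.png"

result = subprocess.run(
    ["python3", script_path, "generate", prompt, "-o", output_path, "--quality", "low"],
    capture_output=True, text=True, timeout=600
)
# Parse result.stdout for "Saved: <path>"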

From Node.js / TypeScript:

const { execFileSync } = require('child_process');
// Pass arguments as an array so the prompt and paths are never interpreted by a shell
const output = execFileSync('python3', [scriptPath, 'generate', prompt, '-o', outputPath]);
// Parse output.toString() for "Saved: ..."

Workflow: Agent Generates Images

  1. Always use the CLI script — handles config resolution, auth security, and response parsing
  2. Use low quality for drafts, high quality for final output
  3. For edits: --size auto preserves original dimensions (recommended)
  4. The script outputs: HTTP status, time elapsed, output file path and size
  5. Parse the output: look for Saved: lines to find generated files

Workflow: Agent Edits Existing Images

  1. Save or locate the source image path
  2. Call gpt_image2.py edit <image_path> "<edit prompt>" --output <output_path>
  3. Edit endpoint can accept up to 4 images via repeated --image flags
  4. Use --size auto to preserve original dimensions
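
The edit workflow can be sketched as a small command builder that follows the documented `edit <image_path> "edit prompt"` syntax and defaults to `--size auto`. `build_edit_command` is a hypothetical helper, and it handles a single source image only.

```python
def build_edit_command(script_path, image_path, prompt, output, mask=None, size="auto"):
    """Build the argv list for `edit`; size defaults to auto to keep the source dimensions."""
    cmd = ["python3", script_path, "edit", image_path, prompt,
           "--output", output, "--size", size]
    if mask is not None:
        cmd += ["--mask", mask]  # PNG mask, transparent pixels mark the edit area
    return cmd
```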

Important Pitfalls

  1. --api-key flag is visible in shell history and ps aux — prefer config file (api_key_env) or environment variables.
  2. The edits endpoint does NOT support response_format — always returns b64_json regardless.
  3. gpt-image-2 generations may time out on some relay endpoints — use --timeout flag (default 600s).
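  4. Prompts with special characters: the script writes prompts to temp files internally, so nothing needs escaping between the script and curl. The prompt must still reach the script as a single argument, so quote it when calling from a shell, or pass an argument list when spawning the process from code.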
  5. Authorization header is never passed via -H — the script uses curl -K temp config file, deleted immediately after use. Keys never appear in ps aux.
  6. Config file permissions — the script warns if config has group/other read permissions. Run chmod 600 to fix.
  7. Zero pip dependencies — the script only requires Python 3.8+ stdlib and curl. No installation step needed.
  8. Chinese text in prompts may not render correctly — gpt-image-2's Chinese rendering is unstable; it often ignores Chinese constraints and outputs English text in images. Consider using Gemini for Chinese text rendering.
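
Pitfalls 5 and 6 both come down to keeping the key in a file that only the current user can read. The sketch below shows the general technique with a curl config file (`-K`) and a permission check; it is an illustration under assumptions, not the script's actual code, and all three function names are hypothetical.

```python
import os
import stat
import tempfile


def write_curl_auth_config(api_key):
    """Write a curl config file holding the Authorization header and return its path."""
    escaped = api_key.replace("\\", "\\\\").replace('"', '\\"')
    # mkstemp creates the file readable and writable by the owner only
    fd, path = tempfile.mkstemp(prefix="curl-auth-", suffix=".conf")
    with os.fdopen(fd, "w") as handle:
        handle.write('header = "Authorization: Bearer {}"\n'.format(escaped))
    return path


def is_private(path):
    """True when neither group nor others have any permission bits on the file."""
    return stat.S_IMODE(os.stat(path).st_mode) & 0o077 == 0


def curl_command(config_path, url):
    """Argument list for curl; the key lives only in the file, not on the command line."""
    return ["curl", "--silent", "--show-error", "-K", config_path, url]
```

A caller would run the command and then delete the temp file in a `finally` block, so the key is on disk only for the duration of the request.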

Version History

1 version in total

  • v2.0.0 (current)
    2026-05-07 13:49 · Safe · Safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
