> Create, edit, transform, and analyze images with GPT-4o's native image generation API
Use this skill whenever the user needs to:
When the user describes an image they want to create:
Call `generateImage()` with the enhanced prompt:
```javascript
const result = await generateImage(enhancedPrompt, { size, quality, style });
```
Images are saved to `./generated-images/` by default.
When the user wants to modify an existing image:
Call `editImage()` with the source and instruction:
```javascript
const result = await editImage(imagePath, editInstruction, { mask: maskPath });
```
When the user asks about an image:
Call `describeImage()`:
```javascript
const result = await describeImage(imageSource, question);
```
When the user needs multiple images:
Call `generateImage()` for each variant.
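A minimal sketch of the batch step, assuming `generateImage` resolves to the `{ success, url, localPath, revisedPrompt }` object documented for this skill. The `generateBatch` helper and `stubGenerate` are hypothetical names introduced only for illustration; the generator is passed in as an argument so the sketch runs without API access.

```javascript
// Hypothetical helper: run one generateImage-style call per variant prompt.
// `generate` is injected so a stub can stand in for the real skill function.
async function generateBatch(variantPrompts, options, generate) {
  // allSettled keeps the remaining variants going when one of them fails.
  const settled = await Promise.allSettled(
    variantPrompts.map((prompt) => generate(prompt, options))
  );
  return settled.map((outcome, index) => ({
    prompt: variantPrompts[index],
    ok: outcome.status === "fulfilled" && outcome.value.success === true,
    result: outcome.status === "fulfilled" ? outcome.value : null,
    error: outcome.status === "rejected" ? String(outcome.reason) : null,
  }));
}

// Stub standing in for generateImage(); it fails on purpose for one prompt.
async function stubGenerate(prompt, options) {
  if (prompt.includes("fail")) throw new Error("simulated failure");
  return {
    success: true,
    url: "https://example.invalid/image.png",
    localPath: null,
    revisedPrompt: prompt,
  };
}
```

All variants start concurrently in this sketch. With a real API, issuing them one at a time may be the safer choice when rate limits are tight.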
When generating images, automatically enhance the user's prompt:
professional quality, high resolution, sharp details
| User Intent | Auto-Add |
|-------------|----------|
| Product photo | "studio lighting, clean background, commercial photography" |
| Portrait | "professional portrait photography, natural lighting" |
| Social media | "eye-catching, vibrant colors, modern design" |
| Illustration | "detailed illustration, professional artist quality" |
| Logo/branding | "clean vector style, scalable, minimal details" |
| Architecture | "architectural visualization, realistic rendering" |
| Food | "appetizing, food styling, professional food photography" |
| UI mockup | "clean design, modern interface, pixel-perfect" |
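The enhancement table above can be sketched as a simple lookup. This is an illustration under assumptions, not the skill's actual implementation: `enhancePrompt`, `INTENT_KEYWORDS`, and the lowercase intent keys are hypothetical, and how the skill detects the user's intent is not specified here, so the intent is passed in explicitly.

```javascript
// Keywords copied from the table above; the lowercase keys are hypothetical.
const INTENT_KEYWORDS = new Map([
  ["product photo", "studio lighting, clean background, commercial photography"],
  ["portrait", "professional portrait photography, natural lighting"],
  ["social media", "eye-catching, vibrant colors, modern design"],
  ["illustration", "detailed illustration, professional artist quality"],
  ["logo/branding", "clean vector style, scalable, minimal details"],
  ["architecture", "architectural visualization, realistic rendering"],
  ["food", "appetizing, food styling, professional food photography"],
  ["ui mockup", "clean design, modern interface, pixel-perfect"],
]);

// Append the keywords for a known intent; leave the prompt unchanged otherwise.
function enhancePrompt(userPrompt, intent) {
  const base = userPrompt.trim();
  const extra = INTENT_KEYWORDS.get(String(intent).trim().toLowerCase());
  return extra ? `${base}, ${extra}` : base;
}

console.log(enhancePrompt("a red ceramic mug", "Product photo"));
```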
| Use Case | Recommended Size |
|----------|-----------------|
| Social media post | 1024x1024 (square) |
| Story/vertical | 1024x1792 |
| Banner/landscape | 1792x1024 |
| Product listing | 1024x1024 |
| Presentation | 1792x1024 |
| Wallpaper | 1792x1024 |
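One way to apply the size recommendations above is a lookup that falls back to the documented default of `1024x1024`. The names `recommendedSize` and `RECOMMENDED_SIZES`, and the lowercase use-case keys, are assumptions made for this sketch.

```javascript
// Sizes copied from the table above; the lowercase keys are hypothetical.
const RECOMMENDED_SIZES = new Map([
  ["social media post", "1024x1024"],
  ["story/vertical", "1024x1792"],
  ["banner/landscape", "1792x1024"],
  ["product listing", "1024x1024"],
  ["presentation", "1792x1024"],
  ["wallpaper", "1792x1024"],
]);

// Return the recommended size, or the default 1024x1024 for unknown use cases.
function recommendedSize(useCase) {
  const key = String(useCase).trim().toLowerCase();
  return RECOMMENDED_SIZES.get(key) ?? "1024x1024";
}

console.log(recommendedSize("Story/vertical"));
```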
Quick style references for common requests:
| Preset Name | Style Description |
|-------------|-------------------|
| product | Clean white background, studio lighting, commercial photography |
| lifestyle | Natural setting, warm lighting, aspirational mood |
| minimalist | Simple composition, negative space, clean lines |
| vintage | Retro color grading, film grain, nostalgic mood |
| futuristic | Neon accents, dark background, sci-fi aesthetic |
| watercolor | Soft edges, pastel palette, artistic brush strokes |
| 3d-render | Octane render, realistic materials, dramatic lighting |
| anime | Japanese animation style, vibrant, expressive |
| sketch | Pencil drawing, hand-drawn, artistic |
| flat-design | Vector style, bold colors, geometric shapes |
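The presets above could be applied by appending the style description to the prompt. How the skill actually merges a preset into the prompt is not stated, so the `Style:` suffix format, `applyStylePreset`, and `STYLE_PRESETS` are all assumptions of this sketch.

```javascript
// Descriptions copied from the preset table above.
const STYLE_PRESETS = new Map([
  ["product", "Clean white background, studio lighting, commercial photography"],
  ["lifestyle", "Natural setting, warm lighting, aspirational mood"],
  ["minimalist", "Simple composition, negative space, clean lines"],
  ["vintage", "Retro color grading, film grain, nostalgic mood"],
  ["futuristic", "Neon accents, dark background, sci-fi aesthetic"],
  ["watercolor", "Soft edges, pastel palette, artistic brush strokes"],
  ["3d-render", "Octane render, realistic materials, dramatic lighting"],
  ["anime", "Japanese animation style, vibrant, expressive"],
  ["sketch", "Pencil drawing, hand-drawn, artistic"],
  ["flat-design", "Vector style, bold colors, geometric shapes"],
]);

// Append the preset's style description; reject names that are not in the table.
function applyStylePreset(prompt, presetName) {
  const description = STYLE_PRESETS.get(presetName);
  if (description === undefined) {
    const known = [...STYLE_PRESETS.keys()].join(", ");
    throw new Error(`Unknown preset "${presetName}". Available: ${known}`);
  }
  return `${prompt.trim()}. Style: ${description}`;
}

console.log(applyStylePreset("a lighthouse at dusk", "watercolor"));
```

Throwing on an unknown preset, rather than silently ignoring it, makes a misspelled preset name visible before any API call is made.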
`generateImage(prompt, options)`

Generate a new image from a text description.

Parameters:

- `prompt` (string): Image description (auto-enhanced by this skill)
- `options` (object):
  - `size`: `1024x1024` | `1024x1792` | `1792x1024` (default: `1024x1024`)
  - `quality`: `standard` | `hd` (default: `standard`)
  - `style`: `vivid` | `natural` (default: `vivid`)
  - `model`: `gpt-image-2` | `dall-e-3` (default: `gpt-image-2`)
  - `saveTo`: File path to save the image (default: `./generated-images/`)

Returns: `{ success, url, localPath, revisedPrompt }`

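The documented defaults and allowed values for `generateImage()` options can be checked before a request is sent. The sketch below is a hypothetical pre-flight helper: `resolveGenerateOptions`, `GENERATE_DEFAULTS`, and `ALLOWED_VALUES` are not part of the skill, and whether the skill validates options this way is unknown.

```javascript
// Defaults and allowed values as listed in the generateImage() parameter list.
const GENERATE_DEFAULTS = {
  size: "1024x1024",
  quality: "standard",
  style: "vivid",
  model: "gpt-image-2",
  saveTo: "./generated-images/",
};

const ALLOWED_VALUES = {
  size: ["1024x1024", "1024x1792", "1792x1024"],
  quality: ["standard", "hd"],
  style: ["vivid", "natural"],
  model: ["gpt-image-2", "dall-e-3"],
};

// Merge caller options over the defaults and reject undocumented values.
function resolveGenerateOptions(options = {}) {
  // Drop undefined entries so they do not overwrite a default.
  const provided = Object.fromEntries(
    Object.entries(options).filter(([, value]) => value !== undefined)
  );
  const resolved = { ...GENERATE_DEFAULTS, ...provided };
  for (const [key, allowed] of Object.entries(ALLOWED_VALUES)) {
    if (!allowed.includes(resolved[key])) {
      throw new RangeError(
        `Invalid ${key}: ${resolved[key]} (expected ${allowed.join(" | ")})`
      );
    }
  }
  return resolved;
}

console.log(resolveGenerateOptions({ quality: "hd" }));
```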
`editImage(imagePath, prompt, options)`

Edit an existing image with natural language instructions.

Parameters:

- `imagePath` (string): Path to the source image
- `prompt` (string): Edit instruction
- `options` (object):
  - `mask`: Path to mask image (white = edit area, black = keep)
  - `size`: Output size
  - `model`: `gpt-image-2` | `dall-e-3` (default: `gpt-image-2`)

Returns: `{ success, url, localPath }`

`generateVariations(imagePath, options)`

Generate creative variations of an existing image.

Parameters:

- `imagePath` (string): Path to the source image
- `options` (object):
  - `count`: Number of variations, 1-4 (default: 2)
  - `size`: Output size

Returns: `{ success, variations: [{ url, localPath }] }`

`describeImage(imageSource, question)`

Analyze an image using GPT-4o Vision.

Parameters:

- `imageSource` (string): File path or URL of the image
- `question` (string|null): Specific question about the image (default: general description)

Returns: `{ success, description }`

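Because `imageSource` may be either a file path or a URL, a caller may want to tell the two apart before deciding how to load the image. The helper below is a hypothetical sketch (`classifyImageSource` is not part of the skill) that treats only absolute `http:` and `https:` URLs as remote sources.

```javascript
// Classify an image source as a remote URL or a local file path.
// Assumption of this sketch: only http(s) URLs count as "url".
function classifyImageSource(imageSource) {
  try {
    const parsed = new URL(imageSource);
    if (parsed.protocol === "http:" || parsed.protocol === "https:") {
      return "url";
    }
  } catch {
    // Not an absolute URL (for example "./photo.png"); treat it as a file path.
  }
  return "file";
}

console.log(classifyImageSource("https://example.com/photo.png"));
console.log(classifyImageSource("./photos/photo.png"));
```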
`downloadImage(url, savePath)`

Download a generated image to local storage.

Parameters:

- `url` (string): Image URL from generation API
- `savePath` (string|null): Local file path (default: auto-generated in `./generated-images/`)

Returns: `{ success, localPath }`

| Error | Cause | Resolution |
|-------|-------|------------|
| Invalid API key | OPENAI_API_KEY not set or invalid | Check environment variable |
| Content policy violation | Prompt violates safety guidelines | Rephrase the prompt |
| Rate limit exceeded | Too many requests | Wait and retry with backoff |
| Image too large | Source image exceeds size limit | Resize to under 4MB |
| Timeout | Generation took too long | Simplify prompt or retry |
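The "wait and retry with backoff" resolution above can be sketched as a small wrapper around any of the skill's async calls. `withBackoff`, its option names, and the choice of which errors count as retryable are assumptions made for illustration, not part of the skill.

```javascript
// Hypothetical retry wrapper with exponential backoff.
// `task` is any function returning a promise, e.g. () => generateImage(prompt, options).
async function withBackoff(
  task,
  { retries = 3, baseDelayMs = 500, isRetryable = () => true } = {}
) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await task();
    } catch (error) {
      // Give up when retries are exhausted or the error is not worth retrying.
      if (attempt >= retries || !isRetryable(error)) throw error;
      // The delay doubles on each attempt: baseDelayMs, 2x, 4x, ...
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

A caller might pass `isRetryable: (err) => /rate limit|timeout/i.test(err.message)` so that content policy violations and invalid API keys fail immediately instead of being retried; the exact error messages to match depend on the real API responses.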
Tags: image-generation AI-art GPT-4o image-2 gpt-image-2 visual-creation marketing product-photos illustration design openai dall-e image-editing background-removal style-transfer ui-mockup