Generate professional carousel images from text content using the Gemini image generation API.
Model: gemini-3-pro-image-preview (REQUIRED for correct Chinese/CJK text rendering)
Determine carousel content from one of:
Collect:
For health/product carousels, use this proven 6-slide structure:
| # | Type | Purpose |
|---|---|---|
| 1 | Cover | Hook + brand + topic |
| 2 | Problem | Why the reader should care |
| 3 | Solution | How product/topic solves it |
| 4 | Details | Key features, data, ingredients |
| 5 | Social Proof | Testimonials, results, evidence |
| 6 | CTA | Product image + buy/contact |
For other structures, see references/prompt-patterns.md.
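As a rough sketch, the 6-slide structure above could be kept as plain data so that later steps can loop over it. The `SlideSpec` class and `slide_label` helper are hypothetical names used only for illustration; they are not taken from scripts/generate_carousel.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlideSpec:
    """One row of the slide-structure table (hypothetical container)."""
    number: int
    kind: str
    purpose: str


# The six rows of the table, in order.
SIX_SLIDE_STRUCTURE = [
    SlideSpec(1, "Cover", "Hook + brand + topic"),
    SlideSpec(2, "Problem", "Why the reader should care"),
    SlideSpec(3, "Solution", "How product/topic solves it"),
    SlideSpec(4, "Details", "Key features, data, ingredients"),
    SlideSpec(5, "Social Proof", "Testimonials, results, evidence"),
    SlideSpec(6, "CTA", "Product image + buy/contact"),
]


def slide_label(spec, total=len(SIX_SLIDE_STRUCTURE)):
    """Return the 'N / TOTAL' counter text shown on each slide."""
    return f"{spec.number} / {total}"
```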
For each slide, write a Gemini prompt following these rules:
Design prompt structure:
Create a [SIZE] [STYLE_PRESET] Instagram slide for [BRAND].
LAYOUT:
- Background: [COLORS/GRADIENT]
- [ELEMENT DESCRIPTIONS WITH EXACT TEXT]
- "[SLIDE_NUM] / [TOTAL]" bottom right
CRITICAL: All Chinese/CJK text must be exactly as written above.
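One possible way to fill in the template above programmatically is sketched here. `build_slide_prompt` is a hypothetical helper, and the size, style, brand, and text values in the usage lines are placeholders rather than recommended settings.

```python
def build_slide_prompt(size, style_preset, brand, background, elements,
                       slide_num, total):
    """Assemble a slide prompt that follows the template above.

    `elements` is a list of element descriptions, each including the exact
    text that should appear on the slide.
    """
    lines = [
        f"Create a {size} {style_preset} Instagram slide for {brand}.",
        "LAYOUT:",
        f"- Background: {background}",
    ]
    lines += [f"- {element}" for element in elements]
    lines.append(f'- "{slide_num} / {total}" bottom right')
    lines.append("CRITICAL: All Chinese/CJK text must be exactly as written above.")
    return "\n".join(lines)


# Placeholder values for illustration only.
example_prompt = build_slide_prompt(
    size="1080x1350",
    style_preset="minimalist",
    brand="ExampleBrand",
    background="soft gradient from #FFF5E6 to #FFFFFF",
    elements=['Headline, centered, bold: "每日一杯"'],
    slide_num=2,
    total=6,
)
```

Building every slide through the same function keeps the counter and the CJK reminder identical across the carousel.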
Key rules:
- Attach the product image via inlineData in the API call.
- See references/prompt-patterns.md for design presets and slide templates.
Use scripts/generate_carousel.py or call the Gemini API directly:
import base64
import json
import urllib.request

API_KEY = "..."  # from TOOLS.md
MODEL = "gemini-3-pro-image-preview"  # REQUIRED for CJK text
url = f"https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent?key={API_KEY}"

# `prompt` is the slide prompt written in the previous step.
parts = [{"text": prompt}]
# For CTA slide with product image (base64_image is the base64-encoded file content):
# parts.insert(0, {"inlineData": {"mimeType": "image/jpeg", "data": base64_image}})

payload = {
    "contents": [{"parts": parts}],
    "generationConfig": {"responseModalities": ["IMAGE", "TEXT"]},
}
data = json.dumps(payload).encode("utf-8")
req = urllib.request.Request(url, data=data, headers={"Content-Type": "application/json"})
# Image generation can be slow, hence the long timeout.
with urllib.request.urlopen(req, timeout=180) as resp:
    result = json.loads(resp.read())
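The call above stops at the parsed JSON. A minimal sketch of pulling the image bytes out of `result` follows, assuming the response uses the usual `candidates` → `content` → `parts` → `inlineData` layout of generateContent responses. `extract_images` is a hypothetical helper, not part of any SDK.

```python
import base64


def extract_images(result):
    """Return a list of (mime_type, raw_bytes) for every image part found.

    Assumes image parts carry base64 data under an "inlineData" key;
    text-only parts are skipped.
    """
    images = []
    for candidate in result.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            inline = part.get("inlineData")
            if inline and "data" in inline:
                mime_type = inline.get("mimeType", "image/png")
                images.append((mime_type, base64.b64decode(inline["data"])))
    return images


def save_first_image(result, path):
    """Write the first image in the response to `path`; return False if none."""
    images = extract_images(result)
    if not images:
        return False
    with open(path, "wb") as f:
        f.write(images[0][1])
    return True
```

An empty list from `extract_images` usually means the model answered with text only, which is worth logging before retrying.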
Add a 5-second delay between slides to avoid rate limits.
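The pacing could be handled by a small loop like the one sketched here. `generate_all` and its `generate_one` callback are hypothetical names; `generate_one` stands in for whatever function sends one prompt to the API.

```python
import time


def generate_all(prompts, generate_one, delay_seconds=5.0, sleep=time.sleep):
    """Call generate_one(prompt) for each prompt, pausing between slides.

    The pause happens only between calls, not before the first or after
    the last. `sleep` is injectable so the loop can be tested without waiting.
    """
    results = []
    for index, prompt in enumerate(prompts):
        if index > 0:
            sleep(delay_seconds)
        results.append(generate_one(prompt))
    return results
```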
After generation, verify each slide with a vision model.
If the text is garbled, regenerate that slide. The Pro model rarely fails on Chinese text, but verify anyway.
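One way to turn the check into a pass/fail decision is to compare the text the slide was supposed to contain with whatever the vision model transcribes from the image. The sketch below covers only that comparison; how the transcription is obtained depends on the vision model in use. `missing_text` is a hypothetical helper.

```python
def missing_text(expected, transcription):
    """Return the expected strings that do not appear in the transcription.

    All whitespace is removed before comparing, so line breaks or spacing
    introduced by the layout do not count as errors.
    """
    def squash(text):
        return "".join(text.split())

    haystack = squash(transcription)
    return [item for item in expected if squash(item) not in haystack]
```

A non-empty result marks the slide for regeneration. Exact substring matching is deliberately strict, since a single wrong character counts as garbled CJK text.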
| Model | Chinese Text | Design Quality | Speed | Use When |
|---|---|---|---|---|
| gemini-3-pro-image-preview | ✅ Perfect | ✅ High | Slower | Default choice for CJK content |
| gemini-2.5-flash-image | ❌ Garbled | ✅ High | Fast | English-only content |
| gemini-3.1-flash-image-preview | ⚠️ Untested | ✅ High | Fast | Try for English content |
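The table could be applied automatically by checking whether the slide text contains CJK characters. This is a sketch under the assumption that a simple Unicode-range test is good enough. `pick_model` is a hypothetical helper, and the ranges cover only the most common Japanese kana, CJK ideograph, and Hangul blocks.

```python
import re

# Hiragana/Katakana, CJK Extension A, CJK Unified Ideographs, Hangul syllables.
_CJK_PATTERN = re.compile(r"[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af]")


def pick_model(slide_text):
    """Choose the Pro model for CJK content and the faster model otherwise."""
    if _CJK_PATTERN.search(slide_text):
        return "gemini-3-pro-image-preview"
    return "gemini-2.5-flash-image"
```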
| Problem | Solution |
|---|---|
| 429 quota exceeded | Check that billing is linked to the correct GCP project |
| Location not supported | Use a US VPN |
| Chinese text garbled | Switch to gemini-3-pro-image-preview |
| Product image not matching | Attach actual product image via inlineData |
| Inconsistent design across slides | Include brand color hex codes and style description in every prompt |
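For the "Product image not matching" row, the inlineData part could be built from a local file roughly as follows. `image_part` is a hypothetical helper, and falling back to image/jpeg when the type cannot be guessed is an assumption.

```python
import base64
import mimetypes


def image_part(path):
    """Build an inlineData part from an image file on disk."""
    mime_type, _ = mimetypes.guess_type(path)
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return {"inlineData": {"mimeType": mime_type or "image/jpeg", "data": encoded}}
```

The returned dict can be inserted at the front of the `parts` list, ahead of the text prompt.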
text-to-carousel/
├── SKILL.md                  # This file
├── scripts/
│   └── generate_carousel.py  # Batch generation script (config-driven)
└── references/
    └── prompt-patterns.md    # Design presets, slide templates, tips