
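Structured JSON Prompt Generator bozo-prompt-json

A structured JSON prompt generator for Ideogram 4.0. It triggers when the user needs to generate an image, wants a prompt in JSON format, mentions Ideogram, needs a structured image prompt, wants to control the color palette, bounding-box layout, or text typography, or supplies a text description or reference image to be converted into a standard JSON prompt.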
bozoyan
Uncategorized community v1.0.0 1 version 94444.4 Key: not required

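Overview

bozo-prompt-json: Ideogram 4.0 structured JSON prompt generator

Converts a user's natural-language description or reference image into a structured JSON prompt in the standard Ideogram 4.0 format.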

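Core capabilities

| Capability | Description |
|------------|-------------|
| Color palette control | Up to 16 hexadecimal colors per image, directly steering the dominant tones |
| Bounding-box layout | Any element can be positioned with [y_min, x_min, y_max, x_max] (normalized coordinates, 0–1000) |
| Text element rendering | Elements of type text support multi-line, multi-font, styled in-image text |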
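
To make the normalized coordinate convention concrete, here is a minimal sketch that maps a [y_min, x_min, y_max, x_max] box on the 0–1000 grid to pixel coordinates. The function name `bbox_to_pixels` and the canvas sizes are illustrative assumptions; they are not part of the skill or of Ideogram.

```python
from typing import Sequence, Tuple


def bbox_to_pixels(bbox: Sequence[int], width: int, height: int) -> Tuple[int, int, int, int]:
    """Map [y_min, x_min, y_max, x_max] (0-1000 grid) to (left, top, right, bottom) in pixels."""
    y_min, x_min, y_max, x_max = bbox

    def scale(value: int, size: int) -> int:
        # Round to the nearest pixel using integer arithmetic only.
        return (value * size + 500) // 1000

    return (
        scale(x_min, width),   # left
        scale(y_min, height),  # top
        scale(x_max, width),   # right
        scale(y_max, height),  # bottom
    )


# Hypothetical 1536x1024 canvas; the box covers the horizontal center.
print(bbox_to_pixels([120, 380, 620, 620], 1536, 1024))  # (584, 123, 952, 635)
```

Note that the box lists y before x, while the returned tuple follows the common (left, top, right, bottom) order.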

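Workflow

Step 1: Determine the input type

Input A, text description only: the user describes the desired image directly in natural language.

Input B, reference image: the user supplies an image as a reference (possibly with a short note). In this case a VLM (vision-language model) must be used to analyze the image, and the JSON is generated from the extracted visual information.

Input C, text plus reference image: the user supplies both a description and a reference image. Use the visual information from the reference image as the base, and apply the user's text to enhance and adjust it.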
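
The three input types could be dispatched as in the sketch below. The names `InputType` and `classify_input` are hypothetical. As a simplifying assumption, any non-empty text alongside an image is treated as Input C; telling a short note (Input B) apart from a full description is left to the caller.

```python
from enum import Enum
from typing import Optional


class InputType(Enum):
    TEXT_ONLY = "A"
    IMAGE_ONLY = "B"
    TEXT_AND_IMAGE = "C"


def classify_input(text: Optional[str], image_path: Optional[str]) -> InputType:
    """Pick the workflow branch from what the user supplied."""
    has_text = bool(text and text.strip())
    has_image = bool(image_path)
    if has_text and has_image:
        return InputType.TEXT_AND_IMAGE
    if has_image:
        return InputType.IMAGE_ONLY
    if has_text:
        return InputType.TEXT_ONLY
    raise ValueError("need a text description, a reference image, or both")
```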

Step 2: Analyze and build the JSON

Based on the input, build the output according to the following JSON schema:

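{
  "high_level_description": "One-sentence summary of the whole image: subject, mood, and overall style",
  "style_description": {
    "aesthetics": "Aesthetic style keywords",
    "lighting": "Lighting description",
    "photo": "Camera parameters / film style (for photographs)",
    "medium": "Medium: Photograph / Illustration / Digital art, etc.",
    "art_style": "Specific art style",
    "color_palette": ["#HEX1", "#HEX2"]
  },
  "compositional_deconstruction": {
    "background": "Detailed description of the background: environment, atmosphere, lighting effects",
    "elements": [
      {
        "type": "obj | text",
        "bbox": [y_min, x_min, y_max, x_max],
        "desc": "Detailed visual description of the object/element",
        "text": "Only for type=text: the text to render"
      }
    ]
  }
}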
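
For readers who build the prompt in code, the template above can be mirrored with typed dictionaries. This is a sketch only: the class names are hypothetical, and marking the style and element fields as optional (`total=False`) is an assumption based on the template and examples in this document.

```python
import json
from typing import List, TypedDict


class StyleDescription(TypedDict, total=False):
    aesthetics: str
    lighting: str
    photo: str
    medium: str
    art_style: str
    color_palette: List[str]  # hex strings such as "#1B3A5C"


class Element(TypedDict, total=False):
    type: str        # "obj" or "text"
    bbox: List[int]  # [y_min, x_min, y_max, x_max] on the 0-1000 grid
    desc: str
    text: str        # only for type == "text"


class Composition(TypedDict):
    background: str
    elements: List[Element]


class IdeogramPrompt(TypedDict):
    high_level_description: str
    style_description: StyleDescription
    compositional_deconstruction: Composition


prompt: IdeogramPrompt = {
    "high_level_description": "A minimal poster with one headline.",
    "style_description": {"aesthetics": "Minimal", "medium": "Digital graphic design"},
    "compositional_deconstruction": {
        "background": "Solid deep green background.",
        "elements": [
            {"type": "text", "bbox": [80, 200, 420, 800], "text": "HELLO", "desc": "Gold serif headline."}
        ],
    },
}

# ensure_ascii=False keeps any non-ASCII text readable in the output.
print(json.dumps(prompt, indent=2, ensure_ascii=False))
```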

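Step 3: Output rules

  1. Always output complete JSON; never omit a top-level field
  2. color_palette is optional, but recommended when precise color control is needed (at most 16 colors per image)
  3. bbox is optional, but recommended for precise placement of key elements
  4. type field: use "obj" for ordinary objects and "text" for text to be rendered in the image
  5. A text element must contain both text (the literal string) and desc (the style description)
  6. Coordinate system: [y_min, x_min, y_max, x_max], origin at the top-left corner, range 0–1000
  7. Element descriptions must be specific enough for the model to act on: spell out material, color, lighting, and spatial relationships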
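
Rules 1 to 6 are mechanical enough to check in code. The following is a minimal sketch of such a check, not an official validator: `validate_prompt` is a hypothetical name, and requiring 6-digit hex colors and integer bbox values with min < max are assumptions drawn from the examples in this document.

```python
import re
from typing import Any, Dict, List

HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")
TOP_LEVEL = ("high_level_description", "style_description", "compositional_deconstruction")


def validate_prompt(prompt: Dict[str, Any]) -> List[str]:
    """Return a list of rule violations; an empty list means the prompt passes."""
    errors = [f"missing top-level field: {key}" for key in TOP_LEVEL if key not in prompt]

    palette = prompt.get("style_description", {}).get("color_palette", [])
    if len(palette) > 16:
        errors.append("color_palette has more than 16 colors")
    errors += [f"not a 6-digit hex color: {c}" for c in palette if not HEX_COLOR.match(c)]

    elements = prompt.get("compositional_deconstruction", {}).get("elements", [])
    for i, el in enumerate(elements):
        kind = el.get("type")
        if kind not in ("obj", "text"):
            errors.append(f"element {i}: type must be 'obj' or 'text'")
        if kind == "text" and not (el.get("text") and el.get("desc")):
            errors.append(f"element {i}: text elements need both text and desc")
        bbox = el.get("bbox")
        if bbox is not None:
            in_range = len(bbox) == 4 and all(isinstance(v, int) and 0 <= v <= 1000 for v in bbox)
            # Order is [y_min, x_min, y_max, x_max], so compare index 0 with 2 and 1 with 3.
            if not (in_range and bbox[0] < bbox[2] and bbox[1] < bbox[3]):
                errors.append(f"element {i}: bbox must be [y_min, x_min, y_max, x_max] within 0-1000")
    return errors
```

Rule 7 (descriptive quality) is a judgment call and is deliberately left out of the check.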

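Field writing guide

high_level_description

A one-sentence overview. It should include:

  • Subject matter (what scene or object)
  • Viewpoint and composition (close-up / wide shot / top-down, etc.)
  • Overall mood and emotion
  • Key style tags

Good example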

> "A cinematic 35mm film photograph of a lone wooden sailboat on a glassy lake at sunset, the boat on a right-third vertical with the horizon at the lower third, in a cool muted blue palette."

Bad example

> "A boat on a lake" (too vague: no style, composition, or mood)

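style_description

Not every subfield is required; fill in the ones the image calls for:

| Subfield | When to fill in | Example |
|----------|-----------------|---------|
| aesthetics | Always recommended | "Cinematic, minimal, serene" |
| lighting | When there is a clear lighting intent | "Cool overcast dusk light with a small warm sun low at the horizon" |
| photo | For photographic images | "35mm motion-picture film still, 16:9 framing, subtle grain" |
| medium | Always recommended | "Photograph" / "Digital vector graphic" / "Ink and watercolor" |
| art_style | For non-photorealistic styles | "hand-drawn editorial illustration, flat color fills with subtle ink linework" |
| color_palette | When precise color control is needed | ["#1B3A5C", "#5B8FB9"] |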
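
One way to honor "fill in only what applies" is to build style_description from optional arguments and drop the empty ones. The helper name `build_style` is hypothetical; this is a sketch of the idea, not the skill's own implementation.

```python
from typing import Any, Dict, List, Optional


def build_style(
    aesthetics: Optional[str] = None,
    lighting: Optional[str] = None,
    photo: Optional[str] = None,
    medium: Optional[str] = None,
    art_style: Optional[str] = None,
    color_palette: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Return a style_description dict that contains only the non-empty subfields."""
    candidates = {
        "aesthetics": aesthetics,
        "lighting": lighting,
        "photo": photo,
        "medium": medium,
        "art_style": art_style,
        "color_palette": color_palette,
    }
    # None, empty strings, and empty lists are all dropped.
    return {key: value for key, value in candidates.items() if value}


# An illustration: no photo field, and no palette because none was requested.
style = build_style(
    aesthetics="Cozy, warm",
    medium="Ink and watercolor",
    art_style="hand-drawn editorial illustration",
)
```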

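compositional_deconstruction: background

The background description should cover:

  • Environment / spatial setting
  • Lighting conditions and main light direction
  • Overall color tone and texture
  • Atmospheric effects (mist, depth of field, etc.)

compositional_deconstruction: elements

Each element follows these principles:

  1. Order from large to small: main objects first, then secondary objects, then foreground decoration
  2. Layer the description: shape → color → material → lighting → relationship to other elements
  3. Keep the count moderate: usually 5–15 elements; too many dilutes the weight of each element, too few leaves the model short of information
  4. Handle text elements specially
    • text field: the literal text to display, with \n for line breaks
    • desc field: a visual description of font, size, color, and layout
    • bbox: the bounding box of the text region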
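
The text-element conventions above can be wrapped in a small helper. This is an illustrative sketch; `make_text_element` is a hypothetical name, and the key order simply follows the examples later in this document.

```python
import json
from typing import Any, Dict, Optional, Sequence


def make_text_element(
    lines: Sequence[str],
    desc: str,
    bbox: Optional[Sequence[int]] = None,
) -> Dict[str, Any]:
    """Build a type="text" element; multi-line text is joined with newline characters."""
    if not lines or not desc:
        raise ValueError("a text element needs both text lines and a desc")
    element: Dict[str, Any] = {"type": "text"}
    if bbox is not None:
        element["bbox"] = list(bbox)
    element["text"] = "\n".join(lines)
    element["desc"] = desc
    return element


headline = make_text_element(
    ["ARTISAN", "COFFEE", "HOUSE"],
    "Large display serif headline in metallic gold, three stacked lines.",
    bbox=[80, 200, 420, 800],
)
# json.dumps writes the newline characters as the \n escape expected in the JSON prompt.
print(json.dumps(headline, indent=2))
```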

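Input handling strategies

Handling text-only input

Break the user's description down directly into the JSON fields. If the description is vague, infer and fill in reasonable details. For example:

User input: "A cat sitting on a windowsill reading a book"

Inferred additions

  • Infer a cozy illustration style (rather than a hyper-realistic photo)
  • Add the lighting effect from the window
  • Add what the book looks like
  • Add details of the cat's pose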

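Handling reference-image input

Analyze the image with a VLM, then extract:

  1. Overall composition and viewpoint
  2. Main objects and their spatial relationships
  3. Color distribution (extract the dominant colors and convert them to HEX)
  4. Light direction and intensity
  5. Material and texture characteristics
  6. If there is text: font, size, color, position

Then reassemble this information into the standard JSON format.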
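
The color step (item 3) can be approximated without a VLM once pixel values are available. The sketch below is a deliberately naive stand-in, not the skill's method: it snaps each RGB channel to a coarse grid and counts the most frequent buckets. The function names and the `step` parameter are assumptions, and loading pixels from an image file is left out because it would need an imaging library.

```python
from collections import Counter
from typing import Iterable, List, Tuple

RGB = Tuple[int, int, int]


def rgb_to_hex(rgb: RGB) -> str:
    """Format an (r, g, b) tuple of 0-255 values as an uppercase #RRGGBB string."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)


def dominant_hex_colors(pixels: Iterable[RGB], max_colors: int = 16, step: int = 32) -> List[str]:
    """Return up to max_colors bucketed colors, most frequent first."""

    def snap(value: int) -> int:
        # Snap a channel to the center of its bucket so near-identical shades merge.
        return min(255, (value // step) * step + step // 2)

    counts = Counter(tuple(snap(channel) for channel in pixel) for pixel in pixels)
    return [rgb_to_hex(rgb) for rgb, _ in counts.most_common(max_colors)]


print(rgb_to_hex((27, 58, 92)))  # #1B3A5C
```

The default of 16 matches the palette limit stated earlier; bucketing shifts colors slightly, so the returned values are approximations of the source tones.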

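Handling text plus image

Use the image analysis as the base skeleton, and treat the user's text as modification instructions that override or enhance the corresponding parts. Priority: explicit user instructions > image analysis > reasonable inference.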
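
The priority order can be expressed as a layered merge, where later layers win. This is a sketch under assumptions: `deep_merge` and `resolve_prompt` are hypothetical names, and lists such as elements are replaced wholesale rather than merged item by item.

```python
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where override wins; nested dicts are merged recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_prompt(
    inferred: Dict[str, Any],
    from_image: Dict[str, Any],
    from_user: Dict[str, Any],
) -> Dict[str, Any]:
    """Apply the layers lowest priority first: inference, then image analysis, then user text."""
    return deep_merge(deep_merge(inferred, from_image), from_user)


result = resolve_prompt(
    inferred={"style_description": {"medium": "Illustration", "aesthetics": "Cozy"}},
    from_image={"style_description": {"medium": "Photograph", "color_palette": ["#1B3A5C"]}},
    from_user={"style_description": {"aesthetics": "Cinematic"}},
)
```

Here the image analysis overrides the inferred medium, and the user's explicit aesthetics override both lower layers.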

Complete examples

Example 1: Landscape photograph

Input: "A seaside lighthouse at dusk, warm tones, cinematic feel"

{
  "high_level_description": "A cinematic wide-angle photograph of a solitary lighthouse standing on a rocky coastline at golden hour, warm amber light bathing the stone tower, shot on Kodak Portra 400 with deep foreground rocks and a misty sea stretching to the horizon.",
  "style_description": {
    "aesthetics": "Cinematic, epic, solitary, warm nostalgia.",
    "lighting": "Golden hour directional light from the right, long shadows across the rocks, warm amber glow on the lighthouse face, cool blue shadows in the crevices.",
    "photo": "Medium format film photograph, 3:2 aspect ratio, shallow depth of field on the lighthouse, subtle grain, slightly desaturated except for warm highlights.",
    "medium": "Photograph.",
    "color_palette": ["#D4A574", "#2C3E50", "#E8B86D", "#1A252F"]
  },
  "compositional_deconstruction": {
    "background": "A rugged rocky coastline at golden hour, the sea stretching to a hazy horizon in the lower two-thirds of the frame. Warm directional light rakes across the rock faces from the right, casting long deep-blue shadows. Distant sea mist softens the horizon line. Cool-toned water with golden reflections near the shore.",
    "elements": [
      {
        "type": "obj",
        "bbox": [120, 380, 620, 620],
        "desc": "Tall cylindrical stone lighthouse centered horizontally and rising through the upper half of the frame, weathered white-gray masonry with horizontal banding, a dark lantern room capped with a domed glass roof at the top. Warm golden light catching the right-facing side, deep shadow on the left. A narrow gallery railing circles the lantern room."
      },
      {
        "type": "obj",
        "desc": "Beam of light cutting diagonally from the lantern room toward the upper-left, visible as a soft volumetric ray through the sea mist, fading gradually into the atmosphere."
      },
      {
        "type": "obj",
        "desc": "Rough granite boulders filling the foreground and midground, textured with cracks and patches of orange-brown lichen. Wet surfaces reflecting the golden sky. Sharp focus in the immediate foreground, gently softening toward the lighthouse base."
      },
      {
        "type": "obj",
        "desc": "Choppy sea in the lower third, deep teal-blue with scattered golden highlights from the sky reflection. White foam crests on distant waves. Soft motion blur suggesting a 1/4 second exposure."
      },
      {
        "type": "obj",
        "desc": "Pale amber-gold sky occupying the upper third, gradient from warm cream near the horizon to soft peach and faint violet at the top edge. No clouds, just a smooth atmospheric gradient with a hint of sea haze."
      }
    ]
  }
}

Example 2: Poster design (with text elements)

Input: "Make a grand-opening poster for a coffee shop, dark green with gold text"

{
  "high_level_description": "A sophisticated square 1:1 poster for a specialty coffee shop grand opening, featuring elegant serif typography in gold on a deep forest-green background with botanical coffee-plant illustrations and minimalist layout.",
  "style_description": {
    "aesthetics": "Sophisticated, artisanal, botanical elegance, minimal luxury.",
    "lighting": "Even studio lighting with subtle warm accent on gold elements, no strong directional source.",
    "medium": "Digital graphic design, square poster format.",
    "art_style": "Clean modernist design with botanical illustration accents, ample negative space.",
    "color_palette": ["#1A2E1A", "#D4AF37", "#F5F5DC", "#2D4A2D"]
  },
  "compositional_deconstruction": {
    "background": "Deep forest-green (#1A2E1A) solid background filling the entire frame edge-to-edge. Subtle texture of fine linen paper grain across the surface. Delicate faint botanical line-art pattern of coffee leaves and branches watermarking the background at very low opacity, especially concentrated near the edges.",
    "elements": [
      {
        "type": "text",
        "bbox": [80, 200, 420, 800],
        "text": "ARTISAN\nCOFFEE\nHOUSE",
        "desc": "Large display serif headline in metallic antique gold (#D4AF37), three lines stacked vertically, generous letter-spacing, slight embossed effect. Dominating the upper-center area."
      },
      {
        "type": "text",
        "bbox": [480, 280, 540, 720],
        "text": "GRAND OPENING\nJUNE 15 · 2026",
        "desc": "Medium-sized serif text in cream (#F5F5DC), two lines, centered horizontally below the main headline. Elegant but restrained proportions."
      },
      {
        "type": "text",
        "bbox": [580, 350, 650, 650],
        "text": "SINGLE ORIGIN · HAND ROASTED\nDAILY FROM 7 AM",
        "desc": "Small tracked sans-serif caps in muted cream, two lines, centered. Supporting tagline beneath the date block."
      },
      {
        "type": "obj",
        "desc": "Delicate botanical illustration of a coffee plant branch with leaves and unripe berries, rendered in thin gold line-art, positioned as a decorative element wrapping around the right edge of the headline text area."
      },
      {
        "type": "obj",
        "desc": "Minimal line-art icon of a coffee cup with steam rising, small scale, in gold, positioned in the lower center between the tagline and the address line as a visual anchor."
      },
      {
        "type": "text",
        "bbox": [900, 220, 965, 780],
        "text": "123 ARTIST STREET · DOWNTOWN",
        "desc": "Tiny sans-serif address in pale cream at the very bottom, centered, the smallest type on the poster."
      }
    ]
  }
}

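Notes

  1. Don't over-fill: if a field has nothing substantive to say, omit it rather than pad it. For example, non-photographic images do not need the photo field.
  2. Element quality over quantity: 3 carefully described elements work far better than 10 vague ones.
  3. Stay consistent: high_level_description, background, and each element's desc must agree in style, color tone, and lighting.
  4. Use the palette without overusing it: supply color_palette only when the dominant colors really need precise control, leaving the model reasonable creative room.
  5. Bounding boxes are advisory: the model respects bbox but does not guarantee pixel-level precision; use it to express "roughly in this region".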

Version history

1 version in total

  • v1.0.0 Initial release (current)
    2026-06-06 10:49

