Overview

Image-to-Text Bridge (Local-First)

Build a bounded local-first image-to-text bridge. This skill standardizes how screenshots, charts, and document images are converted into text-model-safe Markdown for downstream text-only models.

Core Promise

vision-context.md

Key image content, OCR text, chart structure, and uncertainty notes for key information

text-model-input.md

When the input is Markdown and --compose is used, a consolidated document that can be fed directly to a text model

manifest.json

Metadata and environment state recorded after each run, for quick traceability

Trigger

Use this skill when the task is to:

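Hand images or screenshots to a text model for understanding
Convert the images in a Markdown document into readable text descriptions
Extract image semantics locally first, under a low token budget
Supply image context to text-only models such as DeepSeek or OpenCode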

If the user only wants image generation or style rendering, use codex-image-bridge instead.

Non-Trigger Boundary

Do not use this skill for:

Prompt-to-image tasks (use codex-image-bridge instead)
Image editing, post-OCR retouching, cropping, or stitching
Scenarios that need external API transcoding but do not produce Markdown

Constraints

Image-to-text is privacy-first by default: local provider first, no image leaves the machine.
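When the model is unstable, proceed in this order: local ollama → a lighter local model → fall back to Codex vision.
Do not send raw failure logs to the user; lead with rerun suggestions and the next action.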

Required Inputs

--check-env

Checks key status fields such as ollama_reachable and codex_exists

Image input

--image

Markdown input (optional)

--markdown --compose

Output files are written under:

vision-context.md
text-model-input.md (only with --compose)
manifest.json

Script path:

/Users/Admin/.agents/skills/codex-image-bridge-local/scripts/local_image_describe.py

Workflow

1) Environment check

Run:

local_image_describe.py --check-env

If ollama_reachable is false, skip local image recognition and jump to Step 3.
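
As a rough illustration of what such an environment check could look like, the sketch below derives the two status fields named in this skill. It is not the script's actual implementation: the `check_env` function, the assumption that the Codex CLI is on PATH as `codex`, and the use of Ollama's default local endpoint `http://localhost:11434/api/tags` are all assumptions made for this example.

```python
import shutil
import urllib.error
import urllib.request


def check_env(ollama_url="http://localhost:11434/api/tags", timeout=3.0):
    """Return a dict with ollama_reachable and codex_exists (hypothetical helper)."""
    # Assumption: the Codex CLI is installed under the executable name "codex".
    codex_exists = shutil.which("codex") is not None
    try:
        # Ollama's HTTP API lists local models at /api/tags; a 200 means it is up.
        with urllib.request.urlopen(ollama_url, timeout=timeout) as resp:
            ollama_reachable = resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a malformed URL all count as unreachable.
        ollama_reachable = False
    return {"ollama_reachable": ollama_reachable, "codex_exists": codex_exists}
```

Calling `check_env()` with no arguments probes the default local Ollama port; a caller would then branch on `ollama_reachable` as described above.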

2) Local recognition first

Default route:

local_image_describe.py --provider ollama --image "/path/to/screenshot.png"

Fallback sequence when local fails:

Switch ollama to --ollama-model minicpm-v:8b (VRAM-friendly)
Switch ollama to --ollama-model llama3.2-vision:11b (prioritizes semantic density)
Use --provider codex (cloud fallback)
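
The fallback order can be expressed as data, so a wrapper never has to remember which attempt comes next. The sketch below is only an illustration: `build_commands`, `run_with_fallback`, and the `runner` callback are hypothetical names, and combining `--provider ollama` with `--ollama-model` in one invocation is an assumption based on the flags shown in this document.

```python
SCRIPT = "local_image_describe.py"

# Order taken from the fallback list; None means "use the script's default model".
FALLBACK_STEPS = [
    ("ollama", None),
    ("ollama", "minicpm-v:8b"),
    ("ollama", "llama3.2-vision:11b"),
    ("codex", None),
]


def build_commands(image_path):
    """Return one argv list per fallback step, in the order they should be tried."""
    commands = []
    for provider, model in FALLBACK_STEPS:
        argv = [SCRIPT, "--provider", provider, "--image", image_path]
        if model is not None:
            argv += ["--ollama-model", model]
        commands.append(argv)
    return commands


def run_with_fallback(image_path, runner):
    """Try each command in order; runner(argv) must return True on success.

    Returns the argv that succeeded, or None if every step failed.
    """
    for argv in build_commands(image_path):
        if runner(argv):
            return argv
    return None
```

In practice `runner` could wrap `subprocess.run` and check the exit code; it is injected here so the ordering logic can be exercised without invoking the real script.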

3) Produce results

For single-image usage:

local_image_describe.py --provider ollama --image "$HOME/Desktop/架构图.png"

For article usage:

local_image_describe.py --provider ollama --markdown "/path/to/article.md" --compose

4) Post-processing and handoff

Copy the contents of vision-context.md or text-model-input.md to the text model so it can continue reasoning.

Provider Rules

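ollama (default): local-first, focused on privacy and predictable cost
codex (fallback): use when the local pipeline has failed 3 consecutive attempts
--ollama-model: adjust only for VRAM or stability problems; there is no need to change it on every run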
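
The "3 consecutive failures" rule amounts to a small counter that resets on success. The following is a minimal sketch of that idea, assuming a hypothetical `ProviderSelector` helper that a wrapper would update after each attempt; it does not describe how the script itself tracks failures.

```python
from dataclasses import dataclass


@dataclass
class ProviderSelector:
    """Tracks consecutive local failures and picks the provider to use next."""

    max_local_failures: int = 3  # threshold stated in the provider rules
    consecutive_failures: int = 0

    def record(self, success: bool) -> None:
        # A success breaks the streak; only back-to-back failures count.
        if success:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

    def provider(self) -> str:
        if self.consecutive_failures >= self.max_local_failures:
            return "codex"
        return "ollama"
```

Resetting on success is a design choice that follows from the word "consecutive": two failures, a success, and another failure still leave the selector on `ollama`.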

Common Failure Map

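| Symptom | Action |
| --- | --- |
| `ollama_connection_failed` | Run `ollama serve` first, then retry |
| `model_not_found` | Run `ollama pull gemma4:12b`, then try `minicpm-v:8b` |
| `ollama_empty_response` | Switch to `minicpm-v:8b`; fall back to `codex` if necessary |
| `codex_timeout` | Retry with a longer timeout (for example `--timeout-seconds 360`) |
| Image missing | Fix the path first, then rerun |
| Vague output | Switch to a higher-accuracy model, or force a rerun with codex |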
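
A wrapper could keep the coded rows of this failure map as a lookup table, so that the user sees a suggested action rather than a raw log. The sketch below assumes the error codes arrive as plain strings; `FAILURE_ACTIONS` and `suggest_action` are hypothetical names used only for illustration.

```python
from typing import Optional

# Only rows with an explicit error code are mapped; "image missing" and
# "vague output" are symptoms a human or caller has to recognize separately.
FAILURE_ACTIONS = {
    "ollama_connection_failed": "Run `ollama serve`, then retry.",
    "model_not_found": "Run `ollama pull gemma4:12b`, then try `minicpm-v:8b`.",
    "ollama_empty_response": "Switch to `minicpm-v:8b`; fall back to `codex` if necessary.",
    "codex_timeout": "Retry with a longer timeout, e.g. `--timeout-seconds 360`.",
}


def suggest_action(error_code: str) -> Optional[str]:
    """Return the suggested next action for a known error code, else None."""
    return FAILURE_ACTIONS.get(error_code)
```

Returning `None` for unknown codes keeps the helper from inventing advice; the caller decides how to report an unmapped failure.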

Output Contract

When returning user-facing results, include exactly:

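The provider and model actually used
Names and paths of the output files written successfully
Failure cause and suggested next step (if any)
Whether a codex fallback is recommended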
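
One way to keep user-facing results aligned with this contract is to render them from a single structure. This is a sketch under assumptions: the `RunReport` class, its field names, and the exact wording of each line are hypothetical and not part of the skill.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RunReport:
    """Holds the four items the output contract asks for."""

    provider: str
    model: str
    output_files: List[str] = field(default_factory=list)
    failure_reason: Optional[str] = None
    next_step: Optional[str] = None
    recommend_codex_fallback: bool = False

    def render(self) -> str:
        lines = [f"Provider/model used: {self.provider} / {self.model}"]
        lines.append("Output files: " + (", ".join(self.output_files) or "none"))
        # Failure details are included only when a failure was recorded.
        if self.failure_reason:
            lines.append(f"Failure: {self.failure_reason}")
            lines.append(f"Next step: {self.next_step or 'not specified'}")
        fallback = "yes" if self.recommend_codex_fallback else "no"
        lines.append(f"Codex fallback recommended: {fallback}")
        return "\n".join(lines)
```

For example, `RunReport("ollama", "minicpm-v:8b", ["out/vision-context.md"]).render()` yields three lines: the provider and model, the output file, and the fallback recommendation.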

Red Flags

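Do not feed raw images to an external model unless the user explicitly asks for it
Do not retry with the same failing parameters after a step fails
Do not omit critical anomalies reported by --check-env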

Version History

1 version in total

v1.0.0 Initial release (current)

2026-06-08 19:31
