Overview

webchat-image-support

Universal image understanding enhancement for OpenClaw. This skill enables image understanding across all channels (WebChat, Discord, Slack, etc.) and works with any model that supports image input.

What It Does

When users send images through any channel, this skill ensures the agent can understand and analyze them:

Automatic Detection: Detects when an inbound message contains images
Universal Support: Works with Claude, MiniMax, OpenAI, Gemini, or any vision-enabled model
Fallback Processing: If the model doesn't support images, falls back to OpenClaw's built-in media understanding pipeline
Multi-Image Support: Handles multiple images in a single message
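
The detection and multi-image steps above can be pictured with a small sketch. This is only an illustration, not OpenClaw's actual implementation: the message shape (an `attachments` list whose items carry a `mime_type` field) and the function name `image_attachments` are hypothetical.

```python
def image_attachments(message: dict) -> list[dict]:
    """Return every attachment whose MIME type starts with "image/".

    The "attachments" and "mime_type" keys are assumed for illustration;
    the real inbound message format may differ.
    """
    return [
        att
        for att in message.get("attachments", [])
        if str(att.get("mime_type", "")).startswith("image/")
    ]


# A message carrying two images and one non-image file.
example = {
    "text": "What is wrong here?",
    "attachments": [
        {"mime_type": "image/png", "name": "error.png"},
        {"mime_type": "application/pdf", "name": "log.pdf"},
        {"mime_type": "image/jpeg", "name": "photo.jpg"},
    ],
}
images = image_attachments(example)
print(len(images))  # number of image attachments found
```

Returning a list rather than a single item keeps the multi-image case and the single-image case on the same code path.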

Requirements

Gateway with image support (OpenClaw 2026.3.29+)
At least one vision-capable model configured in models.json:

Claude (with vision)
MiniMax-VL-01
Gemini Pro Vision
GPT-4 Vision

Usage

No explicit commands needed. Just send images:

User: [sends a screenshot of error]
Agent: "I can see the error message: Unable to load script..."

User: [sends a photo]
Agent: "This image shows a cartoon pig's head..."

Configuration

Model Selection

For best results, use a vision-capable model. In ~/.openclaw/agents/main/agent/models.json:

{
  "providers": {
    "minimax": {
      "models": [
        {
          "id": "MiniMax-VL-01",
          "input": ["text", "image"]
        }
      ]
    }
  }
}
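
To confirm which configured models accept images, the `input` arrays in models.json can be scanned. The following is a minimal sketch, assuming the file uses the providers / models / input layout shown above. The helper name `vision_models` and the "provider/model-id" output format are hypothetical and not part of OpenClaw.

```python
import json
from pathlib import Path


def vision_models(config: dict) -> list[str]:
    """Return "provider/model-id" for every model whose input list includes "image"."""
    found = []
    for provider, spec in config.get("providers", {}).items():
        for model in spec.get("models", []):
            if "image" in model.get("input", []):
                found.append(f"{provider}/{model['id']}")
    return found


if __name__ == "__main__":
    # Path taken from the section above; adjust it if your agent directory differs.
    path = Path.home() / ".openclaw" / "agents" / "main" / "agent" / "models.json"
    if path.exists():
        print(vision_models(json.loads(path.read_text(encoding="utf-8"))))
    else:
        print(f"{path} not found")
```

An empty result suggests that no configured model declares image input, in which case images would go through the fallback pipeline.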

Default Behavior

Model Support	Behavior
---------------	----------
Model supports images	Images are passed directly to the model
Model does not support images	Images go through the media understanding pipeline
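
The table above amounts to a two-way decision. A minimal sketch of that decision follows, assuming the model's capabilities are available as the `input` list from models.json. The function name `choose_route` and the route labels are hypothetical, chosen only for illustration.

```python
def choose_route(model_inputs: list[str], has_images: bool) -> str:
    """Pick how an inbound message is handled, following the table above."""
    if not has_images:
        return "text"  # plain-text message: no image handling needed
    if "image" in model_inputs:
        return "direct"  # model accepts images: pass them straight through
    return "media_pipeline"  # otherwise fall back to media understanding


print(choose_route(["text", "image"], has_images=True))
print(choose_route(["text"], has_images=True))
```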

Troubleshooting

Q: Agent doesn't see images

A: Make sure your model supports image input (check that the input field for the model in models.json includes "image")

Q: Images sent but no response

A: Check gateway logs for media processing errors

Q: Works in CLI but not WebChat

A: This skill requires OpenClaw 2026.3.29+ with the MediaPath fix
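
When checking the version requirement, note that dotted versions must be compared numerically rather than as strings, since "2026.10.1" sorts before "2026.3.29" in a plain string comparison. The sketch below assumes version strings consist only of dot-separated integers; the helper name `meets_minimum` is hypothetical, and how to obtain the installed version is left out because it is not covered here.

```python
def meets_minimum(version: str, minimum: str = "2026.3.29") -> bool:
    """Compare dotted versions part by part as integers."""

    def parse(v: str) -> tuple[int, ...]:
        # Assumes every part is a plain integer, e.g. "2026.3.29".
        return tuple(int(part) for part in v.split("."))

    return parse(version) >= parse(minimum)


print(meets_minimum("2026.3.29"))
print(meets_minimum("2026.3.9"))
```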

Version History

1 version in total

v1.0.0 (current)

2026-05-03 08:04
