Overview

Vision Analysis

Analyze images with the MiniMax_understand_image MCP tool, which is available in the MiniMax Token Plan.

Prerequisites

MiniMax Token Plan subscription with a valid MINIMAX_API_KEY
MiniMax MCP configured (MiniMax_understand_image tool available)
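
Part of these prerequisites can be checked from the local environment before the tool is called. The following is a minimal sketch, not part of the skill itself: `check_prerequisites` is a hypothetical helper, and looking for `uvx` on PATH is an assumption based on the uvx-based setup commands in the next section. It cannot confirm that the key belongs to a Token Plan subscription or that the MCP server is registered with the agent.

```python
import os
import shutil


def check_prerequisites(env=None):
    """Return a list of problems; an empty list means the basics look ready.

    Hypothetical helper: it only inspects environment variables and PATH.
    """
    env = os.environ if env is None else env
    problems = []
    # The key must be present and non-blank.
    if not env.get("MINIMAX_API_KEY", "").strip():
        problems.append("MINIMAX_API_KEY is not set")
    # Assumption: the MCP server is launched through uvx, as in the setup examples.
    if shutil.which("uvx") is None:
        problems.append("uvx was not found on PATH")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current process environment; passing a dict makes the check easy to exercise in isolation.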

If the MCP tool is not configured

Step 1: The agent should fetch setup instructions from:

https://platform.minimaxi.com/docs/token-plan/mcp-guide

Step 2: Detect the user's environment (OpenCode, Cursor, Claude Code, etc.) and output the exact commands needed. Common examples:

OpenCode (add to ~/.config/opencode/opencode.json, or to an opencode.json in the project root):

{
  "mcp": {
    "MiniMax": {
      "type": "local",
      "command": ["uvx", "minimax-coding-plan-mcp", "-y"],
      "environment": {
        "MINIMAX_API_KEY": "YOUR_TOKEN_PLAN_KEY",
        "MINIMAX_API_HOST": "https://api.minimaxi.com"
      },
      "enabled": true
    }
  }
}
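
If the OpenCode config file already holds other settings, the MiniMax entry has to be merged in rather than pasted over the file. A minimal sketch of that merge, assuming the file is plain JSON (a config containing comments would need a JSONC-aware parser instead); `add_minimax_server` is a hypothetical helper name, and the entry it writes mirrors the OpenCode example above.

```python
import json
from pathlib import Path


def add_minimax_server(config_path, api_key, api_host="https://api.minimaxi.com"):
    """Merge the MiniMax MCP entry into an OpenCode-style JSON config file."""
    path = Path(config_path).expanduser()
    # Start from the existing config so unrelated settings are preserved.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config.setdefault("mcp", {})["MiniMax"] = {
        "type": "local",
        "command": ["uvx", "minimax-coding-plan-mcp", "-y"],
        "environment": {
            "MINIMAX_API_KEY": api_key,
            "MINIMAX_API_HOST": api_host,
        },
        "enabled": True,
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Because the file ends up containing the API key, it is worth keeping it out of version control.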

Claude Code:

claude mcp add -s user MiniMax --env MINIMAX_API_KEY=your-key --env MINIMAX_API_HOST=https://api.minimaxi.com -- uvx minimax-coding-plan-mcp -y

Cursor — add to MCP settings:

{
  "mcpServers": {
    "MiniMax": {
      "command": "uvx",
      "args": ["minimax-coding-plan-mcp"],
      "env": {
        "MINIMAX_API_KEY": "your-key",
        "MINIMAX_API_HOST": "https://api.minimaxi.com"
      }
    }
  }
}

Step 3: After configuration, tell the user to restart their app and verify with /mcp.

Important: If the user does not have a MiniMax Token Plan subscription, tell them that the MiniMax_understand_image tool requires one. It cannot be used with free-tier or other API keys.

Analysis Modes

| Mode | When to use | Prompt strategy |
| --- | --- | --- |
| `describe` | General image understanding | Ask for detailed description |
| `ocr` | Text extraction from screenshots, documents | Ask to extract all text verbatim |
| `ui-review` | UI mockups, wireframes, design files | Ask for design critique with suggestions |
| `chart-data` | Charts, graphs, data visualizations | Ask to extract data points and trends |
| `object-detect` | Identify objects, people, activities | Ask to list and locate all elements |
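
The table does not say how a mode is chosen when the user does not name one. One possible approach is a keyword lookup on the request text, sketched below. Everything here is an assumption for illustration: the keyword sets, the order in which modes are tried, the fallback to `describe`, and the `pick_mode` name are not defined by the skill.

```python
import re

# Hypothetical keyword sets, loosely derived from the "When to use" column.
MODE_KEYWORDS = (
    ("ocr", {"ocr", "text", "transcribe"}),
    ("ui-review", {"ui", "ux", "mockup", "wireframe", "design"}),
    ("chart-data", {"chart", "graph", "plot", "trend"}),
    ("object-detect", {"object", "objects", "detect", "identify", "people"}),
)


def pick_mode(request):
    """Return the first mode whose keywords appear as whole words in the request."""
    # Whole-word matching avoids false hits such as "ui" inside "build".
    words = set(re.findall(r"[a-z]+", request.lower()))
    for mode, keywords in MODE_KEYWORDS:
        if words & keywords:
            return mode
    # Assumption: fall back to general description when nothing matches.
    return "describe"
```

A lookup like this is only a first guess; an explicit mode named by the user should take precedence.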

Workflow

Step 1: Auto-detect image

The skill triggers automatically when a message contains an image file path or URL with extensions:

.jpg, .jpeg, .png, .gif, .webp, .bmp, .svg

Extract the image path from the message.
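
The extraction step can be approximated with a regular expression over the listed extensions. This is a minimal sketch rather than the skill's actual detection logic: `find_image_refs` is a hypothetical name, and the pattern deliberately ignores paths containing spaces and drops any URL query string that follows the extension.

```python
import re

# Extensions taken from the list above; matching is case-insensitive.
_IMAGE_REF = re.compile(
    r"[^\s\"'<>]+\.(?:jpe?g|png|gif|webp|bmp|svg)\b",
    re.IGNORECASE,
)


def find_image_refs(message):
    """Return image paths or URLs found in the message, in order of appearance."""
    return [match.group(0) for match in _IMAGE_REF.finditer(message)]
```

For a message such as "Please review ./mocks/home.PNG", the function returns a one-element list holding `./mocks/home.PNG`; a message without a matching extension yields an empty list.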

Step 2: Select analysis mode and call MCP tool

Use the MiniMax_understand_image tool with a mode-specific prompt:

describe:

Provide a detailed description of this image. Include: main subject, setting/background,
colors/style, any text visible, notable objects, and overall composition.

ocr:

Extract all text visible in this image verbatim. Preserve structure and formatting
(headers, lists, columns). If no text is found, say so.

ui-review:

You are a UI/UX design reviewer. Analyze this interface mockup or design. Provide:
(1) Strengths — what works well, (2) Issues — usability or design problems,
(3) Specific, actionable suggestions for improvement. Be constructive and detailed.

chart-data:

Extract all data from this chart or graph. List: chart title, axis labels, all
data points/series with values if readable, and a brief summary of the trend.

object-detect:

List all distinct objects, people, and activities you can identify. For each,
describe what it is and its approximate location in the image.
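
The five prompts above can be kept in a single lookup so that the tool call only needs a mode name. A minimal sketch follows: the prompt text is copied from this section, while `MODE_PROMPTS`, `prompt_for`, and the choice of `describe` as the default are assumptions for illustration. The tool's own parameter names are left out because this document does not specify them.

```python
MODE_PROMPTS = {
    "describe": (
        "Provide a detailed description of this image. Include: main subject, "
        "setting/background, colors/style, any text visible, notable objects, "
        "and overall composition."
    ),
    "ocr": (
        "Extract all text visible in this image verbatim. Preserve structure and "
        "formatting (headers, lists, columns). If no text is found, say so."
    ),
    "ui-review": (
        "You are a UI/UX design reviewer. Analyze this interface mockup or design. "
        "Provide: (1) Strengths — what works well, (2) Issues — usability or design "
        "problems, (3) Specific, actionable suggestions for improvement. "
        "Be constructive and detailed."
    ),
    "chart-data": (
        "Extract all data from this chart or graph. List: chart title, axis labels, "
        "all data points/series with values if readable, and a brief summary of "
        "the trend."
    ),
    "object-detect": (
        "List all distinct objects, people, and activities you can identify. For "
        "each, describe what it is and its approximate location in the image."
    ),
}


def prompt_for(mode="describe"):
    """Return the prompt for a mode; the 'describe' default is an assumption."""
    try:
        return MODE_PROMPTS[mode]
    except KeyError:
        valid = ", ".join(sorted(MODE_PROMPTS))
        raise ValueError(f"unknown mode {mode!r}; expected one of: {valid}") from None
```

Raising on an unknown mode, rather than silently falling back, makes a mistyped mode name visible before the tool is called.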

Step 3: Present results

Return the analysis clearly. For describe, use readable prose. For ocr, preserve structure. For ui-review, use a structured critique format.

Output Format Example

For describe mode:

## Image Description

[Detailed description of the image contents...]

For ocr mode:

## Extracted Text

[Preserved text structure from the image]

For ui-review mode:

## UI Design Review

### Strengths
- ...

### Issues
- ...

### Suggestions
- ...
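
The three formats above differ only in their top-level heading, so presenting a result can be reduced to a small lookup. A minimal sketch, assuming the tool's reply arrives as a single string: `format_result` is a hypothetical name, and passing `chart-data` and `object-detect` output through unchanged is an assumption, since no heading is given for those modes.

```python
# Headings taken from the output format examples above.
RESULT_HEADINGS = {
    "describe": "## Image Description",
    "ocr": "## Extracted Text",
    "ui-review": "## UI Design Review",
}


def format_result(mode, analysis):
    """Prefix the analysis with the heading for its mode, if one is documented."""
    body = analysis.strip("\n")
    heading = RESULT_HEADINGS.get(mode)
    if heading is None:
        # No documented heading for this mode: return the text as-is.
        return body
    return f"{heading}\n\n{body}"
```

Only leading and trailing newlines are stripped, so indentation inside OCR output is preserved.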

Notes

Images up to 20MB are supported (JPEG, PNG, GIF, WebP)
Local file paths work if MiniMax MCP is configured with file access
The MiniMax_understand_image tool is provided by the minimax-coding-plan-mcp package
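
For local files, the size and format limits can be checked before the tool is called. The sketch below is illustrative only: `check_local_image` is a hypothetical helper, 20MB is interpreted as 20 × 1024 × 1024 bytes, and the format is judged from the file extension alone. Step 1 of the workflow also triggers on .bmp and .svg, which the list above does not mention, so this check reports them instead of guessing whether they work.

```python
from pathlib import Path

MAX_IMAGE_BYTES = 20 * 1024 * 1024  # assumption: "20MB" read as 20 MiB
SUPPORTED_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def check_local_image(path):
    """Return a list of problems with a local image file; empty means it looks fine."""
    file_path = Path(path).expanduser()
    if not file_path.is_file():
        return [f"{file_path} is not a readable file"]
    problems = []
    suffix = file_path.suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        problems.append(f"{suffix or 'no extension'} is not in the supported format list")
    if file_path.stat().st_size > MAX_IMAGE_BYTES:
        problems.append("file is larger than 20MB")
    return problems
```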

Version history

1 version in total

v1.0.0 (current)

2026-05-03 04:25
