OATDA Vision Analysis

Analyze images using vision-capable AI models through OATDA's unified API. Triggers when the user wants to analyze, describe, or understand images; extract t...
devcsde
Uncategorized · clawhub · v1.0.6 · 1 version · 100000 · API key: required

Overview

Analyze images using vision-capable AI models through OATDA's unified API.

API Key Resolution

All commands need the OATDA API key. Resolve it inline for each exec call:

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}"

If the key is empty or null, tell the user to get one at https://oatda.com and configure it.

Security: Never print the full API key. Only verify existence or show first 8 chars.
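
The security rule above can be sketched as a small helper. The function name mask_key is hypothetical and not part of OATDA; it shows at most the first 8 characters and treats both an empty value and the literal string "null" (what jq -r prints for a missing field) as a missing key.

```shell
# Hypothetical helper (not part of OATDA): print a safe preview of an API key.
# Shows only the first 8 characters; reports "(missing)" for empty or "null".
mask_key() {
  key="$1"
  if [ -z "$key" ] || [ "$key" = "null" ]; then
    printf '(missing)\n'
    return 1
  fi
  printf '%s...\n' "$(printf '%s' "$key" | cut -c1-8)"
}

# Example with a made-up key; prints only the 8-character preview.
mask_key "sk-example-1234567890"
```

When the helper returns non-zero, point the user to https://oatda.com to obtain and configure a key.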

Model Mapping

User says        | Provider  | Model
-----------------|-----------|------------------
gpt-4o (default) | openai    | gpt-4o
gpt-4o-mini      | openai    | gpt-4o-mini
claude, sonnet   | anthropic | claude-3-5-sonnet
gemini           | google    | gemini-2.0-flash
gemini-1.5       | google    | gemini-1.5-pro

Default: openai / gpt-4o if no model specified.

> ⚠️ Models update frequently. If a model ID fails, query oatda-list-models with ?type=chat for the latest vision-capable models.
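
One way to apply the mapping is a case statement. This is a sketch: the function name resolve_model and the case-insensitive substring matching are assumptions, and the model IDs are copied from the table, so they may go stale as the warning above says.

```shell
# Hypothetical helper: map a user's phrase to "provider model".
# More specific patterns come first so "gpt-4o-mini" is not caught by "gpt-4o".
resolve_model() {
  phrase=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$phrase" in
    *gpt-4o-mini*)     echo "openai gpt-4o-mini" ;;
    *gpt-4o*)          echo "openai gpt-4o" ;;
    *claude*|*sonnet*) echo "anthropic claude-3-5-sonnet" ;;
    *gemini-1.5*)      echo "google gemini-1.5-pro" ;;
    *gemini*)          echo "google gemini-2.0-flash" ;;
    *)                 echo "openai gpt-4o" ;;  # documented default
  esac
}

resolve_model "analyze this with Sonnet"
```

The two words of the output can be split into the provider and model fields of the request body.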

Image URL Validation

  • Accept: https:// URLs or data:image/ base64 data URIs
  • Reject: http:// URLs, local file paths, internal IPs (localhost, 127.0.0.1, 169.254.x.x)
  • If user provides a local file, suggest converting to base64 first
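
The rules above could be checked before any request is sent. The sketch below uses two hypothetical helpers, validate_image_url and file_to_data_uri. It only covers the hosts listed above (no IPv6 or DNS resolution), and guessing the MIME type from the file extension is an assumption.

```shell
# Hypothetical helper: succeed for https:// URLs and data:image/ URIs,
# fail for http://, local paths, and the internal hosts listed above.
validate_image_url() {
  url="$1"
  case "$url" in
    data:image/*) return 0 ;;
    https://*) ;;
    *) return 1 ;;
  esac
  host=${url#https://}   # drop the scheme
  host=${host%%/*}       # drop the path
  host=${host%%\?*}      # drop a query string
  host=${host%%#*}       # drop a fragment
  host=${host##*@}       # drop userinfo
  host=${host%%:*}       # drop the port
  host=$(printf '%s' "$host" | tr '[:upper:]' '[:lower:]')
  case "$host" in
    ""|localhost|127.0.0.1|169.254.*) return 1 ;;
  esac
  return 0
}

# Hypothetical helper: turn a local image into a base64 data URI.
file_to_data_uri() {
  case "$1" in
    *.png)        mime="image/png" ;;
    *.jpg|*.jpeg) mime="image/jpeg" ;;
    *.gif)        mime="image/gif" ;;
    *.webp)       mime="image/webp" ;;
    *) echo "unsupported image extension: $1" >&2; return 1 ;;
  esac
  printf 'data:%s;base64,%s\n' "$mime" "$(base64 < "$1" | tr -d '\n')"
}
```

Piping base64 through tr -d '\n' avoids the line-wrapping flag, which differs between GNU and macOS versions of base64.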

API Call

CRITICAL: The endpoint is /api/v1/llm/image (NOT /api/v1/llm/generate-image — that's for image generation). The body uses a contents array, NOT a simple prompt string.

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X POST "https://oatda.com/api/v1/llm/image" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -d '{
    "provider": "<PROVIDER>",
    "model": "<MODEL>",
    "contents": [
      {"type": "text", "text": "<ANALYSIS_PROMPT>"},
      {"type": "image", "image": {"url": "<IMAGE_URL>", "detail": "auto"}}
    ]
  }'

Optional Parameters (add to body)

  • temperature: 0-2, default 0.7
  • maxTokens: Max response tokens

Image Detail Levels

  • "auto" — Let the model decide (default)
  • "low" — Faster, cheaper, less detail
  • "high" — More detail, higher cost (recommended for OCR)
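
Building the body with jq rather than string concatenation keeps quotes in the prompt from breaking the JSON. This is a sketch: build_body is a hypothetical helper, and placing temperature and maxTokens at the top level of the body is an assumption based on "add to body" above.

```shell
# Hypothetical helper: build the request body.
# Args: provider, model, prompt, image URL,
#       detail (default auto), temperature (default 0.7), maxTokens (optional).
build_body() {
  jq -n \
    --arg provider "$1" --arg model "$2" --arg text "$3" --arg url "$4" \
    --arg detail "${5:-auto}" --arg temperature "${6:-0.7}" \
    --arg maxTokens "${7:-}" \
    '{
      provider: $provider,
      model: $model,
      contents: [
        {type: "text", text: $text},
        {type: "image", image: {url: $url, detail: $detail}}
      ],
      temperature: ($temperature | tonumber)
    }
    # Only include maxTokens when a value was supplied.
    + (if $maxTokens == "" then {} else {maxTokens: ($maxTokens | tonumber)} end)'
}
```

For an OCR request, something like build_body openai gpt-4o "Extract all text" "https://example.com/receipt.png" high 0.2 500 would produce a body that can be piped to curl with -d @-.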

Response Format

{
  "success": true,
  "provider": "openai",
  "model": "gpt-4o",
  "response": "The image shows a sunset over...",
  "usage": {
    "promptTokens": 800,
    "completionTokens": 200,
    "totalTokens": 1000
  },
  "costs": {
    "inputCost": 0.004,
    "outputCost": 0.006,
    "totalCost": 0.01,
    "currency": "USD"
  }
}

Present the response field to the user. Optionally mention token usage and cost.
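
A minimal sketch of extracting those fields with jq follows. The helper name summarize_response is hypothetical, and treating any response whose success field is not true as a failure is an assumption, since the error body format is not documented here.

```shell
# Hypothetical helper: read a response on stdin, print the answer,
# then one line with total tokens and cost.
summarize_response() {
  jq -r '
    if .success == true then
      .response,
      "(\(.usage.totalTokens) tokens, \(.costs.totalCost) \(.costs.currency))"
    else
      error("request did not succeed")
    end'
}
```

Pipe the curl output into summarize_response; jq exits non-zero when the success check fails, so the caller can branch on the exit status.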

Error Handling

HTTP Status | Meaning         | Action
------------|-----------------|------------------------------------------------------
401         | Invalid API key | Tell user to check their key
400         | Bad request     | Check image URL is valid HTTPS, model supports vision
429         | Rate limited    | Wait 5 seconds and retry once
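
The table above could be wired into a wrapper along these lines. This is a sketch under assumptions: oatda_post and analyze_with_retry are hypothetical names, any 2xx status is treated as success, and RETRY_DELAY is an illustrative override for the 5-second wait.

```shell
OATDA_URL="https://oatda.com/api/v1/llm/image"

# Send one request. Prints the response body, then the HTTP status on the last line.
oatda_post() {
  curl -s -w '\n%{http_code}' -X POST "$OATDA_URL" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $OATDA_API_KEY" \
    -d "$1"
}

# Call oatda_post; on 429, wait and retry exactly once, then map the status to an action.
analyze_with_retry() {
  out=$(oatda_post "$1")
  http_status=$(printf '%s\n' "$out" | tail -n 1)
  if [ "$http_status" = "429" ]; then
    sleep "${RETRY_DELAY:-5}"
    out=$(oatda_post "$1")
    http_status=$(printf '%s\n' "$out" | tail -n 1)
  fi
  case "$http_status" in
    2[0-9][0-9]) printf '%s\n' "$out" | sed '$d' ;;  # body without the status line
    401) echo "Invalid API key: check your key." >&2; return 1 ;;
    400) echo "Bad request: check the image URL is valid HTTPS and the model supports vision." >&2; return 1 ;;
    429) echo "Still rate limited after one retry." >&2; return 1 ;;
    *)   echo "Unexpected HTTP status: $http_status" >&2; return 1 ;;
  esac
}
```

Keeping the HTTP call in its own function means the retry logic can be exercised with a stub in place of the network call.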

Example

User: "Describe this image: https://example.com/photo.jpg"

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X POST "https://oatda.com/api/v1/llm/image" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -d '{
    "provider": "openai",
    "model": "gpt-4o",
    "contents": [
      {"type": "text", "text": "Describe this image in detail"},
      {"type": "image", "image": {"url": "https://example.com/photo.jpg", "detail": "auto"}}
    ]
  }'

Notes

  • Endpoint is /api/v1/llm/image — NOT /api/v1/llm/generate-image (that's for generation)
  • Body uses contents array format, NOT a simple prompt string
  • Only HTTPS image URLs or data:image/ base64 data URIs are accepted; no HTTP, no local paths
  • Image tokens are included in prompt token count and affect cost
  • For OCR tasks, use "detail": "high"
  • Use oatda-generate-image for creating images
  • Use oatda-list-models for available vision models

Version History

1 version in total

  • v1.0.6 (current)
    2026-05-07 04:29 · Safe
