Analyze images using Kimi K2.5 multimodal vision capabilities through Ollama Cloud API.
python3 ~/.openclaw/workspace/skills/vision-analyzer/scripts/vision_analyze.py <image_path> [prompt]
Describe an image:
python3 ~/.openclaw/workspace/skills/vision-analyzer/scripts/vision_analyze.py photo.jpg
Ask specific question:
python3 ~/.openclaw/workspace/skills/vision-analyzer/scripts/vision_analyze.py screenshot.png "What UI elements do you see?"
/mnt/chromeos/MyFiles/Downloads//mnt/chromeos/MyFiles/Downloads/~/Set your Ollama API key as environment variable:
export OLLAMA_API_KEY="your-api-key-here"
Get your API key from ollama.com/settings
The skill uses Ollama Cloud API with Kimi K2.5 model.
API key is read from OLLAMA_API_KEY environment variable.
Returns a natural language description of the image content.
共 1 个版本