
Ollama OCR

Uses Ollama's vision/OCR models to recognize text in images. Supports the glm-ocr, llava, moondream, and llama3.2-vision models. Suited to scenarios that require local, offline processing.
hongjiahao371-pixel
Uncategorized clawhub v1.0.0 1 version 100000 Key: not required
Stars: 0, Downloads: 638, Installs: 17, Versions: 1, Tag: #latest

Overview

Ollama OCR Skill

Use this skill to recognize text in images with Ollama's local vision/OCR models. Once a model has been downloaded, OCR runs fully offline and needs no internet connection.

When to Use

  • User sends an image and wants text extraction
  • User asks to recognize text from a screenshot or picture
  • Need local offline OCR without cloud API dependency
  • Processing sensitive images that shouldn't be sent to third parties

Models Available

Model                     Best For                       Size
--------------------------------------------------------------
glm-ocr:latest            Chinese text OCR               ~2.2GB
llava:7b                  General image understanding    ~4.7GB
moondream                 Lightweight vision model       ~1.5GB
llama3.2-vision:latest    Large vision model             ~7GB+

Ollama Endpoint

Default: http://172.17.0.2:11434 (Docker container to host gateway)

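Note: The endpoint is preconfigured for OpenClaw running in a Docker container and reaching Ollama on the host. If your setup differs, change OLLAMA_HOST in ollama_ocr.py. On Docker's default bridge network the host gateway is usually 172.17.0.1, so confirm which address actually reaches your Ollama instance.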
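If the endpoint has to vary between environments, one option is to resolve it at runtime rather than editing the script. The sketch below is an assumption, not the skill's actual code: `resolve_endpoint` and `is_reachable` are hypothetical helpers, and reading an `OLLAMA_HOST` environment variable is an illustrative choice (the skill defines OLLAMA_HOST inside ollama_ocr.py). The probe uses Ollama's `/api/tags` route, which lists locally available models.

```python
import os
import urllib.error
import urllib.request

# Default endpoint as stated in this skill's documentation.
DEFAULT_ENDPOINT = "http://172.17.0.2:11434"


def resolve_endpoint(env=None):
    """Return the Ollama base URL, preferring an OLLAMA_HOST override (assumed convention)."""
    env = os.environ if env is None else env
    value = (env.get("OLLAMA_HOST") or DEFAULT_ENDPOINT).strip()
    # Accept bare host:port values by adding a scheme.
    if not value.startswith(("http://", "https://")):
        value = "http://" + value
    return value.rstrip("/")


def is_reachable(base_url, timeout=3.0):
    """Return True if GET /api/tags answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(base_url + "/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, or malformed URL.
        return False
```

Calling `is_reachable(resolve_endpoint())` before running OCR gives a quick way to tell a wrong address apart from a slow model.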

Usage

Command Line

python3 ollama_ocr.py /path/to/image.jpg [model_name]

Examples:

python3 ollama_ocr.py receipt.png glm-ocr:latest
python3 ollama_ocr.py screenshot.jpg llava:7b
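The command form above takes a required image path and an optional model name. A minimal sketch of how such arguments could be parsed is shown here; `parse_args` and `DEFAULT_MODEL` are hypothetical names, and the real script may handle its arguments differently. The default value assumes glm-ocr, which this documentation names as the default model.

```python
import argparse

# Assumed default; the documentation names glm-ocr as the default model.
DEFAULT_MODEL = "glm-ocr:latest"


def parse_args(argv=None):
    """Parse `image [model]` from argv (or sys.argv when argv is None)."""
    parser = argparse.ArgumentParser(
        description="Run OCR on an image with a local Ollama model."
    )
    parser.add_argument("image", help="path to the image file")
    parser.add_argument(
        "model", nargs="?", default=DEFAULT_MODEL, help="Ollama model name"
    )
    return parser.parse_args(argv)
```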

Python API

from ollama_ocr import ollama_ocr

# Basic OCR with default model (glm-ocr)
result = ollama_ocr('/path/to/image.jpg')

# Specify model
result = ollama_ocr('/path/to/image.jpg', 'glm-ocr:latest')

print(result)
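For readers curious what a call like this might do under the hood, the sketch below posts a base64-encoded image to Ollama's `/api/generate` route with streaming disabled and reads the `response` field of the reply. It is an illustration under stated assumptions, not the skill's implementation: `build_payload`, `ocr_sketch`, the prompt text, and the 120-second timeout are all hypothetical.

```python
import base64
import json
import urllib.request

OLLAMA_HOST = "http://172.17.0.2:11434"  # default endpoint from this documentation


def build_payload(image_bytes, model="glm-ocr:latest",
                  prompt="Extract all text from this image."):
    """Build the JSON body for POST /api/generate with one base64-encoded image."""
    return {
        "model": model,
        "prompt": prompt,  # hypothetical prompt; the skill's real prompt is not documented
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # request one JSON object instead of a stream of chunks
    }


def ocr_sketch(image_path, model="glm-ocr:latest", timeout=120):
    """Send the image to Ollama and return the generated text."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), model)
    request = urllib.request.Request(
        OLLAMA_HOST + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8")).get("response", "")
```

Keeping payload construction separate from the network call makes the request body easy to inspect without a running Ollama server.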

Example Prompts to Activate This Skill

  • "识别这张图片里的文字" ("Recognize the text in this image")
  • "帮我 OCR 一下这个截图" ("Run OCR on this screenshot for me")
  • "Extract text from this image"
  • "What text is in this screenshot?"

Notes

  • Image path must be absolute or relative to script location
  • For large images, consider resizing first to avoid timeout
  • glm-ocr works best for Chinese text
  • Some models have output quirks (e.g., glm-ocr occasionally repeats text in its output)
  • The first call may be slow if the model is not yet loaded into memory
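One way to soften the repeated-output quirk noted above is to post-process the result. The sketch below uses a hypothetical helper, `collapse_repeats`, that is not part of the skill; it drops any non-empty line identical to the line immediately before it.

```python
def collapse_repeats(text):
    """Drop non-empty lines that exactly repeat the line immediately before them."""
    kept = []
    previous = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and stripped == previous:
            continue  # consecutive duplicate: skip it
        kept.append(line)
        previous = stripped
    return "\n".join(kept)
```

This is a blunt heuristic: images that legitimately contain identical consecutive lines (repeated table rows, for example) would lose them, so it is best applied only when repetition is actually observed.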

Requirements

  • Ollama installed and running
  • At least one vision/OCR model downloaded (e.g., ollama pull glm-ocr:latest)
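Both requirements can be checked programmatically. The sketch below assumes Ollama's `/api/tags` route, which returns a JSON object whose `models` list carries each installed model's `name`; the helper names (`installed_models`, `has_model`, `fetch_tags`) are hypothetical and not part of the skill. A name given without a tag is treated as `:latest`, matching how Ollama resolves bare model names.

```python
import json
import urllib.request


def installed_models(tags_payload):
    """Extract model names from a parsed /api/tags response."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def has_model(tags_payload, wanted):
    """Return True if `wanted` is installed; a bare name is treated as `name:latest`."""
    if ":" not in wanted:
        wanted += ":latest"
    return wanted in installed_models(tags_payload)


def fetch_tags(base_url="http://172.17.0.2:11434", timeout=5):
    """GET /api/tags from the Ollama server and return the parsed JSON."""
    with urllib.request.urlopen(base_url + "/api/tags", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `has_model(fetch_tags(), "glm-ocr")` raises an exception if Ollama is not running and returns False if the model still needs to be pulled.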

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-30 13:22 Safe Safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risks found

Tencent Cloud Security (Sanbu)

Safe, no risks found
