
Image Reader

Extract text from images using OCR (Optical Character Recognition). Use this skill when you need to read text content from images, screenshots, photos, or other images.
Author: rendaixue-byte
Uncategorized · clawhub · v1.0.0 · 1 version · 99823.6 · Key: not required
★ 0 stars · 📥 566 downloads · 💾 0 installs · #latest

Overview

Image Reader - OCR Text Extraction

A high-performance OCR skill for extracting text from images. Powered by RapidOCR with PP-OCRv4 models, supporting Chinese and English text recognition.

Features

  • Multi-language: Chinese (simplified/traditional), English, and mixed text
  • High accuracy: PP-OCRv4 model with >95% accuracy on typical screenshots
  • Structured output: Text with confidence scores and bounding boxes
  • Image info: Dimensions, format, and color mode included
  • Fast: CPU-only, no GPU required

Quick Start

python scripts/read_image.py /path/to/image.jpg

Usage Examples

Extract text from a screenshot

python scripts/read_image.py screenshot.png
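
To call the skill from another Python program, the command above can be wrapped in a small helper. This is a minimal sketch, assuming the script prints one JSON document to stdout and exits with a non-zero code on failure; the helper name `run_image_reader` is hypothetical and not part of the skill.

```python
import json
import subprocess
import sys


def run_image_reader(image_path, script="scripts/read_image.py"):
    """Run the OCR script on one image and return its parsed JSON output.

    Assumption: the script writes a single JSON document to stdout.
    """
    completed = subprocess.run(
        [sys.executable, script, image_path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)
```

A call such as `run_image_reader("screenshot.png")` would then return a dictionary, provided the script behaves as assumed.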

JSON Output

The script outputs structured JSON:

{
  "success": true,
  "text": "Full extracted text",
  "lines": [
    {
      "text": "Individual line",
      "confidence": 0.98,
      "box": [[x1,y1], [x2,y2], [x3,y3], [x4,y4]]
    }
  ],
  "line_count": 5,
  "image_info": {
    "format": "PNG",
    "size": [1920, 1080],
    "mode": "RGB"
  }
}
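
The per-line confidence scores and boxes make it easy to post-process a result. The sketch below keeps only lines above a confidence threshold and converts a four-point box into a simple rectangle. The sample data, the 0.9 threshold, and both function names are illustrative assumptions; only the field names come from the JSON shown above.

```python
def confident_lines(result, min_confidence=0.9):
    """Return the text of lines whose confidence is at least min_confidence."""
    return [
        line["text"]
        for line in result.get("lines", [])
        if line.get("confidence", 0.0) >= min_confidence
    ]


def box_to_rect(box):
    """Convert a four-point box into (left, top, right, bottom)."""
    xs = [point[0] for point in box]
    ys = [point[1] for point in box]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical sample shaped like the JSON above (values are made up).
sample = {
    "success": True,
    "lines": [
        {"text": "Settings", "confidence": 0.98,
         "box": [[10, 10], [110, 12], [110, 40], [10, 38]]},
        {"text": "l0g out", "confidence": 0.61,
         "box": [[10, 60], [90, 60], [90, 85], [10, 85]]},
    ],
}

print(confident_lines(sample))  # only "Settings" passes the 0.9 threshold
print(box_to_rect(sample["lines"][0]["box"]))
```

Taking the minimum and maximum of the corner coordinates gives an axis-aligned rectangle even when the detected box is slightly rotated.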

Requirements

pip install rapidocr onnxruntime pillow

First run will download OCR models (~50MB) automatically.
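
Before running the script, it can help to confirm that the three packages are importable. This is a rough sketch, assuming the import names `rapidocr`, `onnxruntime`, and `PIL` (Pillow's import name); the function `missing_modules` is hypothetical and not part of the skill.

```python
import importlib.util

# Assumed import names for the three pip packages; Pillow is imported as "PIL".
REQUIRED_MODULES = ("rapidocr", "onnxruntime", "PIL")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All OCR dependencies are importable.")
```

The check only looks the modules up and does not import them, so it does not trigger the model download.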

Common Use Cases

  • UI Screenshots: Extract text from app/website screenshots
  • Document Photos: Read text from photographed documents
  • Diagrams: Extract labels and annotations
  • Receipts: Parse receipt/invoice data

Output Fields

Field        Type     Description
-----------  -------  -----------------------------------
success      bool     Whether OCR succeeded
text         string   All extracted text
lines        array    Individual text lines with metadata
line_count   int      Number of text lines detected
image_info   object   Image metadata
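
The field list above can be turned into a lightweight check on a parsed result. The sketch below maps each documented field to the Python type that `json.loads` would produce for it. The function name `schema_problems` is hypothetical, and the listing does not say which fields are present when OCR fails, so a failed run may legitimately report missing fields here.

```python
# Documented fields mapped to the Python types json.loads produces for them.
EXPECTED_FIELDS = {
    "success": bool,
    "text": str,
    "lines": list,
    "line_count": int,
    "image_info": dict,
}


def schema_problems(result):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in result:
            problems.append(f"missing field: {field}")
            continue
        value = result[field]
        type_ok = isinstance(value, expected_type)
        if expected_type is int and isinstance(value, bool):
            type_ok = False  # bool is a subclass of int in Python
        if not type_ok:
            problems.append(f"{field} should be {expected_type.__name__}")
    return problems
```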

Technical Details

  • Engine: RapidOCR (ONNX Runtime backend)
  • Models: PP-OCRv4 (detection + recognition)
  • Languages: Chinese, English (auto-detected)
  • Performance: ~1-2 seconds per image on CPU

License

MIT License

Third-party dependencies:

  • RapidOCR - Apache 2.0 License
  • ONNX Runtime - MIT License

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-03 07:58 · Safe · Safe

Security Scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
