A high-performance OCR skill for extracting text from images. Powered by RapidOCR with PP-OCRv4 models, supporting Chinese and English text recognition.
python scripts/read_image.py /path/to/image.jpg
python scripts/read_image.py screenshot.png
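The same commands can be driven from Python. The following is a minimal sketch, assuming the script prints its JSON result to standard output and exits with a non-zero status on failure; neither detail is confirmed here. The helper names `build_command` and `run_ocr` are hypothetical and not part of the skill.

```python
import json
import subprocess
import sys
from pathlib import Path

# Script path taken from the usage commands; adjust if the skill lives elsewhere.
SCRIPT = Path("scripts/read_image.py")


def build_command(image_path: str) -> list[str]:
    """Build the argv list equivalent to `python scripts/read_image.py <image>`."""
    return [sys.executable, str(SCRIPT), image_path]


def run_ocr(image_path: str) -> dict:
    """Run the OCR script and parse its output.

    Assumption: the JSON result is written to stdout. If the script writes
    it elsewhere, this function needs to change accordingly.
    """
    completed = subprocess.run(
        build_command(image_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(completed.stdout)
```

Calling `run_ocr("screenshot.png")` would then return the parsed result as a dictionary, provided the assumptions above hold.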
The script outputs structured JSON:
{
  "success": true,
  "text": "Full extracted text",
  "lines": [
    {
      "text": "Individual line",
      "confidence": 0.98,
      "box": [[x1,y1], [x2,y2], [x3,y3], [x4,y4]]
    }
  ],
  "line_count": 5,
  "image_info": {
    "format": "PNG",
    "size": [1920, 1080],
    "mode": "RGB"
  }
}
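A result in this shape can be post-processed with the standard library alone. The sketch below filters lines by confidence and reduces a four-point `box` to an axis-aligned rectangle. The sample payload, the 0.9 threshold, and the helper names `confident_lines` and `bounding_rect` are illustrative assumptions, not part of the skill.

```python
import json


def confident_lines(result: dict, min_confidence: float = 0.9) -> list[dict]:
    """Keep only the lines whose confidence meets the threshold."""
    return [
        line
        for line in result.get("lines", [])
        if line.get("confidence", 0.0) >= min_confidence
    ]


def bounding_rect(box: list) -> tuple:
    """Reduce a four-point box to (min_x, min_y, max_x, max_y)."""
    xs = [point[0] for point in box]
    ys = [point[1] for point in box]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up payload that follows the documented structure.
raw = json.dumps({
    "success": True,
    "text": "Invoice 2024\nT0tal",
    "lines": [
        {"text": "Invoice 2024", "confidence": 0.98,
         "box": [[10, 12], [210, 12], [210, 40], [10, 40]]},
        {"text": "T0tal", "confidence": 0.61,
         "box": [[10, 60], [90, 62], [89, 88], [9, 86]]},
    ],
    "line_count": 2,
    "image_info": {"format": "PNG", "size": [1920, 1080], "mode": "RGB"},
})

result = json.loads(raw)
kept = confident_lines(result)
print([line["text"] for line in kept])
print(bounding_rect(result["lines"][1]["box"]))
```

A threshold is a trade-off: a higher value drops more misread lines but also discards faint or stylized text that was recognized correctly.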
pip install rapidocr onnxruntime pillow
The first run downloads the OCR models (~50 MB) automatically.
| Field | Type | Description |
|---|---|---|
| success | bool | Whether OCR succeeded |
| text | string | All extracted text |
| lines | array | Individual text lines with metadata |
| line_count | int | Number of text lines detected |
| image_info | object | Image metadata |
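The field table can double as a lightweight schema check before downstream code relies on a result. This is a sketch only: the mapping from the table's type names to Python types is an assumption, and `find_schema_problems` is a hypothetical helper rather than something the skill provides.

```python
# Assumed mapping of the documented field types onto Python types.
EXPECTED_TYPES = {
    "success": bool,
    "text": str,
    "lines": list,
    "line_count": int,
    "image_info": dict,
}


def find_schema_problems(result: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the shape matches."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in result:
            problems.append(f"missing field: {field}")
            continue
        value = result[field]
        # bool is a subclass of int in Python, so reject it explicitly
        # where an integer is expected.
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            problems.append(
                f"{field}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


good = {
    "success": True,
    "text": "Hello",
    "lines": [],
    "line_count": 0,
    "image_info": {"format": "PNG", "size": [1920, 1080], "mode": "RGB"},
}
bad = {"success": "yes", "line_count": True}

print(find_schema_problems(good))
print(find_schema_problems(bad))
```

The check covers only the top-level fields listed in the table; the contents of `lines` and `image_info` would need their own checks.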
MIT License
Third-party dependencies: