PaddleOCR Text Recognition

Use this skill whenever the user wants text extracted from images, photos, scans, screenshots, or scanned PDFs. Returns exact machine-readable strings with l...
bobholamovic
Category: Content creation · Source: clawhub · Version: v2.0.0 (7 versions) · Key: required

Overview

When to Use This Skill

Use this skill for:

  • Extracting text from images (screenshots, photos, scans)
  • Extracting text from PDFs or document images when the goal is line/box-level text
  • Extracting text from URLs or local files that point to images/PDFs

Do not use for:

  • Documents with tables, formulas, charts, or complex layouts — use Document Parsing instead

Usage

Basic OCR

From URL:

paddleocr api \
  --model_type ocr \
  --file_url "https://example.com/image.png"

From local file:

paddleocr api \
  --model_type ocr \
  --file_path "./document.pdf"
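
When the CLI is called from a script, the choice between the two flags above can be automated. The following is a minimal sketch, assuming that any input starting with http:// or https:// is a URL and everything else is a local path. The helper name build_ocr_args is hypothetical and not part of PaddleOCR; only the flags shown above are used.

```python
def build_ocr_args(source: str) -> list[str]:
    """Return the argument list for a basic OCR call (hypothetical helper)."""
    # Assumption: remote inputs are identified by an http(s) scheme prefix.
    is_url = source.lower().startswith(("http://", "https://"))
    flag = "--file_url" if is_url else "--file_path"
    return ["paddleocr", "api", "--model_type", "ocr", flag, source]
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues with paths that contain spaces.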

Common Options

# With specific model
paddleocr api \
  --model_type ocr \
  --model PP-OCRv5 \
  --file_path "./report.pdf"

# Disable preprocessing (faster, for flat/well-oriented images)
paddleocr api \
  --model_type ocr \
  --file_path "./document.pdf" \
  --use_doc_unwarping False \
  --use_doc_orientation_classify False

# Save result to file
paddleocr api \
  --model_type ocr \
  --file_url "https://..." \
  --output result.json

# Page ranges
paddleocr api \
  --model_type ocr \
  --file_path "./large.pdf" \
  --page_ranges "1-5,10,15-20"
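
To check which pages a range string selects before submitting a large PDF, the specification can be expanded locally. This sketch assumes page numbers are 1-based and ranges are inclusive on both ends, which the example above suggests but does not state; expand_page_ranges is a hypothetical helper, not a PaddleOCR function.

```python
def expand_page_ranges(spec: str) -> list[int]:
    """Expand a string such as "1-5,10,15-20" into a list of page numbers."""
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            # Assumption: both ends of the range are included.
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(part))
    return pages
```

Under these assumptions, "1-5,10,15-20" selects 12 pages.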

Output Format

{
  "jobId": "job-xxx",
  "pages": [
    {
      "prunedResult": {
        "rec_texts": ["Line 1", "Line 2"],
        "rec_scores": [0.98, 0.95]
      },
      "ocrImageUrl": "https://..."
    }
  ]
}
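
A result in the shape shown above can be flattened into one record per recognized line. This is a minimal sketch, assuming rec_texts and rec_scores are parallel lists (as in the sample) and that the position of a page in the pages array is a usable page index; extract_lines is a hypothetical helper.

```python
import json


def extract_lines(result: dict) -> list[tuple[int, str, float]]:
    """Return (page_index, text, score) for every recognized line."""
    lines = []
    # page_index counts entries in "pages" starting at 1; it is not
    # guaranteed to match the original PDF page number.
    for page_index, page in enumerate(result.get("pages", []), start=1):
        pruned = page.get("prunedResult", {})
        texts = pruned.get("rec_texts", [])
        scores = pruned.get("rec_scores", [])
        # Assumption: texts and scores line up one-to-one.
        for text, score in zip(texts, scores):
            lines.append((page_index, text, score))
    return lines


sample = json.loads("""
{
  "jobId": "job-xxx",
  "pages": [
    {
      "prunedResult": {
        "rec_texts": ["Line 1", "Line 2"],
        "rec_scores": [0.98, 0.95]
      },
      "ocrImageUrl": "https://..."
    }
  ]
}
""")

for page_index, text, score in extract_lines(sample):
    print(f"[page {page_index}] {text} ({score:.2f})")
```

For a file written with --output result.json, the same function can be applied to the object returned by json.load.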

Important Notes

Preprocessing options: By default, the API enables document preprocessing (unwarping and orientation classification). For flat, well-oriented images (screenshots, properly scanned documents), you can disable preprocessing for faster results:

paddleocr api --model_type ocr --file_path "./document.pdf" --use_doc_unwarping False --use_doc_orientation_classify False

Keep preprocessing enabled when:

  • The input is a photo of a curved or folded document
  • The document has significant perspective distortion
  • Orientation is uncertain (rotated 90/180/270 degrees)
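
The decision above can be captured in a small helper that appends the two disabling flags only when the caller knows the input is flat and upright. This is a sketch; preprocessing_flags is a hypothetical name, and it relies on the documented default that preprocessing stays enabled when no flag is passed.

```python
def preprocessing_flags(flat_and_upright: bool) -> list[str]:
    """Return extra CLI flags for the preprocessing choice (hypothetical helper)."""
    if flat_and_upright:
        # Screenshots and properly scanned pages: skip both steps for speed.
        return [
            "--use_doc_unwarping", "False",
            "--use_doc_orientation_classify", "False",
        ]
    # Photos of curved, folded, skewed, or possibly rotated documents:
    # pass nothing and keep the default preprocessing.
    return []


cmd = ["paddleocr", "api", "--model_type", "ocr", "--file_path", "./document.pdf"]
cmd += preprocessing_flags(flat_and_upright=True)
```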

Display complete results: Always show the full extracted content to users. Do not truncate with "..." unless content exceeds 10,000 characters. When multiple pages are processed, summarize if needed but provide complete results when explicitly requested.
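
One way to apply the display rule in code is sketched below. It assumes the 10,000-character limit is measured on the joined text of all pages; format_for_display is a hypothetical helper.

```python
MAX_DISPLAY_CHARS = 10_000


def format_for_display(text: str, limit: int = MAX_DISPLAY_CHARS) -> str:
    """Return text unchanged unless it exceeds the limit."""
    if len(text) <= limit:
        return text  # show everything, no "..."
    # Only content over the limit is cut and marked as truncated.
    return text[:limit] + "..."
```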

Handle errors gracefully: When the CLI returns an error, inform the user of the specific issue rather than silently failing or falling back to your own vision capabilities. Common errors:

  • Authentication: PADDLEOCR_ACCESS_TOKEN invalid or missing
  • Quota: API rate limit exceeded
  • No content detected: Image may be blank or contain no text
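
A wrapper along the following lines surfaces CLI failures instead of hiding them. It is a sketch under two assumptions that the text above does not confirm: that the CLI exits with a non-zero status on error, and that it writes the error message to stderr or stdout. OcrError and run_ocr are hypothetical names.

```python
import os
import subprocess


class OcrError(RuntimeError):
    """Raised when the OCR command cannot run or reports a failure."""


def run_ocr(cmd: list[str], env=None) -> str:
    """Run the given command and return its stdout, or raise OcrError."""
    env = os.environ if env is None else env
    # Catch the missing-token case early with a specific message.
    if not env.get("PADDLEOCR_ACCESS_TOKEN"):
        raise OcrError("PADDLEOCR_ACCESS_TOKEN is missing; set it before running OCR.")
    proc = subprocess.run(cmd, capture_output=True, text=True, env=dict(env))
    if proc.returncode != 0:
        # Assumption: a non-zero exit status signals an error.
        detail = proc.stderr.strip() or proc.stdout.strip() or "no error output"
        raise OcrError(f"OCR command failed (exit {proc.returncode}): {detail}")
    return proc.stdout
```

The message carried by OcrError can be relayed to the user as is, so that authentication, quota, and empty-result problems are reported rather than masked.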

CLI Reference

Run paddleocr api --help for all options.

For full documentation, see: PaddleOCR Official Documentation

Version History

7 versions in total

  • v2.0.0 (current)
    2026-06-07 05:20
  • v1.0.21
    2026-04-30 08:37 (marked safe)
  • v1.0.13
    2026-03-28 19:25
  • v1.0.5
    2026-03-26 21:58
  • v1.0.9
    2026-03-18 08:37
  • v1.0.6
    2026-03-14 01:15
  • v1.0.1
    2026-03-11 17:54
