Super Ocr

Production-grade OCR with intelligent engine selection. Tesseract (lightweight, fast) and PaddleOCR (high accuracy, Chinese-optimized). Use when extracting t...
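Production-grade OCR with intelligent engine selection: Tesseract is lightweight and fast, PaddleOCR is high-accuracy and optimized for Chinese. Suited to text extraction.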
nimachu
Productivity clawhub v0.1.0 1 version 100000 Key: not required
★ 1 Stars 📥 865 Downloads 💾 183 Installs 1 Version #latest

Super OCR

Overview

Super OCR is a production-grade optical character recognition tool that intelligently selects the best engine for your needs:

  • Tesseract Engine: Lightweight, fast (~200-500ms), perfect for simple text extraction
  • PaddleOCR Engine: High accuracy (98%+), optimized for Chinese, ideal for complex documents

Engine Selection Strategy

Auto Mode (Default)

The skill automatically selects the optimal engine:

| Scenario | Selected Engine | Why |
|---|---|---|
| Simple text, English only | Tesseract | Faster, lighter dependency |
| Chinese content, high accuracy needed | PaddleOCR | Better Chinese support, 98%+ accuracy |
| Low confidence from Tesseract | PaddleOCR (fallback) | Quality assurance |
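
The strategy in the table can be sketched roughly as follows. This is a minimal illustration, not the skill's actual selector: the `run_auto` function, the `expect_chinese` hint, and the engine callables (each taking an image path and returning text plus a 0..1 confidence) are all assumptions made for the example.

```python
from typing import Callable, Tuple

# Hypothetical engine signature: image path -> (text, confidence in 0..1).
Engine = Callable[[str], Tuple[str, float]]


def run_auto(image_path: str, tesseract: Engine, paddle: Engine,
             expect_chinese: bool = False,
             threshold: float = 0.8) -> Tuple[str, str, float]:
    """Return (engine_used, text, confidence) following the table's rules."""
    if expect_chinese:
        # Chinese content / high accuracy needed: go straight to PaddleOCR.
        text, conf = paddle(image_path)
        return "paddle", text, conf

    # Simple case: try the faster engine first.
    text, conf = tesseract(image_path)
    if conf < threshold:
        # Low confidence from Tesseract: fall back to PaddleOCR.
        text, conf = paddle(image_path)
        return "paddle (fallback)", text, conf
    return "tesseract", text, conf
```

Passing the engines in as callables keeps the decision logic testable without either OCR library installed.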

Force Mode

Users can explicitly choose an engine:

  • --engine tesseract - Use Tesseract only
  • --engine paddle - Use PaddleOCR only
  • --engine auto - Auto-select (default)
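
For illustration, a flag like this could be declared with `argparse` as below. The `build_parser` helper is hypothetical and only mirrors the three values listed above; the real `scripts/main.py` may define its options differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the --engine values listed above."""
    parser = argparse.ArgumentParser(description="OCR CLI sketch")
    parser.add_argument("--image", help="path to a single input image")
    parser.add_argument(
        "--engine",
        choices=["auto", "tesseract", "paddle"],
        default="auto",  # auto-select unless the user forces an engine
        help="OCR engine to use",
    )
    return parser


args = build_parser().parse_args(["--image", "document.jpg", "--engine", "tesseract"])
print(args.engine)  # prints "tesseract"
```

Using `choices` means an unsupported engine name is rejected by the parser before any OCR work starts.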

Quick Start

Installation

This skill requires the following dependencies:

  • PaddleOCR (for Chinese text recognition - 98%+ accuracy)
  • Tesseract (for fast English text recognition)
  • OpenCV (for image preprocessing)

Option 1: Install with pip (all-in-one)

pip install paddleocr paddlepaddle pytesseract pillow opencv-python numpy

Option 2: Install dependencies manually

macOS:

# Tesseract
brew install tesseract

# PaddleOCR
pip install paddleocr paddlepaddle

Ubuntu/Debian:

# Tesseract
sudo apt update && sudo apt install tesseract-ocr

# PaddleOCR
pip install paddleocr paddlepaddle

Windows:

# Download Tesseract from: https://github.com/UB-Mannheim/tesseract/wiki
pip install paddleocr paddlepaddle pytesseract pillow opencv-python numpy

Usage

# Auto mode (recommended) - selects the engine automatically
cd path/to/super-ocr
python scripts/main.py --image path/to/image.png

# Force Tesseract only
python scripts/main.py --image document.jpg --engine tesseract

# Force PaddleOCR (high accuracy Chinese)
python scripts/main.py --image chinese_menu.png --engine paddle

# Run all engines (macOS only: Tesseract + PaddleOCR + MacVision)
python scripts/main.py --image complex_doc.png --engine all

# Batch processing with output directory
python scripts/main.py --images ./images/*.png --output ./results --verbose

# Check dependencies and auto-install
python scripts/dependencies.py --check --install

Structuring This Skill

This skill uses a capabilities-based structure with multiple execution modes:

  1. Engine Selection Logic - Intelligent decision making
  2. OCR Execution - Unified interface for different engines
  3. Post-processing - Standardized output formatting
  4. Validation & Fallback - Quality assurance

Core Capabilities

1. Intelligent Engine Selection

The skill includes a decision tree that analyzes:

  • Image characteristics (contrast, text size)
  • Language patterns (Chinese character detection)
  • User requirements (speed vs accuracy)

See scripts/engine/selector.py for implementation details.
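
As a rough idea of what such signals might look like, the sketch below computes a Chinese-character ratio and an RMS contrast value using only the standard library. The function names, the source of `sample_text` (for instance a quick first OCR pass), and the 0.3 and 0.15 cut-offs are assumptions for the example, not values taken from the skill.

```python
from statistics import pstdev
from typing import Sequence


def cjk_ratio(text: str) -> float:
    """Fraction of non-space characters in the CJK Unified Ideographs block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum("\u4e00" <= c <= "\u9fff" for c in chars) / len(chars)


def contrast(gray_pixels: Sequence[int]) -> float:
    """RMS contrast: standard deviation of 0..255 gray levels, scaled to 0..1."""
    return pstdev(gray_pixels) / 255.0


def choose_engine(sample_text: str, gray_pixels: Sequence[int],
                  prefer_speed: bool = True) -> str:
    if cjk_ratio(sample_text) > 0.3:  # assumed cut-off for "Chinese content"
        return "paddle"
    if contrast(gray_pixels) < 0.15 and not prefer_speed:  # assumed cut-off
        return "paddle"  # hard image and accuracy matters more than speed
    return "tesseract"
```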

2. Dual Engine Support

Tesseract Engine (scripts/engine/tesseract.py):

  • Fast preprocessing pipeline
  • PSM mode 6 for uniform text blocks
  • Confidence scoring per word
  • Language detection
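
Per-word confidence with PSM 6 can be obtained from `pytesseract.image_to_data`. The sketch below is an assumption about how a wrapper might use it, not the skill's own code; `words_with_confidence` and `ocr_words` are illustrative names.

```python
from typing import Dict, List, Tuple


def words_with_confidence(data: Dict[str, list]) -> List[Tuple[str, float]]:
    """Convert image_to_data output (DICT form) into (word, confidence 0..1) pairs."""
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        conf = float(conf)  # may arrive as int, float, or str depending on version
        if text.strip() and conf >= 0:  # conf of -1 marks layout rows, not words
            words.append((text, conf / 100.0))
    return words


def ocr_words(image_path: str, lang: str = "eng") -> List[Tuple[str, float]]:
    # Imported lazily so the parsing helper works without Tesseract installed.
    import pytesseract
    from PIL import Image

    data = pytesseract.image_to_data(
        Image.open(image_path),
        lang=lang,
        config="--psm 6",  # treat the image as one uniform block of text
        output_type=pytesseract.Output.DICT,
    )
    return words_with_confidence(data)
```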

PaddleOCR Engine (scripts/engine/paddle.py):

  • State-of-the-art text detection (EAST)
  • CRNN recognition with LSTM
  • Confidence scores per character
  • Table detection support

3. Output Formats

Supports multiple output formats:

| Format | Content | Use Case |
|---|---|---|
| Text only | Clean extracted text | Simple search/grep |
| Structured | Text + positions | Data extraction |
| JSON | Full metadata + confidence | API integration |
| Verbose | Debug info | Quality assurance |
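
One way the four formats could be produced from a common result type is sketched below. The `Word` dataclass, the `format_result` function, and the format keys are hypothetical; `scripts/output_formatter.py` may be organized differently.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple


@dataclass
class Word:
    text: str
    confidence: float  # 0..1
    box: Tuple[int, int, int, int]  # x, y, width, height


def format_result(words: List[Word], fmt: str = "text") -> str:
    if fmt == "text":  # clean text only
        return " ".join(w.text for w in words)
    if fmt == "structured":  # text plus position, tab-separated
        return "\n".join(f"{w.text}\t{w.box}" for w in words)
    if fmt == "json":  # full metadata, keeps non-ASCII text readable
        return json.dumps({"words": [asdict(w) for w in words]}, ensure_ascii=False)
    if fmt == "verbose":  # debug view
        return "\n".join(
            f"{w.text!r} conf={w.confidence:.2f} box={w.box}" for w in words
        )
    raise ValueError(f"unknown format: {fmt}")
```

Keeping one internal representation and rendering it late means a new format only needs a new branch, not a new OCR code path.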

4. Quality Guarantees

  • Confidence thresholds (configurable, default 80%)
  • Low-confidence alerts for manual review
  • Fallback processing for failed OCR runs

Resources

scripts/

  • main.py - Main entry point, CLI interface (supports multi-engine)
  • dependencies.py - Auto-install and validation
  • output_formatter.py - Multiple output format support
  • engine/ - OCR engine implementations
      • selector.py - Intelligent engine selection logic
      • tesseract.py - Tesseract engine wrapper
      • paddle.py - PaddleOCR engine wrapper
      • macvision.py - macOS Vision OCR (macOS only)
  • preprocessing/ - Image preprocessing utilities
      • preprocessor.py - Denoising, enhancement, binarization
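
To make the binarization step concrete, here is a dependency-free sketch of Otsu's method on a flat list of gray levels. It is illustrative only and not taken from preprocessor.py; with OpenCV installed, `cv2.threshold` with the `cv2.THRESH_OTSU` flag does the same job on real images.

```python
from typing import List, Sequence


def otsu_threshold(gray_pixels: Sequence[int]) -> int:
    """Return the gray level (0..255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in gray_pixels:
        hist[p] += 1
    total = len(gray_pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of their gray levels
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # everything is background from here on
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray_pixels: Sequence[int]) -> List[int]:
    """Map each pixel to 0 or 255 using the Otsu threshold."""
    t = otsu_threshold(gray_pixels)
    return [255 if p > t else 0 for p in gray_pixels]
```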

dependencies.py (Key Feature)

The dependencies.py module handles:

  • Dependency detection (paddleocr, paddlepaddle, pytesseract, cv2)
  • Auto-install on missing dependencies
  • Version checking
  • OS-specific installation commands
  • Clear error messages with troubleshooting steps

Run it when setting up a new environment: python scripts/dependencies.py --check --install
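
Dependency detection of this sort can be done with the standard library alone. The sketch below is an assumption about the approach, not the contents of dependencies.py; the helper names are illustrative, and the mapping reflects that `paddlepaddle` is imported as `paddle` and `opencv-python` as `cv2`.

```python
import importlib.util
import shutil
import sys
from typing import Dict, List

# import name -> pip package name
PYTHON_DEPS: Dict[str, str] = {
    "paddleocr": "paddleocr",
    "paddle": "paddlepaddle",
    "pytesseract": "pytesseract",
    "cv2": "opencv-python",
}


def missing_python_deps(deps: Dict[str, str] = PYTHON_DEPS) -> List[str]:
    """Return pip package names whose import module cannot be found."""
    return [
        pip_name
        for module, pip_name in deps.items()
        if importlib.util.find_spec(module) is None
    ]


def tesseract_binary_available() -> bool:
    """pytesseract also needs the tesseract executable on PATH."""
    return shutil.which("tesseract") is not None


def install_command(missing: List[str]) -> List[str]:
    """Build (but do not run) a pip command for the missing packages."""
    return [sys.executable, "-m", "pip", "install", *missing] if missing else []
```

Checking the `tesseract` binary separately matters because `pip install pytesseract` installs only the Python wrapper.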

Advanced Features

Custom Configuration

Create config.yaml for persistent settings:

default_engine: auto
confidence_threshold: 0.8
output_format: json
preprocess:
  denoise: true
  enhance_contrast: true
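
A loader for such a file might merge user settings over built-in defaults, as in the sketch below. The `merge_config` and `load_config` helpers are hypothetical, the defaults simply repeat the values shown above, and reading the file assumes PyYAML is installed.

```python
from copy import deepcopy
from typing import Any, Dict

DEFAULTS: Dict[str, Any] = {
    "default_engine": "auto",
    "confidence_threshold": 0.8,
    "output_format": "json",
    "preprocess": {"denoise": True, "enhance_contrast": True},
}


def merge_config(overrides: Dict[str, Any],
                 defaults: Dict[str, Any] = DEFAULTS) -> Dict[str, Any]:
    """Recursively overlay user settings on the defaults without mutating them."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(value, merged[key])
        else:
            merged[key] = value
    return merged


def load_config(path: str = "config.yaml") -> Dict[str, Any]:
    import yaml  # PyYAML, imported lazily so merge_config works without it

    with open(path, encoding="utf-8") as fh:
        return merge_config(yaml.safe_load(fh) or {})
```

With this approach a config file only needs to list the keys that differ from the defaults.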

Batch Processing

Process multiple images:

python scripts/main.py --images ./images/*.png --output ./results

API Mode

Use as a Python library:

from super_ocr import OCRProcessor

processor = OCRProcessor(engine='auto')
result = processor.extract('image.png')
print(result.text)
print(result.confidence)

Anti-Patterns

  • ❌ Using PaddleOCR for every image (overhead for simple cases)
  • ❌ Ignoring confidence scores (quality matters)
  • ❌ Engine bias (always preferring one engine)
  • ❌ Skipping preprocessing (quality impact)

Performance Notes

| Engine | Init Time | Per-Image | Memory | Best For |
|---|---|---|---|---|
| Tesseract | ~200ms | ~50ms | ~100MB | Quick extraction |
| PaddleOCR | ~3s | ~500ms | ~500MB | High accuracy |

Initialize once, reuse processor for batch processing.
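
The effect of reusing one instance can be shown with a stand-in class. `SlowEngine`, `get_engine`, and `process_batch` below are hypothetical names for the example and do not reflect the skill's real classes; the point is only that the costly constructor runs once per engine name.

```python
from functools import lru_cache
from typing import List


class SlowEngine:
    """Stand-in for an OCR engine whose constructor is expensive (hypothetical)."""

    instances = 0  # counts how many times the costly start-up happened

    def __init__(self, name: str) -> None:
        SlowEngine.instances += 1
        self.name = name

    def extract(self, image_path: str) -> str:
        return f"[{self.name}] text from {image_path}"


@lru_cache(maxsize=None)
def get_engine(name: str) -> SlowEngine:
    """Create each engine once and hand back the cached instance afterwards."""
    return SlowEngine(name)


def process_batch(paths: List[str], engine_name: str = "paddle") -> List[str]:
    engine = get_engine(engine_name)  # initialized on first use only
    return [engine.extract(p) for p in paths]
```

With init times like those in the table, paying the start-up cost per image would dominate the runtime of a large batch.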

Version History

1 version in total

  • v0.1.0 (current)
    2026-03-30 10:34 Safe Safe

Security Check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
