
Doc Ocr Skills

OCR documents (PDFs and images) using Gemini 2.5 Flash, PaddleOCR (local), or RapidOCR (local).
scottkiss
Content creation · clawhub · v0.1.0 · 1 version · Key: required

Overview

Document OCR Skill (docr)

Uses Gemini 2.5 Flash (cloud), or PaddleOCR or RapidOCR (both local), to recognize text in scanned PDFs and images. docr is compiled as a single Go binary.

Prerequisites

  • Gemini API key configured in ~/.ocr/config (required only for the Gemini engine; not needed for PaddleOCR or RapidOCR)
  • For RapidOCR engine: pip install rapidocr_onnxruntime
  • For PaddleOCR engine: pip install paddleocr paddlepaddle

API Key Configuration

Create the config file:

mkdir -p ~/.ocr
cat > ~/.ocr/config << EOF
# Google Gemini API Key
gemini_api_key=your_gemini_key
EOF
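
To confirm the key is readable before running the Gemini engine, the file can be parsed with a few lines of shell. This is a minimal sketch, assuming the config holds plain key=value lines with # comments as in the example above. read_ocr_key is a hypothetical helper, not part of docr, and docr's own parser may behave differently.

```bash
# Hypothetical helper (not part of docr): print the value stored under a key
# in a key=value config file. Comment lines never match because they start
# with '#', not with the key name.
read_ocr_key() {
    key="$1"
    file="${2:-$HOME/.ocr/config}"
    [ -r "$file" ] || { echo "config file not readable: $file" >&2; return 1; }
    # Print only lines that start with "<key>=", with that prefix removed;
    # if the key appears more than once, the last occurrence wins.
    value="$(sed -n "s/^${key}=//p" "$file" | tail -n 1)"
    [ -n "$value" ] || { echo "$key not found in $file" >&2; return 1; }
    printf '%s\n' "$value"
}
```

Calling read_ocr_key gemini_api_key prints the configured key, or exits non-zero with a message on stderr if the file or the key is missing.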

Quick Start

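> Path Variable: All commands below use $DOCR. Set it before running any command:
> ```bash
> SKILL_DIR="$(cd "$(dirname "")" && pwd)"
> DOCR="$SKILL_DIR/scripts/docr/docr"
> ```
> With an empty dirname argument, SKILL_DIR resolves to the current directory, so run these lines from the skill's root directory (doc-ocr-skills).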

# OCR a single document using RapidOCR (default)
$DOCR document.pdf
$DOCR image.jpg

# Use Gemini engine
$DOCR -engine gemini document.pdf

# Use PaddleOCR local engine
$DOCR -engine paddle document.pdf

# Specify output file (options go before the file, as in the CLI reference)
$DOCR -o result.txt document.pdf

# Batch process all supported files in a directory
$DOCR -batch -o ./outputs/ ./docs/

Engines

| Engine | Flag | API Key Config | Doc Handling |
|---|---|---|---|
| RapidOCR (default) | -engine rapid | None | Local OCR |
| Gemini | -engine gemini | gemini_api_key | Cloud Vision API |
| PaddleOCR (local) | -engine paddle | None | Local OCR |
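
The table suggests a simple selection rule: use Gemini only when a key is configured, and fall back to the local default otherwise. The sketch below illustrates that rule. choose_engine is a hypothetical helper, and falling back to rapid is an assumption drawn from rapid being the documented default. docr itself may not do this.

```bash
# Hypothetical helper (not part of docr): print "gemini" if the config file
# contains a non-empty gemini_api_key, otherwise print "rapid".
choose_engine() {
    config="${1:-$HOME/.ocr/config}"
    # The trailing '.' in the pattern requires at least one character after '='.
    if [ -r "$config" ] && grep -q '^gemini_api_key=.' "$config"; then
        echo gemini
    else
        echo rapid
    fi
}
```

It could then be used as $DOCR -engine "$(choose_engine)" document.pdf.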

CLI Reference

docr [options] <file or directory>

Options:
  -engine string   OCR engine: rapid (default) / gemini / paddle
  -e string        Engine (short flag)
  -o string        Output file path or directory (batch mode)
  -output string   Output path (long flag)
  -batch           Batch mode: process all files in directory
  -prompt string   Custom recognition prompt (gemini)
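
When finer control than -batch is needed, docr can also be driven one file at a time from a shell loop. The following is a sketch under stated assumptions: out_path_for and ocr_missing are hypothetical helpers, the .txt output extension is assumed from the Quick Start example, and only the -o option documented above is used.

```bash
# Hypothetical helper: map an input file to an output path inside a
# directory, e.g. ./docs/scan.pdf -> ./outputs/scan.txt (.txt is assumed).
out_path_for() {
    in_file="$1"
    out_dir="${2%/}"                  # drop one trailing slash, if any
    base="$(basename "$in_file")"
    printf '%s/%s.txt\n' "$out_dir" "${base%.*}"
}

# Hypothetical helper: run docr once per PDF in a directory, options first,
# skipping files whose output already exists. Uses $DOCR if set, otherwise
# a "docr" found on PATH.
ocr_missing() {
    in_dir="${1%/}"
    out_dir="$2"
    for f in "$in_dir"/*.pdf; do
        [ -e "$f" ] || continue       # the glob matched nothing
        out="$(out_path_for "$f" "$out_dir")"
        [ -e "$out" ] && continue     # already processed
        "${DOCR:-docr}" -o "$out" "$f"
    done
}
```

For example, ocr_missing ./docs ./outputs processes only the PDFs that do not yet have a matching .txt file in ./outputs.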

Installation

We provide pre-compiled binaries to get you started quickly.

cd doc-ocr-skills/scripts
./install.sh

The script detects your OS (darwin/linux) and architecture (amd64/arm64) and downloads the matching docr binary.
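
The detection step can be sketched as follows. This illustrates the idea only and is not the contents of install.sh. The function names are hypothetical, and mapping uname output to these names is an assumption about how such a script would typically work.

```bash
# Map `uname -s` output to the OS names used above. An explicit argument
# overrides the detected value.
detect_os() {
    os="${1:-$(uname -s)}"
    case "$os" in
        Darwin) echo darwin ;;
        Linux)  echo linux ;;
        *)      echo "unsupported OS: $os" >&2; return 1 ;;
    esac
}

# Map `uname -m` output to the architecture names used above.
detect_arch() {
    arch="${1:-$(uname -m)}"
    case "$arch" in
        x86_64|amd64)  echo amd64 ;;
        arm64|aarch64) echo arm64 ;;
        *)             echo "unsupported architecture: $arch" >&2; return 1 ;;
    esac
}
```

Running echo "$(detect_os)/$(detect_arch)" shows which platform pair would be selected on the current machine.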

Building from Source (Optional)

If you prefer to build from source, ensure you have Go 1.21+ installed:

cd doc-ocr-skills/scripts/docr
go build -o docr .

Error Handling

| Error | Solution |
|---|---|
| config file not found | Create ~/.ocr/config with API keys |
| gemini_api_key not found | Add gemini_api_key=VALUE to config |
| file not found | Verify the document file path |
| API timeout | Retry; large files may need longer |
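
The retry advice for API timeouts can be automated with a small wrapper. This is a sketch, not a docr feature: retry is a hypothetical helper, and because docr's exit codes are not documented here, it retries on any non-zero exit rather than on timeouts specifically.

```bash
# Hypothetical helper: run a command up to <attempts> times, doubling the
# delay (in seconds) after each failure.
# Usage: retry <attempts> <initial-delay> <command> [args...]
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while :; do
        "$@" && return 0                     # success: stop retrying
        [ "$n" -ge "$attempts" ] && return 1 # out of attempts
        echo "attempt $n failed; retrying in ${delay}s" >&2
        sleep "$delay"
        delay=$((delay * 2))
        n=$((n + 1))
    done
}
```

For example, retry 3 5 "$DOCR" -engine gemini document.pdf tries up to three times, waiting 5 and then 10 seconds between attempts.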

Version History

1 version in total

  • v0.1.0 (current)
    2026-03-31 03:39 · Security scans: safe
