
Tesseract OCR

Extract text from images using the Tesseract OCR engine directly via the command line. Supports multiple languages, including Chinese and English.
whalefell

Overview

Tesseract OCR Skill

Extract text content from images by calling the Tesseract engine directly from the command line.

Features

  • Extract text from image files using the native tesseract CLI
  • Support multi-language recognition (Chinese, English, etc.)
  • No Python dependencies required
  • Simple and fast

Dependencies

Install the Tesseract OCR system package together with the language data you need:

# Ubuntu/Debian:
sudo apt-get install tesseract-ocr tesseract-ocr-chi-sim

# macOS:
brew install tesseract tesseract-lang
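
To confirm the installation, the tesseract CLI can report its version (tesseract --version) and list the installed language data (tesseract --list-langs). The sketch below wraps the latter in a small check. The name has_lang is a hypothetical helper, not part of Tesseract, and the sketch assumes each installed code is printed on its own line.

```shell
# Hypothetical helper (not part of Tesseract): succeeds when the given
# language code appears as a whole line in `tesseract --list-langs`.
# Assumption: one code per line; stderr is merged in because some
# versions print the list there.
has_lang() {
    tesseract --list-langs 2>&1 | grep -qxF -- "$1"
}

# Example use (commented out so that sourcing this file runs nothing):
# has_lang chi_sim || echo "chi_sim language data is not installed" >&2
```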

Usage

Basic Usage

# Use default language (English)
tesseract /path/to/image.png stdout

# Specify language (Chinese + English)
tesseract /path/to/image.png stdout -l chi_sim+eng

# Save to file (tesseract appends .txt to the output base name)
tesseract /path/to/image.png output -l chi_sim+eng
# Creates output.txt

# Multiple languages
tesseract /path/to/image.png stdout -l chi_sim+eng+jpn

Common Language Codes

Language              Code
--------------------  ------------
Simplified Chinese    chi_sim
Traditional Chinese   chi_tra
English               eng
Japanese              jpn
Korean                kor
Chinese + English     chi_sim+eng
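
Codes are combined by joining them with +, as in chi_sim+eng. When the list of languages comes from a script, the value for -l can be assembled as in the following sketch. The name join_langs is a hypothetical helper introduced here for illustration.

```shell
# Hypothetical helper: join language codes with "+" to form the value
# passed to tesseract's -l option.
join_langs() (
    # "$*" joins the arguments with the first character of IFS.
    # The function body is a subshell, so IFS is not changed for the caller.
    IFS=+
    printf '%s\n' "$*"
)

# Example use (commented out so that sourcing this file runs nothing):
# tesseract image.png stdout -l "$(join_langs chi_sim eng jpn)"
```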

Quick Examples

# OCR with Chinese support
tesseract image.jpg stdout -l chi_sim

# OCR with mixed Chinese and English
tesseract image.png stdout -l chi_sim+eng

# Save to file instead of stdout
tesseract document.png result -l chi_sim+eng
# Creates result.txt
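
The single-file commands above extend naturally to a folder of images. The sketch below is one way to do that, assuming PNG input and one text file per image. The name ocr_dir is a hypothetical helper, and it relies on the same behavior shown above: tesseract appends .txt to the output base name.

```shell
# Hypothetical helper: OCR every PNG in a directory and write NAME.txt
# next to each NAME.png. The language defaults to eng when not given.
ocr_dir() (
    dir=$1
    langs=${2:-eng}
    for img in "$dir"/*.png; do
        # When the glob matches nothing it stays literal; skip it.
        [ -e "$img" ] || continue
        # "${img%.*}" drops the extension; tesseract adds ".txt" itself.
        tesseract "$img" "${img%.*}" -l "$langs"
    done
)

# Example use (commented out so that sourcing this file runs nothing):
# ocr_dir ./scans chi_sim+eng
```

The function body runs in a subshell so that dir, langs, and img do not leak into the calling shell.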

Notes

  1. OCR accuracy depends on image quality; use clear images for best results
  2. Complex layouts (tables, multi-column) may require post-processing
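  3. Simplified Chinese recognition requires the chi_sim language data (the tesseract-ocr-chi-sim package on Ubuntu/Debian; included in tesseract-lang on Homebrew)
  4. Language data for any other language must likewise be installed separately on your system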
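
As one possible form of the post-processing mentioned in note 2, raw OCR output can be tidied before further use. The sketch below only removes trailing whitespace, blank lines, and form-feed characters (which some Tesseract versions emit as a page separator); it does not reconstruct tables or columns. The name clean_ocr_text is a hypothetical helper.

```shell
# Hypothetical helper: tidy OCR text read from stdin.
#   - delete form-feed characters (assumed page separator)
#   - strip trailing whitespace from every line
#   - drop lines that are empty after stripping
clean_ocr_text() {
    tr -d '\f' | sed -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Example use (commented out so that sourcing this file runs nothing):
# tesseract image.png stdout -l chi_sim+eng | clean_ocr_text
```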

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-28 12:55

