
PDF Read/Write Toolkit

Read, extract, and generate PDF files. Use when the user asks to read PDF content, extract text or tables, merge PDFs, fill forms, or generate PDFs from HTML or Markdown.
Author: droba07
Category: uncategorized · Source: clawhub · Version: v1.0.0 · Key: not required

Overview

PDF Skill

Read, extract, analyze, and generate PDF documents.

Capabilities

  • Extract text from PDF (full or per-page)
  • Extract tables from PDF as CSV/JSON
  • Get metadata (title, author, pages, etc.)
  • Merge multiple PDFs into one
  • Split PDF by page ranges
  • Generate PDF from HTML or Markdown
  • Fill PDF forms

Scripts

All scripts are in scripts/ relative to this skill directory.

Read / Extract

# Extract all text
python3 scripts/pdf_read.py <file.pdf>

# Extract text from specific pages (1-indexed)
python3 scripts/pdf_read.py <file.pdf> --pages 1,3,5-10

# Extract tables as CSV
python3 scripts/pdf_read.py <file.pdf> --tables --format csv

# Extract tables as JSON
python3 scripts/pdf_read.py <file.pdf> --tables --format json

# Get PDF metadata and page count
python3 scripts/pdf_read.py <file.pdf> --info
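
The --pages value used above is a comma-separated list of 1-indexed pages and inclusive ranges. How pdf_read.py parses it is not shown here; the following is only a minimal sketch of one way such a spec could be expanded, and the function name parse_page_spec is hypothetical.

```python
def parse_page_spec(spec: str) -> list[int]:
    """Expand a 1-indexed spec such as "1,3,5-10" into sorted, unique page numbers.

    Hypothetical helper for illustration; not taken from pdf_read.py.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,3"
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))  # ranges are inclusive
    return sorted(pages)


print(parse_page_spec("1,3,5-10"))  # [1, 3, 5, 6, 7, 8, 9, 10]
```

Libraries that index pages from zero would need each number reduced by one before use.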

Merge / Split

# Merge multiple PDFs
python3 scripts/pdf_merge.py output.pdf input1.pdf input2.pdf input3.pdf

# Split: extract specific pages
python3 scripts/pdf_split.py input.pdf output.pdf --pages 1,3,5-10
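
When these scripts are called from Python rather than from a shell, passing the arguments as a list avoids quoting problems with file names that contain spaces. The sketch below only builds the documented command lines; the helper names and the SCRIPTS_DIR constant are hypothetical, and it assumes the current working directory is the skill directory.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: the process runs from the skill directory, so scripts/ is relative.
SCRIPTS_DIR = Path("scripts")


def merge_command(output: str, inputs: list[str]) -> list[str]:
    """Build the argv list for pdf_merge.py: output first, then the inputs in order."""
    if not inputs:
        raise ValueError("at least one input PDF is required")
    return [sys.executable, str(SCRIPTS_DIR / "pdf_merge.py"), output, *inputs]


def split_command(source: str, output: str, pages: str) -> list[str]:
    """Build the argv list for pdf_split.py with a --pages spec such as "1,3,5-10"."""
    return [sys.executable, str(SCRIPTS_DIR / "pdf_split.py"), source, output, "--pages", pages]


def run(command: list[str]) -> None:
    """Run a command and raise CalledProcessError if the script exits non-zero."""
    subprocess.run(command, check=True)


# Building a command does not execute anything; call run(cmd) to do that.
cmd = merge_command("output.pdf", ["input 1.pdf", "input2.pdf"])
print(cmd[1:])
```

sys.executable is used in place of the literal python3 so the scripts run under the same interpreter as the caller.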

Generate

# Generate PDF from HTML file
python3 scripts/pdf_generate.py input.html output.pdf

# Generate PDF from HTML string
python3 scripts/pdf_generate.py --html "<h1>Hello</h1><p>World</p>" output.pdf

# Generate PDF from Markdown (converted to HTML first)
python3 scripts/pdf_generate.py input.md output.pdf

Usage Notes

  • For large PDFs, use --pages to limit extraction scope
  • Table extraction works best with well-structured tables; complex layouts may need manual cleanup
  • PDF generation uses WeasyPrint, which supports CSS styling; pass a stylesheet file with --css for custom styles
  • All paths can be absolute or relative to the workspace
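
For the first note, one way to keep each extraction small is to walk a large PDF in fixed-size page batches and pass each batch as a --pages value. The sketch below assumes the total page count is already known (for example, read from the --info output, whose exact format is not documented here); page_batches is a hypothetical helper.

```python
def page_batches(total_pages: int, batch_size: int) -> list[str]:
    """Split pages 1..total_pages into consecutive --pages specs of at most batch_size pages."""
    if total_pages < 1 or batch_size < 1:
        raise ValueError("total_pages and batch_size must be positive")
    specs = []
    for start in range(1, total_pages + 1, batch_size):
        end = min(start + batch_size - 1, total_pages)
        # A single-page batch is written as "7" rather than "7-7".
        specs.append(str(start) if start == end else f"{start}-{end}")
    return specs


print(page_batches(120, 50))  # ['1-50', '51-100', '101-120']
```

Each returned string can be passed unchanged as the value of --pages in the read or split commands.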

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-07 09:06, security status: safe

Security Scan

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
