
pdf2ofd

Converts PDF documents (invoices, reports) to high-fidelity OFD format with pixel-perfect precision.
xzw
Productivity · clawhub · v1.0.2 · 1 version · 100000 · Key: not required
★ 0 stars · 📥 510 downloads · 💾 22 installs · #latest

Overview

PDF to OFD High-Fidelity Converter

🎯 Purpose

A specialized skill for converting PDF documents into OFD, the Chinese national standard fixed-layout document format (GB/T 33190-2016). It is optimized for electronic invoices (OFD版式发票) and adds rendering capabilities beyond those of standard conversion libraries.

✨ Key Features

  • High-Fidelity Text Placement: Uses character-level positioning (DeltaX arrays) and baseline origin data extracted via PyMuPDF's rawdict text output, so that the text layout matches the source PDF character for character.
  • Advanced Vector Graphics: Directly extracts original stroke colors, fill colors, and line widths. Supports complex path types and fill instructions.
  • Transparency Preservation: Fully supports Alpha and FillOpacity for vector paths and SMask transparency for images (e.g., electronic seals and signatures).
  • Cross-Platform Font Mapping: Intelligent mapping of macOS-specific (STSong, STKaiti) and Windows-specific font names to standardized OFD font names (宋体, 楷体, 黑体).
  • In-Memory Packaging: Generates the final OFD zip structure entirely in memory to avoid temporary file clutter and ensure security.
  • Color Snapping: Heuristic "Invoice Red" correction (128 0 0) for financial documents while preserving non-standard colors.
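
The character-level placement described above can be sketched in a few lines. The sketch assumes per-character baseline origins in PDF points, of the kind PyMuPDF's `page.get_text("rawdict")` reports under each character's `origin` key, and converts them to millimetres, the unit OFD uses. The function name `delta_x_from_origins` and the rounding precision are illustrative assumptions, not the skill's actual code.

```python
PT_TO_MM = 25.4 / 72.0  # one PDF point is 1/72 inch


def delta_x_from_origins(origins_pt, precision=3):
    """Return (x_mm, y_mm, delta_x) for one run of characters.

    origins_pt: list of (x, y) baseline origins in PDF points, one per
    character, in reading order.
    delta_x: space-separated horizontal advances in mm between consecutive
    characters, in the form a TextCode DeltaX attribute expects.
    """
    if not origins_pt:
        raise ValueError("need at least one character origin")
    xs_mm = [x * PT_TO_MM for x, _ in origins_pt]
    x0 = round(xs_mm[0], precision)
    y0 = round(origins_pt[0][1] * PT_TO_MM, precision)
    # One advance per gap between neighbouring characters.
    deltas = [round(b - a, precision) for a, b in zip(xs_mm, xs_mm[1:])]
    delta_x = " ".join(f"{d:g}" for d in deltas)
    return x0, y0, delta_x
```

Because each advance is taken from the source origins rather than from font metrics, kerning and justified spacing in the PDF carry over even when the viewer substitutes a different font.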

🛠️ Usage Instructions

When a user asks to convert a PDF or a "High-Fidelity" invoice to OFD:

  1. Direct Execution:

```bash
python3 pdf2ofd.py <input_path.pdf> [output_path.ofd]
```

  2. Plugin Integration:

The script implements a PDF2OFDConverter class that can be easily imported and used in other Python workflows.

Example Output

Success: /path/to/invoice.ofd
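
A calling workflow may want to check the result rather than trust the printed line. The sketch below parses a `Success: <path>` line of the form shown above and verifies that a file is a ZIP container with an `OFD.xml` entry at its root, which is how GB/T 33190-2016 packages a document. The helper names `parse_success_line` and `looks_like_ofd` are hypothetical, and treating the success line as the script's only success signal is an assumption.

```python
import io
import zipfile


def parse_success_line(stdout_text):
    """Return the output path from a 'Success: <path>' line, or None."""
    for line in stdout_text.splitlines():
        line = line.strip()
        if line.startswith("Success:"):
            return line[len("Success:"):].strip() or None
    return None


def looks_like_ofd(source):
    """Check that source (a path or binary file object) is a ZIP with OFD.xml."""
    try:
        with zipfile.ZipFile(source) as zf:
            return "OFD.xml" in zf.namelist()
    except (zipfile.BadZipFile, OSError):
        return False


if __name__ == "__main__":
    # Build a tiny stand-in package in memory to exercise the check.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("OFD.xml", "<ofd/>")
    buf.seek(0)
    print(looks_like_ofd(buf))
```

This is only a structural check. It does not validate the XML inside the package or confirm that the rendering matches the source PDF.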

📦 Requirements

Dependencies required in the environment:

  • PyMuPDF (fitz): For advanced PDF parsing and raw character data extraction.
  • Pillow: For image processing and transparency handling.
  • easyofd: The base library for OFD structure (extended via internal monkey patches).
  • xmltodict: For XML manipulation.
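
Before running the converter, it can help to confirm that all four dependencies are importable. The following is a minimal sketch. The import names (`fitz` for PyMuPDF, `PIL` for Pillow) are the usual ones for those packages, while the table and the function name `missing_packages` are illustrative and not part of the skill.

```python
import importlib.util

# Package name -> import module name (assumed mapping for the list above).
REQUIRED_MODULES = {
    "PyMuPDF": "fitz",
    "Pillow": "PIL",
    "easyofd": "easyofd",
    "xmltodict": "xmltodict",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the package names whose import module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]
```

An empty list from `missing_packages()` means every module was found. Otherwise the returned names are the packages to install.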

💡 Notes

  • This skill uses deep monkey-patching on easyofd to fix known library limitations regarding character positioning and resource ID tracking.
  • The conversion process assumes standard Chinese fonts (SimSun, KaiTi, SimHei) are available on the viewing system.
  • Zero-copy resource handling: To preserve quality, images are extracted as-is where possible and re-compressed as PNG/JPG only when necessary.
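
The font assumption noted above goes hand in hand with the cross-platform font mapping feature. The sketch below shows one way such a mapping could work. The alias table is illustrative only, since the skill's real table is not shown in this listing, and falling back to 宋体 for unknown fonts is an assumption.

```python
# Illustrative alias table; keys are lowercase with spaces removed.
FONT_ALIASES = {
    "stsong": "宋体",
    "simsun": "宋体",
    "songti": "宋体",
    "stkaiti": "楷体",
    "kaiti": "楷体",
    "stheiti": "黑体",
    "simhei": "黑体",
}


def map_font_name(pdf_font_name, default="宋体"):
    """Map a PDF font name to a standardized OFD font name.

    Strips a subset prefix such as 'ABCDEF+' and style suffixes such as
    '-Light' or ',Bold' before looking the name up.
    """
    name = pdf_font_name.split("+", 1)[-1]
    for sep in ("-", ","):
        name = name.split(sep, 1)[0]
    key = name.replace(" ", "").lower()
    for alias, target in FONT_ALIASES.items():
        if key.startswith(alias):
            return target
    return default
```

Prefix matching lets variants such as `KaiTi_GB2312` resolve to the same family without listing every variant explicitly.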

Version History

1 version in total

  • v1.0.2 (current)
    2026-03-30 03:30 · Safe · Safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
