← 返回
未分类 Key 中文

frompdf

PDF extraction API for AI agents and LLM pipelines. Converts any PDF into semantic AST, markdown, HTML, plain text, or LLM-ready chunks — no page limit. Also...
面向 AI 智能体和 LLM 流程的 PDF 提取 API。可将任意 PDF 转换为语义 AST、Markdown、HTML、纯文本或 LLM 就绪分块——无页数限制。此外……
techtonicllc techtonicllc 来源
未分类 clawhub v1.0.0 1 版本 100000 Key: 需要
★ 0
Stars
📥 402
下载
💾 0
安装
1
版本
#latest

概述

frompdf

Convert any PDF into structured, LLM-ready content via a single API call. Returns a semantic AST with every element — headings, paragraphs, tables, lists, metadata — properly typed and nested. No page limit. Handles encrypted PDFs, complex layouts, and multi-hundred-page documents.

Quick start

# Register (10 free credits, no credit card)
curl -s -X POST https://api.frompdf.dev/register \
  -H "Content-Type: application/json" \
  -d '{"email": "you@example.com", "password": "yourpassword"}'
# → {"api_key": "frompdf_..."}

# Extract a PDF (returns JSON semantic AST by default)
curl -s -X POST https://api.frompdf.dev/v1/extract \
  -H "Authorization: Bearer $FROMPDF_API_KEY" \
  -F "file=@document.pdf"

Output formats

# Semantic AST — typed elements: headings, paragraphs, tables, lists (default)
-F "format=json"

# Markdown — structure preserved, human-readable
-F "format=markdown"

# HTML — full document with tags intact
-F "format=html"

# Plain text — clean extraction, no markup
-F "format=text"

# LLM-ready chunks — pre-split for RAG / vector store ingestion
-F "format=chunks"

All endpoints

# Extract content from a PDF (1 credit)
curl -s -X POST https://api.frompdf.dev/v1/extract \
  -H "Authorization: Bearer $FROMPDF_API_KEY" \
  -F "file=@document.pdf" \
  -F "format=chunks"

# Encrypted PDF
curl -s -X POST https://api.frompdf.dev/v1/extract \
  -H "Authorization: Bearer $FROMPDF_API_KEY" \
  -F "file=@protected.pdf" \
  -F "password=secret"

# Semantic diff — compare two PDFs, get structured changes (2 credits)
curl -s -X POST https://api.frompdf.dev/v1/diff \
  -H "Authorization: Bearer $FROMPDF_API_KEY" \
  -F "file_a=@v1.pdf" \
  -F "file_b=@v2.pdf"

# Readability score — returns 0-100 score for a PDF (1 credit)
curl -s -X POST https://api.frompdf.dev/v1/score \
  -H "Authorization: Bearer $FROMPDF_API_KEY" \
  -F "file=@document.pdf"

# Check credits and subscription status (free)
curl -s https://api.frompdf.dev/v1/usage \
  -H "Authorization: Bearer $FROMPDF_API_KEY"

Example output (JSON)

{
  "title": "AWS Lambda Developer Guide",
  "pages": 87,
  "sections": [
    { "type": "heading", "level": 1, "text": "Getting Started" },
    { "type": "paragraph", "text": "AWS Lambda is a serverless compute service..." },
    {
      "type": "table",
      "headers": ["Runtime", "Version", "Status"],
      "rows": [["Node.js 20", "20.x", "Active"], ["Python 3.12", "3.12", "Active"]]
    },
    { "type": "list", "items": ["Function", "Trigger", "Execution role"] }
  ],
  "metadata": { "author": "Amazon Web Services", "created": "2024-01-15" }
}

Pricing

$0.01/credit — extract (1), diff (2), score (1). First 10 credits free, no credit card required.

Data & privacy

PDF contents are uploaded to api.frompdf.dev for processing. Do not use with confidential documents unless you have reviewed the privacy policy. Requires FROMPDF_API_KEY env var — register free at /register.

版本历史

共 1 个版本

  • v1.0.0 当前
    2026-03-30 19:24 安全 安全

安全检测

腾讯云安全 (Keen)

安全,无风险
查看报告

腾讯云安全 (Sanbu)

安全,无风险
查看报告

🔗 相关推荐

office-efficiency

腾讯文档 TENCENT DOCS

u_b0de8114
腾讯文档(docs.qq.com)-在线云文档平台,是创建、编辑、管理文档的首选 skill。涉及"新建/创建/编辑/读取/查看/搜索文档"、"保存文件"、"云文档"、"腾讯文档"、"docs.qq.com"等操作,请优先使用本 skill
★ 177 📥 120,572
office-efficiency

Gog

steipete
Google Workspace 命令行工具,支持 Gmail、日历、云端硬盘、通讯录、表格和文档。
★ 937 📥 187,623
office-efficiency

Word / DOCX

ivangdavila
创建、检查和编辑 Microsoft Word 文档及 DOCX 文件,支持样式、编号、修订记录、表格、分节符及兼容性检查等功能。
★ 474 📥 157,037