Upstage Document Parse

Parse documents (PDF, images, DOCX, PPTX, XLSX, HWP) into layout-aware Markdown/HTML with tables, figures, headings, and bounding boxes using Upstage Document Parse.
Author: upstage-deployment
Category: Content creation · Source: clawhub · Version: v1.0.5 (2 versions) · API key: required

Overview

Upstage Document Parse

Convert documents into structured HTML/Markdown. Recognizes layout elements such as tables, images, equations, and charts with bounding box coordinates.

Quick Start

import os
import requests

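# Upload the file as multipart form data; the key is read from the environment.
with open("report.pdf", "rb") as f:
    response = requests.post(
        "https://api.upstage.ai/v1/document-digitization",
        headers={"Authorization": f"Bearer {os.environ['UPSTAGE_API_KEY']}"},
        files={"document": f},
        # output_formats is passed as a string-encoded list.
        data={"model": "document-parse", "output_formats": "['markdown']"},
    )
response.raise_for_status()  # surface HTTP errors before reading the body
print(response.json()["content"]["markdown"])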

API Key: Always use os.environ["UPSTAGE_API_KEY"]. Get your key at console.upstage.ai.

Supported Formats

JPEG, PNG, BMP, PDF (up to 1000 pages with async), TIFF, HEIC, DOCX, PPTX, XLSX, HWP, HWPX

Sync vs Async

| Mode | Endpoint | Max pages | Max file size | Notes |
|------|----------|-----------|---------------|-------|
| Sync | /v1/document-digitization | 100 | 50 MB | Result returned in response (5 min server timeout). Best for ≤ 100 pages and quick turnaround. |
| Async | /v1/document-digitization/async | 1000 | 50 MB | Returns request_id; processed in 10-page batches. Use when document exceeds sync limits or sync would time out. |

Decision rule:

  • ≤ 100 pages and expected to finish within 5 min → sync.
  • > 100 pages, scanned/complex content, or batch jobs → async.

For async submit/poll workflow, see references/async-workflow.md.
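
The decision rule above can be sketched as a small helper. This is only an illustration: the function name `choose_endpoint` and its flags are hypothetical, and the full URLs assume the async path shares the `api.upstage.ai` host used in Quick Start. The page and size limits are taken from the table.

```python
SYNC_ENDPOINT = "https://api.upstage.ai/v1/document-digitization"
# Assumption: the async path lives on the same host as the sync endpoint.
ASYNC_ENDPOINT = "https://api.upstage.ai/v1/document-digitization/async"

SYNC_MAX_PAGES = 100
ASYNC_MAX_PAGES = 1000
MAX_FILE_SIZE_MB = 50  # same limit for both modes, per the table


def choose_endpoint(pages, size_mb, complex_content=False, batch=False):
    """Return the endpoint suggested by the sync/async decision rule."""
    if size_mb > MAX_FILE_SIZE_MB:
        raise ValueError(f"file is {size_mb} MB; the limit is {MAX_FILE_SIZE_MB} MB")
    if pages > ASYNC_MAX_PAGES:
        raise ValueError(f"document has {pages} pages; async caps at {ASYNC_MAX_PAGES}")
    # More than 100 pages, scanned/complex content, or batch jobs go async.
    if pages > SYNC_MAX_PAGES or complex_content or batch:
        return ASYNC_ENDPOINT
    return SYNC_ENDPOINT
```

For example, `choose_endpoint(12, 3)` picks the sync endpoint, while `choose_endpoint(240, 30)` picks async.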

Key Parameters (Sync)

| Parameter | Default | Common Values |
|-----------|---------|---------------|
| model | required | document-parse |
| output_formats | ['html'] | ['markdown'], ['html', 'markdown'] |
| mode | standard | enhanced (complex tables), auto |
| ocr | auto | force (always OCR scanned PDFs) |
| coordinates | true | false to omit bounding boxes |

For full parameter reference and curl variations (enhanced mode, force OCR, base64 table images, LangChain integration), see references/sync-options.md.
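
As a rough sketch of how these parameters might be assembled into the `data` argument of `requests.post`, the helper below validates the values listed in the table and encodes them as form fields. The name `build_form_data` is hypothetical. `output_formats` is stringified the same way Quick Start does it; sending `coordinates` as the lowercase strings `"true"`/`"false"` is an assumption, so check references/sync-options.md for the exact encoding.

```python
ALLOWED_MODES = {"standard", "enhanced", "auto"}
ALLOWED_OCR = {"auto", "force"}
ALLOWED_FORMATS = {"html", "markdown"}


def build_form_data(output_formats=("html",), mode="standard", ocr="auto",
                    coordinates=True):
    """Build the form fields for a sync document-parse request."""
    formats = list(output_formats)
    unknown = set(formats) - ALLOWED_FORMATS
    if unknown:
        raise ValueError(f"unsupported output formats: {sorted(unknown)}")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
    if ocr not in ALLOWED_OCR:
        raise ValueError(f"ocr must be one of {sorted(ALLOWED_OCR)}")
    return {
        "model": "document-parse",
        # String-encoded list, e.g. "['markdown']", as in Quick Start.
        "output_formats": str(formats),
        "mode": mode,
        "ocr": ocr,
        # Assumption: booleans travel as lowercase strings in form data.
        "coordinates": "true" if coordinates else "false",
    }
```

A call such as `build_form_data(["markdown"], mode="enhanced", ocr="force")` produces a dict that can be passed as `data=` alongside the `files=` upload.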

Response Structure

{
  "api": "2.0",
  "model": "document-parse-251217",
  "content": {
    "html": "<h1>...</h1>",
    "markdown": "# ...",
    "text": "..."
  },
  "elements": [
    {
      "id": 0,
      "category": "heading1",
      "content": { "html": "...", "markdown": "...", "text": "..." },
      "page": 1,
      "coordinates": [{"x": 0.06, "y": 0.05}, ...]
    }
  ],
  "usage": { "pages": 1 }
}
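
To use an element's `coordinates` on a rendered page image, the points usually need converting to pixels. The sketch below assumes the values are fractions of the page width and height (the sample values 0.06 and 0.05 suggest this, but the source does not state it) and that the list holds the corner points of the box. The helper name `to_pixel_box` is hypothetical.

```python
def to_pixel_box(coordinates, page_width, page_height):
    """Convert relative corner points to a (left, top, right, bottom) pixel box.

    Assumption: each point is {"x": ..., "y": ...} with values in [0, 1]
    relative to the page size.
    """
    xs = [point["x"] * page_width for point in coordinates]
    ys = [point["y"] * page_height for point in coordinates]
    return (round(min(xs)), round(min(ys)), round(max(xs)), round(max(ys)))
```

Taking the min and max over all points keeps the helper independent of the order in which corners are listed.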

Element Categories

paragraph, heading1, heading2, heading3, list, table, figure, chart, equation, caption, header, footer, index, footnote
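
Because every element carries a `category`, a parsed response can be filtered without re-parsing the full Markdown. The following is a minimal sketch that works on the response dict shown under Response Structure; the helper names are hypothetical.

```python
from collections import defaultdict


def group_by_category(response):
    """Group the response's elements by their category string."""
    groups = defaultdict(list)
    for element in response.get("elements", []):
        groups[element["category"]].append(element)
    return dict(groups)


def tables_as_markdown(response):
    """Return the Markdown of every table element, in document order."""
    return [
        element["content"]["markdown"]
        for element in response.get("elements", [])
        if element["category"] == "table"
    ]
```

For instance, `group_by_category(response.json()).get("heading1", [])` would list the top-level headings with their page numbers.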

Output Files

  • Default: write to {tmpdir}/{input file stem}.parsed.{ext}, where {ext} matches output_formats (md or html). Example: /tmp/report.parsed.md. Use tempfile.gettempdir() for cross-platform code.
  • Override: if the user specifies an output path, use it.
  • Always print the resolved absolute path in your response so the user can locate the file.
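
The output-file rules above might be implemented along these lines. This is a sketch under assumptions: the default name is taken to be the input file's stem plus `.parsed.` and the extension, inferred from the `/tmp/report.parsed.md` example, and `resolve_output_path` is a hypothetical name.

```python
import os
import tempfile

# Maps an output_formats value to its file extension.
EXTENSIONS = {"markdown": "md", "html": "html"}


def resolve_output_path(input_path, output_format, override=None):
    """Return the absolute path the parsed output should be written to."""
    if override:
        # A user-specified path always wins.
        return os.path.abspath(override)
    stem = os.path.splitext(os.path.basename(input_path))[0]
    filename = f"{stem}.parsed.{EXTENSIONS[output_format]}"
    return os.path.abspath(os.path.join(tempfile.gettempdir(), filename))
```

Printing the returned value satisfies the last rule, since the path is already absolute.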

Tips

  • Use mode=enhanced for complex tables, charts, images
  • Use mode=auto to let the API decide per page
  • Use the async API for documents over 100 pages or when sync would exceed the 5-minute timeout (async caps at 1000 pages; the 50 MB file-size limit applies to both modes)
  • Use ocr=force for scanned PDFs or images
  • merge_multipage_tables=true combines split tables (max 20 pages with enhanced mode)
  • Standard documents process in ~3 seconds; sync API timeout is 5 minutes

Detailed References

| File | Content |
|------|---------|
| references/sync-options.md | Full sync parameter reference, mode selection, curl variations, LangChain |
| references/async-workflow.md | Async submit/poll/status, Python polling pattern, retention rules |

Version History

2 versions in total

  • v1.0.5 (current)
    2026-05-08 12:07, security checks: safe, safe
  • v1.0.4
    2026-03-28 12:34, security checks: safe, safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
