
File to Markdown Converter

Convert documents, spreadsheets, images, and structured files into clean, structured Markdown optimized for AI processing. No authentication is required.
alaminrifat
Category: Security & Compliance · Source: clawhub · Version: v1.0.0 (1 version) · API key: not required
★ 0 stars · 📥 1,160 downloads · 💾 24 installs
Tags: #documents #file-conversion #latest #markdown #pdf

File to Markdown — Skill

Overview

Convert files into clean, structured, AI-ready Markdown using the markdown.new API powered by Cloudflare Workers AI toMarkdown().

Supports 20+ formats including documents, spreadsheets, images, and structured data.

No authentication required (500 requests/day per IP).


When to Use This Skill

Use this skill whenever you need to:

  • Extract text from files for LLM processing
  • Convert PDFs or Office files into Markdown
  • Normalize data into structured text
  • Process uploaded user files
  • Scrape webpage content into Markdown
  • Convert images into AI-generated descriptions + content

Common AI workflows:

  • RAG ingestion pipelines
  • Knowledge base creation
  • Document summarization
  • Dataset extraction
  • Spreadsheet analysis
  • OCR-like extraction from images

Supported Formats

Documents

  • .pdf
  • .docx
  • .odt

Spreadsheets

  • .xlsx
  • .xls
  • .xlsm
  • .xlsb
  • .et
  • .ods
  • .numbers

Images

  • .jpg
  • .jpeg
  • .png
  • .webp
  • .svg

Text & Structured Data

  • .txt
  • .md
  • .csv
  • .json
  • .xml
  • .html
  • .htm

Notes:

  • Image conversion uses AI object detection + summarization.
  • HTML URL conversion uses a web page pipeline.
  • Uploaded HTML uses Workers AI conversion.
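
The format lists above can be turned into a simple pre-flight check, so an agent can skip files the API is unlikely to accept. This is a minimal sketch: `SUPPORTED_EXTENSIONS` and `is_supported` are illustrative names, not part of the markdown.new API, and the check looks only at the file name, not the contents.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the "Supported Formats" lists above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".odt",
    ".xlsx", ".xls", ".xlsm", ".xlsb", ".et", ".ods", ".numbers",
    ".jpg", ".jpeg", ".png", ".webp", ".svg",
    ".txt", ".md", ".csv", ".json", ".xml", ".html", ".htm",
}


def is_supported(path_or_url: str) -> bool:
    """Return True if the file extension appears in the supported list.

    Hypothetical helper: it inspects only the name, never the file contents.
    """
    # For URLs, use the path component so query strings are ignored.
    path = urlparse(path_or_url).path or path_or_url
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A web page URL without a file extension returns False here even though URL conversion may still work, so treat the result as a hint rather than a hard gate.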

API Base URL

https://markdown.new

Endpoints

1️⃣ Convert Remote File (Simple GET)

Returns plain Markdown text.

GET /:file-url

Example:

curl -s "https://markdown.new/https://example.com/report.pdf"

2️⃣ Convert Remote File (JSON Response)

Returns metadata + Markdown.

GET /:file-url?format=json

Example:

curl -s "https://markdown.new/https://example.com/report.pdf?format=json"
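
The two GET variants above can be sketched in Python with the standard library alone. The sketch assumes the target URL is appended verbatim after the base URL, as in the curl examples; whether the service expects percent-encoding for unusual characters is not stated here. The names `build_get_url` and `fetch_markdown`, the 60-second timeout, and the UTF-8 decoding are assumptions of this example.

```python
import urllib.request

BASE_URL = "https://markdown.new"


def build_get_url(file_url: str, as_json: bool = False) -> str:
    """Build the request URL for the GET endpoints.

    Mirrors the curl examples: the file URL follows the base URL directly,
    and "?format=json" is appended when a JSON response is wanted.
    """
    url = f"{BASE_URL}/{file_url}"
    return f"{url}?format=json" if as_json else url


def fetch_markdown(file_url: str, timeout: float = 60.0) -> str:
    """Fetch plain Markdown for a publicly reachable file (network call).

    Assumes the response body is UTF-8 text.
    """
    with urllib.request.urlopen(build_get_url(file_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Keeping URL construction separate from the network call makes the request easy to log or cache before anything is sent.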

3️⃣ Convert Remote File via POST

Use when you want a structured JSON response.

POST /
Content-Type: application/json

Body:

{
  "url": "https://example.com/report.pdf"
}

Example:

curl -s https://markdown.new/ \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/report.pdf"}'

4️⃣ Upload Local File

Use when the file is not publicly accessible.

POST /convert
multipart/form-data

Example:

curl -s https://markdown.new/convert \
  -F "file=@document.pdf"

Response Formats

URL Conversion Response

{
  "success": true,
  "url": "https://example.com/report.pdf",
  "title": "Quarterly Report",
  "content": "# Quarterly Report\n\n...",
  "method": "Workers AI (file)",
  "duration_ms": 1200,
  "tokens": 850
}

Upload Conversion Response

{
  "success": true,
  "data": {
    "title": "Q4 Report",
    "content": "# Q4 Report\n\n...",
    "filename": "report.xlsx",
    "file_type": ".xlsx",
    "tokens": 1250,
    "processing_time_ms": 320
  }
}
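
The two sample responses differ in shape: URL conversion returns `content` at the top level, while upload conversion nests it under `data`. A small normalizer can hide that difference from the rest of a pipeline. The sketch below is based only on the two samples above, and `extract_markdown` is a hypothetical helper name.

```python
from typing import Any


def extract_markdown(payload: dict[str, Any]) -> str:
    """Return the Markdown string from either response shape shown above.

    Raises ValueError when "success" is not true or no content is present.
    """
    if payload.get("success") is not True:
        raise ValueError("conversion did not report success")
    # Upload responses nest fields under "data"; URL responses are flat.
    body = payload.get("data", payload)
    content = body.get("content") if isinstance(body, dict) else None
    if not isinstance(content, str) or not content:
        raise ValueError("response contains no Markdown content")
    return content
```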

Best Practices for AI Agents

Prefer GET for Simple Workflows

Use:

GET /:url

When:

  • You only need Markdown text
  • Speed is important
  • No metadata required

Prefer POST for Structured Pipelines

Use POST when:

  • Metadata is needed
  • Token counts are required
  • Monitoring or logging is implemented
  • Building automation workflows

File Upload Strategy

Use /convert only if:

  • File is local
  • File is private
  • File requires authentication to access

Otherwise, always prefer URL conversion.


Error Handling Strategy

Agents should:

  1. Check that the response contains "success": true
  2. Retry once on network failure
  3. Validate that the content length is greater than 0
  4. Fall back to an alternate extraction method if needed
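
One way to wire these four steps together is sketched below. The request itself is injected as a callable so the logic stays independent of any HTTP library. `convert_with_checks`, `fetch`, and `fallback` are illustrative names, and the sketch assumes the flat URL-conversion response shape.

```python
from typing import Callable, Optional


def convert_with_checks(
    fetch: Callable[[], dict],
    fallback: Optional[Callable[[], str]] = None,
) -> str:
    """Apply the error-handling steps to one conversion.

    `fetch` performs the request and returns the parsed JSON response;
    `fallback` is an optional alternate extractor. Both are supplied by
    the caller and are assumptions of this sketch.
    """
    payload = None
    for _ in range(2):  # first attempt plus a single retry
        try:
            payload = fetch()
            break
        except OSError:  # network-level failures, including timeouts
            continue

    content = ""
    if payload and payload.get("success") is True:
        content = payload.get("content") or ""
    if len(content) > 0:
        return content
    if fallback is not None:
        return fallback()
    raise RuntimeError("conversion failed and no fallback was provided")
```

Catching `OSError` covers the standard library's connection and timeout errors; other HTTP clients may raise different exception types, so adjust the `except` clause to match the client in use.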

Rate Limits

  • 500 requests/day per IP without API key
  • No signup required

Agents should:

  • Cache results when possible
  • Avoid duplicate conversions
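
A small in-memory cache is one way to follow both recommendations while keeping an eye on the daily quota. This is a sketch only: `ConversionCache` is a hypothetical class, and the conversion function is supplied by the caller.

```python
from typing import Callable


class ConversionCache:
    """In-memory cache keyed by source URL (illustrative, not persistent)."""

    def __init__(self, convert: Callable[[str], str]) -> None:
        self._convert = convert              # caller-supplied conversion function
        self._store: dict[str, str] = {}
        self.requests_made = 0               # real conversions issued so far

    def get(self, url: str) -> str:
        """Return cached Markdown, converting only on the first request."""
        if url not in self._store:
            # Count before calling: a failed request still uses quota.
            self.requests_made += 1
            self._store[url] = self._convert(url)
        return self._store[url]
```

Comparing `requests_made` against the 500 requests/day limit lets an agent stop or slow down before the service starts rejecting calls.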

Integration Examples

JavaScript (Node.js)

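// POST a public file URL and read the Markdown from the JSON response.
// Top-level await requires an ES module (.mjs or "type": "module").
const res = await fetch("https://markdown.new/", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    url: "https://example.com/file.pdf"
  })
});

const data = await res.json();
if (!data.success) throw new Error("Conversion failed");
console.log(data.content);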

Python

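import requests

# POST a public file URL; the JSON response carries the Markdown in "content".
res = requests.post(
    "https://markdown.new/",
    json={"url": "https://example.com/file.pdf"},
    timeout=60,
)
res.raise_for_status()  # surface HTTP errors early

data = res.json()
print(data["content"])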

Agent Decision Tree

If user provides:

| Input Type      | Action                 |
| --------------- | ---------------------- |
| Public file URL | Use GET or POST        |
| Local file      | Use POST /convert      |
| Image           | Convert then summarize |
| Spreadsheet     | Convert then analyze   |
| Webpage         | Convert URL HTML       |
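
The table can be read as two independent choices: which endpoint to call (based on where the file lives) and which follow-up to run (based on what kind of file it is). A minimal sketch of that reading follows; `plan_actions` and the returned step labels are illustrative, not part of the API.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".svg"}
SHEET_EXTS = {".xlsx", ".xls", ".xlsm", ".xlsb", ".et", ".ods", ".numbers"}


def plan_actions(user_input: str) -> list[str]:
    """Map a URL or local path to the actions in the decision table."""
    parsed = urlparse(user_input)
    is_url = parsed.scheme in ("http", "https")
    suffix = PurePosixPath(parsed.path if is_url else user_input).suffix.lower()

    # Choice 1: endpoint, decided by where the file lives.
    steps = ["GET or POST with URL" if is_url else "POST /convert"]
    # Choice 2: follow-up, decided by the kind of content.
    if suffix in IMAGE_EXTS:
        steps.append("summarize")
    elif suffix in SHEET_EXTS:
        steps.append("analyze")
    return steps
```

Web pages fall into the first branch with no follow-up step, since the converted HTML is already the desired output.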


Output Expectations

The Markdown should be:

  • Clean
  • Structured
  • AI-friendly
  • Minimal noise
  • Ready for LLM ingestion

Limitations

  • Complex PDF layouts may lose formatting
  • Large spreadsheets may be truncated
  • Images rely on AI interpretation accuracy
  • Token limits may apply

Summary

This skill provides a universal file-to-Markdown conversion layer for AI systems with:

  • No authentication
  • Simple HTTP interface
  • Multi-format support
  • Structured output
  • Fast processing

Ideal for document ingestion, RAG pipelines, and automation agents.


Version History

1 version in total

  • v1.0.0 (current)
    2026-03-29 12:24, security status: safe

Security Scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
