ReceiptExtract - OCR, Photo/PDF to CSV

Extract structured transaction data from image or PDF receipts using the ReceiptExtract API (https://www.receiptextract.com). Use when the user wants merchan...
yborunov
Uncategorized clawhub v1.0.1 1 version 100000 Key: required

Overview

Receipts

Extract transaction data from receipt images or PDFs with ReceiptExtract.

Keep the workflow simple: locate the API token, upload one receipt file (or a directory for bulk mode), inspect the JSON, then present either raw JSON or a cleaned summary. Prefer the bundled helper script for repeatable usage.

Quick workflow

  1. Identify the input file
    • Accept common image formats (.jpg, .jpeg, .png, .webp) and PDFs.
    • If the file came from chat, use the attached local path.
  2. Locate the API token
    • Set RECEIPTEXTRACT_API_TOKEN in your environment before running commands.
    • Do not paste the token back into chat.
  3. Call the upload endpoint
    • Endpoint: POST https://www.receiptextract.com/api/receipt/upload
    • Auth header: Authorization: Bearer $RECEIPTEXTRACT_API_TOKEN
    • Multipart form field: file
  4. Parse the response
    • Success shape typically includes:
      • success
      • data.merchant
      • data.date
      • data.items[]
      • data.tax
      • data.total
      • data.correctnessCheck
      • data.taxBreakdown[]
      • creditInfo
      • savedReceiptId
  5. Present the result
    • For humans: summarize merchant, date, items, tax, total, and any anomalies.
    • For integrations: return raw JSON or convert to CSV.
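
The first two steps above (checking the input file and locating the token) can be sketched in Python as follows. This is a minimal illustration, not the bundled helper script: the function names `is_supported_receipt` and `load_token` are hypothetical, and only the extension list and the environment variable name come from this document.

```python
import os
from pathlib import Path

# Extensions named in step 1 of the workflow.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".pdf"}


def is_supported_receipt(path):
    """Return True when the file extension is one the workflow accepts."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def load_token(environ=None):
    """Read the API token from the environment without printing it."""
    environ = os.environ if environ is None else environ
    token = environ.get("RECEIPTEXTRACT_API_TOKEN", "").strip()
    if not token:
        # The message names the variable but never includes a token value.
        raise RuntimeError("RECEIPTEXTRACT_API_TOKEN is not set")
    return token
```

Passing `environ` explicitly makes the lookup easy to exercise without touching the real environment.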

Preferred command

Use the helper script:

export RECEIPTEXTRACT_API_TOKEN="your-token"
python3 scripts/extract_receipt.py /path/to/receipt.png

Optional flags:

python3 scripts/extract_receipt.py /path/to/receipt.pdf --format summary
python3 scripts/extract_receipt.py /path/to/receipt.jpg --format csv
python3 scripts/extract_receipt.py --input-dir /path/to/receipts --format summary
python3 scripts/extract_receipt.py --input-dir /path/to/receipts --recursive --format json

Bulk processing

Use --input-dir to process multiple receipts in one run. The helper script sends one API request per file and continues even if some files fail.

  • Supported file types: .jpg, .jpeg, .png, .webp, .pdf
  • Use --recursive to include nested folders
  • Exit code is non-zero when one or more files fail
  • Each receipt consumes credits independently
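
As a rough sketch of how bulk mode might select files and derive its exit code, assuming nothing about the real script beyond the behavior listed above. The names `collect_receipts` and `exit_code` are hypothetical, and each per-file result is assumed to be a dict with a `success` key.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".pdf"}


def collect_receipts(input_dir, recursive=False):
    """List supported receipt files, optionally descending into subfolders."""
    root = Path(input_dir)
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        p for p in candidates
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def exit_code(results):
    """Return 1 when one or more per-file results failed, else 0."""
    return 1 if any(not r.get("success") for r in results) else 0
```

Sorting the paths keeps the processing order stable between runs, which makes per-file output easier to compare.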

Fallback command

Use curl when the helper script is unnecessary:

curl -sS -X POST "https://www.receiptextract.com/api/receipt/upload" \
  -H "Authorization: Bearer $RECEIPTEXTRACT_API_TOKEN" \
  -F "file=@/path/to/receipt.png"
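
For environments without curl, the same request can be approximated with the Python standard library. This is a sketch under stated assumptions: the endpoint, bearer header, and `file` form field come from this document, while `build_upload_request` and the hand-rolled multipart encoding are illustrative and not taken from the helper script.

```python
import mimetypes
import os
import urllib.request
import uuid

UPLOAD_URL = "https://www.receiptextract.com/api/receipt/upload"


def build_upload_request(path, token):
    """Build (but do not send) a multipart POST with one form field named 'file'."""
    boundary = uuid.uuid4().hex
    filename = os.path.basename(path)
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    with open(path, "rb") as handle:
        payload = handle.read()
    # One multipart part carrying the file bytes, then the closing boundary.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        payload,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        UPLOAD_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, pass the request to `urllib.request.urlopen(request, timeout=60)`. Note that `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so the status handling described under Error handling would live in an `except` branch.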

Output handling

JSON

Prefer JSON when the user wants the full extracted payload or when another tool will consume the result. In bulk mode, JSON includes processed, succeeded, failed, and per-file results.
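
One plausible way to assemble the bulk JSON report is sketched below. The `processed`, `succeeded`, and `failed` keys are named in this document; the `results` key and the per-file `success` flag are assumptions about the layout, and `bulk_report` is a hypothetical helper.

```python
def bulk_report(results):
    """Wrap per-file results with the aggregate counts used in bulk mode."""
    succeeded = sum(1 for r in results if r.get("success"))
    return {
        "processed": len(results),
        "succeeded": succeeded,
        "failed": len(results) - succeeded,
        "results": results,  # assumed key name for the per-file entries
    }
```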

Summary

In bulk mode, summary prints one status line per file followed by total counts.

For a single receipt, use a concise format like:

Merchant: Walmart
Date: 2023-06-09
Total: 76.37
Tax: 8.18
Items:
- BEDDING — 39.97
- STEAMER — 27.97
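
The format above could be produced from the `data` object roughly as follows. The item keys `description` and `totalPrice` are assumptions, since this document does not list the fields inside `data.items[]`; `format_summary` is a hypothetical name.

```python
def format_summary(data):
    """Render the concise human-readable summary for one receipt."""
    lines = [
        f"Merchant: {data.get('merchant', 'unknown')}",
        f"Date: {data.get('date', 'unknown')}",
        f"Total: {data.get('total', 'unknown')}",
        f"Tax: {data.get('tax', 'unknown')}",
        "Items:",
    ]
    for item in data.get("items") or []:
        # Item key names are assumed; adjust them to the real payload.
        name = item.get("description", "?")
        price = item.get("totalPrice", "?")
        lines.append(f"- {name} \u2014 {price}")
    return "\n".join(lines)
```

Missing fields fall back to a visible placeholder instead of being invented, in line with the error-handling guidance.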

CSV

When the user asks for CSV, output line-item rows with these columns when available:

  • source_file (bulk mode)
  • merchant
  • date
  • description
  • quantity
  • total_price
  • item_tax
  • sku
  • receipt_tax
  • receipt_total
  • saved_receipt_id
  • http_status (bulk mode)
  • success (bulk mode)
  • error (bulk mode)
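
A single-receipt conversion to these columns might look like the sketch below. The column names come from the list above; the item keys read from the payload (`description`, `quantity`, `totalPrice`, `tax`, `sku`) are assumptions, `receipt_to_csv` is a hypothetical name, and the bulk-only columns are left out for brevity.

```python
import csv
import io

COLUMNS = [
    "merchant", "date", "description", "quantity", "total_price",
    "item_tax", "sku", "receipt_tax", "receipt_total", "saved_receipt_id",
]


def receipt_to_csv(payload):
    """Emit one CSV row per line item, repeating receipt-level fields."""
    data = payload.get("data") or {}
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in data.get("items") or []:
        writer.writerow({
            "merchant": data.get("merchant", ""),
            "date": data.get("date", ""),
            # Item key names below are assumed, not confirmed by the API notes.
            "description": item.get("description", ""),
            "quantity": item.get("quantity", ""),
            "total_price": item.get("totalPrice", ""),
            "item_tax": item.get("tax", ""),
            "sku": item.get("sku", ""),
            "receipt_tax": data.get("tax", ""),
            "receipt_total": data.get("total", ""),
            "saved_receipt_id": payload.get("savedReceiptId", ""),
        })
    return buffer.getvalue()
```

Using `csv.DictWriter` rather than string joins means merchant names containing commas or quotes are escaped correctly.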

Error handling

Interpret common failures like this:

  • 400 — invalid input, missing file, unsupported type, or file too large
  • 401 — missing/invalid token
  • 402 — insufficient credits
  • 429 — rate limited; retry with backoff
  • 500 — server error; safe to retry carefully
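
The status table above, together with a retry loop for the two retryable codes, can be sketched like this. The retry policy (four attempts, exponential delays starting at one second) is an illustrative assumption, not something the API documents here; `describe_status` and `call_with_backoff` are hypothetical names, and `send` is any callable that returns a `(status, body)` pair.

```python
import time

ERROR_HINTS = {
    400: "invalid input, missing file, unsupported type, or file too large",
    401: "missing or invalid token",
    402: "insufficient credits",
    429: "rate limited; retry with backoff",
    500: "server error; safe to retry carefully",
}
RETRYABLE = {429, 500}


def describe_status(status):
    """Map an HTTP status to the plain-language hint from the table."""
    return ERROR_HINTS.get(status, f"unexpected HTTP status {status}")


def call_with_backoff(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        # Exponential backoff: base_delay, then 2x, then 4x, and so on.
        sleep(base_delay * 2 ** attempt)
```

Client errors such as 401 and 402 are returned immediately, since retrying them would only repeat the failure.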

If the response is malformed or success is false:

  • show the error plainly
  • do not invent extracted data
  • mention likely causes if obvious (bad token, no credits, unsupported file)
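
A defensive parse that follows these rules might look like the sketch below. It returns either the `data` object or a plain error string and never fills in missing values. The `error` and `message` keys it checks for are assumptions about the failure payload, and `extract_data` is a hypothetical name.

```python
import json


def extract_data(body_text):
    """Return (data, None) on success or (None, error_message) otherwise."""
    try:
        payload = json.loads(body_text)
    except (TypeError, ValueError):
        return None, "response was not valid JSON"
    if not isinstance(payload, dict):
        return None, "response JSON was not an object"
    if payload.get("success") is not True:
        # Key names for the failure detail are assumed, not documented here.
        detail = payload.get("error") or payload.get("message") or "no error message provided"
        return None, f"API reported failure: {detail}"
    data = payload.get("data")
    if not isinstance(data, dict):
        return None, "success response is missing the data object"
    return data, None
```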

Practical notes

  • Treat the API result as the source of truth, but sanity-check obvious issues.
  • Flag suspicious output instead of silently "fixing" it, for example a Canadian receipt with tax currency labeled USD.
  • correctnessCheck: true is a useful confidence signal, not a guarantee.
  • Preserve the original file path and savedReceiptId when useful for traceability.
  • In bulk mode, keep one request per file and preserve each source file path for traceability.
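
A small sanity check in the spirit of these notes is sketched below. It only reports flags and never alters the extracted values. The specific checks are illustrative choices, `find_anomalies` is a hypothetical name, and numeric `total` and `tax` values are assumed.

```python
def find_anomalies(data):
    """Return a list of human-readable flags; an empty list means none found."""
    flags = []
    if data.get("correctnessCheck") is not True:
        flags.append("correctnessCheck is not true")
    for field in ("merchant", "date", "total"):
        if data.get(field) in (None, ""):
            flags.append(f"missing {field}")
    total = data.get("total")
    tax = data.get("tax")
    # Only compare when both values are numeric; otherwise skip the check.
    if isinstance(total, (int, float)) and isinstance(tax, (int, float)) and tax > total:
        flags.append("tax exceeds total")
    return flags
```

An empty result does not prove the receipt is correct; it only means none of these particular checks fired.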

Security

  • Keep the token out of chat replies.
  • Prefer environment variables or secret managers over embedding tokens in scripts.
  • Do not commit tokens, raw headers, or secret-bearing examples into git.
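
One way to keep the token out of replies and logs is to scrub it from any text before it is shown. This is a minimal sketch with a hypothetical helper name, `redact`; it only catches exact occurrences of the token string.

```python
def redact(text, token):
    """Replace every exact occurrence of the token with a placeholder."""
    if not token:
        return text
    return text.replace(token, "[REDACTED]")
```

For example, `redact(header_dump, token)` can be applied to debug output before it is echoed back to the user.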

Resources

  • Helper script: scripts/extract_receipt.py
  • API reference notes: references/api.md

Version history

1 version in total

  • v1.0.1 (current)
    2026-05-07 08:30 (security status: safe)

Security scan

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
