Overview

Document Extraction

This skill extracts and recognizes text from documents, including PDF and DOCX files, by calling the external GITEE AI API.

Usage

Install the required dependencies first (pip install requests requests-toolbelt), then run the bundled script to perform the extraction:

python {baseDir}/scripts/perform_doc_extraction.py --file /path/to/document.pdf --api-key YOUR_API
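When the skill is driven from Python rather than a shell, the same command can be assembled as an argument list. The following is a minimal sketch, not the skill's own code: the helper names build_command and run_extraction are hypothetical, and the sketch assumes the script sits under scripts/ in the skill's base directory, as the command above suggests.

```python
import subprocess
import sys
from pathlib import Path


def build_command(base_dir: str, file_path: str, api_key: str) -> list[str]:
    """Assemble the argument list for the bundled extraction script.

    Hypothetical helper: only the script name and the --file / --api-key
    flags come from the documentation above.
    """
    script = Path(base_dir) / "scripts" / "perform_doc_extraction.py"
    return [sys.executable, str(script), "--file", file_path, "--api-key", api_key]


def run_extraction(base_dir: str, file_path: str, api_key: str) -> str:
    """Run the script and return its standard output as text."""
    completed = subprocess.run(
        build_command(base_dir, file_path, api_key),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Passing the arguments as a list avoids shell quoting problems when the document path contains spaces.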

Options

Beyond --file and --api-key, no additional parameters are required for this skill.

Workflow

Execute the perform_doc_extraction.py script with the parameters from the user.
Parse the script output and find the line starting with EXTRACTION_RESULT:.
Extract the OCR result from that line (format: EXTRACTION_RESULT: ...).
Display the OCR result to the user using markdown syntax: 📖[EXTRACTION_RESULT Result].
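The parsing and display steps above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the result is assumed to fit on the single line that starts with EXTRACTION_RESULT:, and the bracketed placeholder in the display format is assumed to stand for the extracted text.

```python
from typing import Optional

RESULT_PREFIX = "EXTRACTION_RESULT:"


def parse_extraction_result(stdout: str) -> Optional[str]:
    """Return the text after the first EXTRACTION_RESULT: line, or None.

    Assumes the whole result is printed on one line.
    """
    for line in stdout.splitlines():
        if line.startswith(RESULT_PREFIX):
            return line[len(RESULT_PREFIX):].strip()
    return None


def format_for_display(result: str) -> str:
    """Wrap the result in the display format (assumed meaning of the placeholder)."""
    return f"📖[{result}]"
```

Returning None when no matching line exists lets the caller report a failed extraction instead of displaying an empty result.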

Notes

If GITEEAI_API_KEY is not set, remind the user to provide the --api-key argument.
Handle the script output carefully: extract and display only the relevant information, without adding commentary or interpretation.
You may tidy the output format to make it more concise and user-friendly, but do not change or omit any of the result's content.
The script prints a line starting with EXTRACTION_RESULT: in its output. Always look for that line, extract the result, and display it as 📖[EXTRACTION_RESULT Result].
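The API key check in the first note might look like the sketch below. It rests on two assumptions that the notes do not confirm: that GITEEAI_API_KEY is an environment variable, and that an explicit --api-key value takes precedence over it. The names resolve_api_key and REMINDER are hypothetical.

```python
import os
from typing import Mapping, Optional

# Hypothetical reminder text shown when no key is available.
REMINDER = "No API key found. Please provide one with the --api-key argument."


def resolve_api_key(
    cli_value: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Pick the API key to use, or return None if there is none.

    Assumption: GITEEAI_API_KEY is read from the environment, and a
    non-empty --api-key value overrides it.
    """
    if cli_value:
        return cli_value
    return env.get("GITEEAI_API_KEY") or None
```

A caller would show REMINDER to the user whenever resolve_api_key returns None, before attempting to run the script.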

Version History

1 version in total

v1.0.0 (current)

2026-03-31 08:06, status: Safe

Security Check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk

