large-document-reader

Intelligently splits long academic or technical documents into chapters, generates structured JSON summaries for each, and creates a file system with a globa...
Intelligently splits long academic or technical documents, generates structured chapter summaries, and builds a global file system.
mrchenkuan
Productivity Tools clawhub v1.0.2 1 version 100000 Key: not required

Overview

Literature Structuring Expert

Automatically decompose long documents (papers, reports, books) into a structured, AI-friendly knowledge base. Splits by chapter, generates machine-readable summaries, and builds a navigable index to overcome context limits.

When to Use This Skill

Use this skill when the user:

  • Has a document that is too long for the AI's context window.
  • Needs to perform cross-chapter analysis or get a high-level overview of a long text.
  • Wants to build a reusable, queryable knowledge base from a PDF, Markdown, or text file.
  • Asks: "How can I get my AI to read this whole book/paper?"

Quick Reference

| Situation | Action |
|---|---|
| User provides a long document | 1. Analyze and split it into chapters. 2. Generate a JSON summary for each chapter. 3. Create a master index file. |
| User asks a high-level, cross-chapter question | Provide the content of the MASTER_INDEX.md file to the AI. |
| User asks a detailed, chapter-specific question | Provide the corresponding single file from the ./chapters/ directory to the AI. |
| Task completed | Present the generated file tree and MASTER_INDEX.md preview to the user. |

Core Workflow

Phase 1: Intelligent Splitting

  1. Analyze Input: Receive the long document text or file path.
  2. Identify Structure: Automatically analyze the document to identify heading hierarchies (e.g., #, ##, 1., 1.1) to determine chapter boundaries. Prioritize user-specified splitting preferences.
  3. Execute Split: Split the document into one independent Markdown (.md) file per chapter.
    • Naming Convention: {sequence_number}_{chapter_title}.md (e.g., 01_Introduction.md).
    • Storage Location: All chapter files are saved in the ./chapters/ directory.
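The splitting phase above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: it assumes chapters begin at Markdown level-1 or level-2 headings, uses flat sequence numbers (01, 02, ...) rather than hierarchical IDs such as 02_1, and does not skip `#` lines inside fenced code blocks. The names `split_into_chapters`, `write_chapters`, and `slugify`, and the "Preamble" chapter for text before the first heading, are hypothetical.

```python
import re
from pathlib import Path

# Assumption: a chapter starts at a Markdown "#" or "##" heading.
HEADING = re.compile(r"^(#{1,2})\s+(.+?)\s*$")


def slugify(title: str) -> str:
    """Turn a heading into a filename-safe fragment ('Related Work' -> 'Related_Work')."""
    cleaned = re.sub(r"[^\w\s-]", "", title).strip()
    return re.sub(r"\s+", "_", cleaned) or "Untitled"


def split_into_chapters(text: str) -> list[tuple[str, str]]:
    """Return (filename, body) pairs, one per detected chapter."""
    sections: list[tuple[str, list[str]]] = []
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading opens a new chapter.
            sections.append((match.group(2), [line]))
        elif sections:
            sections[-1][1].append(line)
        elif line.strip():
            # Text before the first heading goes into a hypothetical "Preamble" chapter.
            sections.append(("Preamble", [line]))
    return [
        (f"{i:02d}_{slugify(title)}.md", "\n".join(lines).strip() + "\n")
        for i, (title, lines) in enumerate(sections, start=1)
    ]


def write_chapters(text: str, root: Path) -> list[Path]:
    """Write each chapter to <root>/chapters/ and return the created paths."""
    out_dir = root / "chapters"
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, body in split_into_chapters(text):
        path = out_dir / name
        path.write_text(body, encoding="utf-8")
        paths.append(path)
    return paths
```

Documents that use numbered headings (1., 1.1) or a user-specified splitting preference would need a different boundary pattern in place of `HEADING`.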

Phase 2: Summary Generation & Structuring

  1. Generate Summary per Chapter: For each file in ./chapters/, generate a corresponding JSON summary file.
    • Structured Fields (JSON format):

```json
{
  "chapter_id": "Unique identifier matching the filename, e.g., 02_1",
  "chapter_title": "Chapter Title",
  "abstract": "Core summary of the chapter, 200-300 words.",
  "keywords": ["Keyword1", "Keyword2", "Keyword3"],
  "key_points": ["Key point one", "Key point two"],
  "related_sections": ["IDs of other chapters strongly related to this one"]
}
```

    • Storage Location: JSON summary files are saved in the ./summaries/ directory (e.g., 01_Introduction.summary.json).
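Because the summaries are generated by a model, it may help to check each record against the field list before saving it. The sketch below is one possible way to do that, assuming the six fields shown in the JSON template; the helper names `validate_summary` and `save_summary` are hypothetical and not part of the skill. The text of the summary itself still has to come from the AI.

```python
import json
from pathlib import Path

# Field names and types taken from the JSON template above.
REQUIRED_FIELDS = {
    "chapter_id": str,
    "chapter_title": str,
    "abstract": str,
    "keywords": list,
    "key_points": list,
    "related_sections": list,
}


def validate_summary(summary: dict) -> list[str]:
    """Return a list of problems; an empty list means the record fits the template."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in summary:
            problems.append(f"missing field: {field}")
        elif not isinstance(summary[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    for field in ("keywords", "key_points", "related_sections"):
        value = summary.get(field)
        if isinstance(value, list) and not all(isinstance(v, str) for v in value):
            problems.append(f"{field} should contain only strings")
    return problems


def save_summary(summary: dict, chapter_path: Path, root: Path) -> Path:
    """Write the summary as <root>/summaries/<chapter stem>.summary.json."""
    problems = validate_summary(summary)
    if problems:
        raise ValueError("; ".join(problems))
    out_dir = root / "summaries"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{chapter_path.stem}.summary.json"
    out_path.write_text(
        json.dumps(summary, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return out_path
```

Deriving the output name from the chapter file's stem keeps each summary paired with its chapter, as in 01_Introduction.md and 01_Introduction.summary.json.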

Phase 3: Create Global Index

  1. Aggregate Information: Collect data from all JSON files in ./summaries/.
  2. Generate Index: Create a global index file, MASTER_INDEX.md.
    • Content: Lists all chapters' IDs, titles, a short abstract preview, and keywords in a Markdown list or table.
    • Purpose: Provides a "bird's-eye view" for quick navigation and high-level Q&A.
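The aggregation step could look roughly like the following. This is a sketch under assumptions: it renders the index as a Markdown table (the description allows a list or a table), truncates abstracts to a 120-character preview, and sorts by `chapter_id`. The function names and the preview length are illustrative choices, not values defined by the skill.

```python
import json
from pathlib import Path


def preview(text: str, limit: int = 120) -> str:
    """Collapse whitespace and shorten an abstract to at most `limit` characters."""
    flat = " ".join(text.split())
    return flat if len(flat) <= limit else flat[: limit - 3].rstrip() + "..."


def build_index(summaries: list[dict]) -> str:
    """Render chapter IDs, titles, abstract previews, and keywords as a Markdown table."""
    lines = [
        "# Master Index",
        "",
        "| ID | Title | Abstract preview | Keywords |",
        "|---|---|---|---|",
    ]
    for s in sorted(summaries, key=lambda item: item["chapter_id"]):
        cells = [
            s["chapter_id"],
            s["chapter_title"],
            preview(s["abstract"]),
            ", ".join(s["keywords"]),
        ]
        # Escape pipes so free text cannot break the table layout.
        lines.append("| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |")
    return "\n".join(lines) + "\n"


def write_master_index(root: Path) -> Path:
    """Read every file in <root>/summaries/ and write <root>/MASTER_INDEX.md."""
    summaries = [
        json.loads(p.read_text(encoding="utf-8"))
        for p in sorted((root / "summaries").glob("*.summary.json"))
    ]
    out_path = root / "MASTER_INDEX.md"
    out_path.write_text(build_index(summaries), encoding="utf-8")
    return out_path
```

Keeping only a preview of each abstract is what keeps the index small enough to hand to a model in a single prompt.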

Final Deliverables & File Structure

Upon completion, the following file tree is generated:

Project_Root/
├── chapters/           # 【Source Repository】Contains all split chapter texts (.md files)
│   ├── 01_Introduction.md
│   ├── 02_1_Experimental_Methods.md
│   └── ...
├── summaries/          # 【Summary Repository】Contains all structured JSON summaries
│   ├── 01_Introduction.summary.json
│   ├── 02_1_Experimental_Methods.summary.json
│   └── ...
└── MASTER_INDEX.md     # 【Global Navigation】Core document summary index

Usage Instructions for the User

For Global, Cross-Chapter Queries (e.g., “What is the paper's main thesis?”):

  • Provide the content of the MASTER_INDEX.md file to the AI. This is token-efficient.

For Specific, In-Depth Queries Within a Chapter (e.g., “What were the parameters in the 'Methods' section?”):

  • Provide the corresponding single chapter file from the chapters/ directory to the AI for full context.
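The two query modes above amount to a simple choice of which file to load as context. A minimal sketch, assuming the file tree shown earlier and a hypothetical helper named `select_context`:

```python
from pathlib import Path
from typing import Optional


def select_context(root: Path, chapter_id: Optional[str] = None) -> str:
    """Return MASTER_INDEX.md for global questions, or one chapter file for detailed ones."""
    if chapter_id is None:
        # Global, cross-chapter query: the index alone is the token-efficient choice.
        return (root / "MASTER_INDEX.md").read_text(encoding="utf-8")
    # Chapter-specific query: find the file whose name starts with the given ID.
    matches = sorted((root / "chapters").glob(f"{chapter_id}_*.md"))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected one chapter for id {chapter_id!r}, found {len(matches)}"
        )
    return matches[0].read_text(encoding="utf-8")
```

An ID that is a prefix of several files (for example 02 when both 02_1_... and 02_2_... exist) raises an error here instead of guessing; a caller could catch it and ask the user which chapter they meant.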

Version History

1 version in total

  • v1.0.2 (current)
    2026-03-29 16:09 Safe Safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
