
tree-graph-rag

Guide for designing and implementing a PostgreSQL database that fuses PageIndex-style document trees with LightRAG-style entity-relationship anchors. Use thi...
Author: h4444433333
Source: clawhub · Version: v1.0.0 · Key: not required

Overview

Tree-Graph Hybrid RAG

This skill teaches Claude how to build the database layer of a Tree-Graph Hybrid RAG system. It focuses on the integration seam between PageIndex-style tree output and LightRAG-style graph extraction, both stored in PostgreSQL.

Core Philosophy

  • Tree (Macro): Represents the document's native hierarchy. Gives the LLM the structural skeleton (Chapter -> Section).
  • Graph (Micro): Represents Entities and Relationships. Gives the LLM cross-document, fine-grained factual connections.
  • Fusion: Every node and edge in the Graph is anchored to a specific node_id in the Tree, enabling bidirectional traversal (from graph detail to tree context, or tree context to graph detail).
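
The anchoring idea above can be sketched in a few lines of plain Python. This is only an in-memory illustration, not the bundled schema: the class names, field names, and sample data (`Acme Corp`, `Q4 revenue`) are assumptions made for the example. It shows both directions of traversal: from a graph element up to its tree context, and from a tree node down to the graph elements anchored there.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TreeNode:
    node_id: str
    parent_id: Optional[str]
    title: str


@dataclass(frozen=True)
class Entity:
    name: str
    node_id: str  # anchor into the tree (hypothetical field name)


@dataclass(frozen=True)
class Relationship:
    source: str
    target: str
    label: str
    node_id: str  # anchor into the tree (hypothetical field name)


# Illustrative sample data only.
nodes = {
    "n1": TreeNode("n1", None, "Chapter 3: Financials"),
    "n2": TreeNode("n2", "n1", "3.1 Revenue"),
}
entities = [Entity("Acme Corp", "n2"), Entity("Q4 revenue", "n2")]
relationships = [Relationship("Acme Corp", "Q4 revenue", "reported", "n2")]


def tree_context(node_id: str) -> list:
    """Graph -> tree: walk parent_id upward, return titles root-first."""
    path = []
    current = nodes.get(node_id)
    while current is not None:
        path.append(current.title)
        current = nodes.get(current.parent_id) if current.parent_id else None
    return list(reversed(path))


def entities_under(node_id: str) -> list:
    """Tree -> graph: names of entities anchored directly at this node."""
    return [e.name for e in entities if e.node_id == node_id]
```

For instance, `tree_context(relationships[0].node_id)` resolves a relationship back to the chapter and section it was extracted from.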

Bundled Resources

This skill includes the minimum resources needed to teach Claude the database design and data flow:

  • schema.sql: The complete PostgreSQL table definitions required for this architecture.
  • ingestion_core.py: Python script demonstrating how to flatten the Tree JSON into Postgres and how to extract graph entities anchored to the tree.
  • retrieval_core.py: Python script demonstrating the Hybrid Retrieval logic (Querying the Graph to find Tree node_ids, then extracting the macro context).
  • smoke_test.py: Minimal no-database smoke test that validates the ingestion and retrieval flow with a fake pool.
  • integration-pattern.md: Explains what this skill covers, what it intentionally does not reimplement, and where it should sit in a real service.
  • queries.md: Common SQL patterns for loading skeletons, anchoring graph hits, and assembling answer context.

Standard Workflows

1. Indexing Workflow

  1. Tree Extraction: Extract headers/TOC. Save skeleton to nodes and text to node_contents.
  2. Graph Extraction: Pass each node's text from node_contents to an LLM to extract entities and relations.
  3. Anchoring: Save entities/relations with their corresponding node_id as a foreign key.
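
The three indexing steps might look roughly like the following. This is a minimal sketch and not the contents of ingestion_core.py: the JSON keys (`node_id`, `title`, `text`, `nodes`), the row shapes, and the `fake_extract` stand-in for the LLM call are all assumptions, so adjust them to whatever your tree extractor and schema actually use. Rows are built as plain dicts; writing them to PostgreSQL is left out.

```python
def flatten_tree(tree, workspace, parent_id=None):
    """Depth-first walk that yields one (node_row, content_row) pair per node.

    The skeleton row carries no text; the text goes into a separate row.
    """
    node_row = {
        "workspace": workspace,
        "node_id": tree["node_id"],
        "parent_id": parent_id,
        "title": tree["title"],
    }
    content_row = {
        "workspace": workspace,
        "node_id": tree["node_id"],
        "text": tree.get("text", ""),
    }
    yield node_row, content_row
    for child in tree.get("nodes", []):
        yield from flatten_tree(child, workspace, tree["node_id"])


def anchor_entities(content_rows, extract):
    """Run the extractor on each node's text and tag every hit with node_id."""
    rows = []
    for content in content_rows:
        for name in extract(content["text"]):
            rows.append({
                "workspace": content["workspace"],
                "name": name,
                "node_id": content["node_id"],
            })
    return rows


def fake_extract(text):
    """Hypothetical stand-in for the LLM: treats capitalized words as entities."""
    return sorted({w.strip(".,") for w in text.split() if w[:1].isupper()})


# Illustrative input; the key names are assumptions.
tree = {
    "node_id": "0001",
    "title": "Chapter 3: Financials",
    "text": "",
    "nodes": [
        {"node_id": "0002", "title": "3.1 Revenue",
         "text": "Acme reported revenue growth."},
    ],
}

pairs = list(flatten_tree(tree, "ws1"))
node_rows = [node for node, _ in pairs]
content_rows = [content for _, content in pairs]
entity_rows = anchor_entities(content_rows, fake_extract)
```

Keeping `flatten_tree` a generator means a large tree can be streamed into batched inserts without materializing every row first.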

2. Retrieval Workflow

  1. Entity/Relation Search: Extract keywords from the user query. Search the entities and relationships tables to find matching factual details.
  2. Anchor Resolution: Get the node_ids associated with the matched graph elements.
  3. Contextualization (Tree Traversal): Query the nodes table using the node_ids. Traverse up (parent_id) to gather the section titles and summaries.
  4. Content Fetch: Retrieve the full text from node_contents only for the required nodes.
  5. Synthesis: Feed the LLM a prompt containing:
    • Found Entities & Relations
    • Tree Context (e.g., "This was mentioned in Chapter 3: Financials")
    • Raw Text Chunks
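
As a rough illustration of steps 1 to 4, the sketch below starts from an entity hit, walks `parent_id` upward with a recursive CTE, and fetches text only for the anchored node. It uses Python's built-in `sqlite3` purely so the example runs without a server; `WITH RECURSIVE` is also available in PostgreSQL, but placeholder syntax depends on the driver. The table layouts and sample rows are assumptions, not the bundled schema.sql.

```python
import sqlite3

# In-memory stand-in for the PostgreSQL database (assumed, simplified columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (workspace TEXT, node_id TEXT, parent_id TEXT, title TEXT);
CREATE TABLE node_contents (workspace TEXT, node_id TEXT, content TEXT);
CREATE TABLE entities (workspace TEXT, name TEXT, node_id TEXT);
INSERT INTO nodes VALUES
    ('ws1', 'n1', NULL, 'Chapter 3: Financials'),
    ('ws1', 'n2', 'n1', '3.1 Revenue');
INSERT INTO node_contents VALUES ('ws1', 'n2', 'Acme reported revenue growth.');
INSERT INTO entities VALUES ('ws1', 'Acme', 'n2');
""")

# Walk from the anchored node up to the root; the largest depth is the root.
ANCESTORS_SQL = """
WITH RECURSIVE path(node_id, parent_id, title, depth) AS (
    SELECT node_id, parent_id, title, 0
    FROM nodes
    WHERE workspace = ? AND node_id = ?
    UNION ALL
    SELECT n.node_id, n.parent_id, n.title, p.depth + 1
    FROM nodes n
    JOIN path p ON n.node_id = p.parent_id
    WHERE n.workspace = ?
)
SELECT title FROM path ORDER BY depth DESC
"""


def retrieve(workspace, keyword):
    """Entity search -> anchor resolution -> tree context -> content fetch."""
    hits = conn.execute(
        "SELECT name, node_id FROM entities WHERE workspace = ? AND name LIKE ?",
        (workspace, f"%{keyword}%"),
    ).fetchall()
    results = []
    for name, node_id in hits:
        titles = [row[0] for row in
                  conn.execute(ANCESTORS_SQL, (workspace, node_id, workspace))]
        content = conn.execute(
            "SELECT content FROM node_contents WHERE workspace = ? AND node_id = ?",
            (workspace, node_id),
        ).fetchone()
        results.append({
            "entity": name,
            "tree_context": " > ".join(titles),
            "text": content[0] if content else "",
        })
    return results
```

Each result dict holds the three ingredients listed for the synthesis prompt: the matched graph element, its tree context, and the raw text chunk.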

Output Expectations

When this skill is triggered, prefer producing:

  1. PostgreSQL DDL or migration SQL
  2. Tree-flattening ingestion code
  3. Graph anchoring logic tied to node_id
  4. Retrieval SQL that starts from graph hits and resolves back to tree context
  5. Clear explanation of why this database design is preferable to storing one giant nested JSON blob

Developer Guidelines

  • Always enforce bone-meat separation, that is, keep the structural skeleton apart from the text: never store massive text chunks in the nodes or entities tables; full text belongs in node_contents.
  • Always maintain multi-tenancy: Ensure every query filters by workspace.
  • When users ask to implement a retrieval function, write SQL queries that join relationships -> nodes -> node_contents to demonstrate the hybrid power.
  • Do not build a full product scaffold inside the skill. Keep the focus on database design, ingestion, anchoring, and retrieval patterns.
  • Do not rewrite PageIndex or LightRAG in full inside the skill. Reuse their existing pipelines and apply this skill at the integration seam.
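
The multi-tenancy and join guidelines above can be combined in one query. The sketch below is an assumption-laden illustration rather than a query from queries.md: the column names (`source`, `label`, `target`) and sample rows are hypothetical, and `sqlite3` stands in for PostgreSQL so the example is runnable as-is. Note that `workspace` appears both in the `WHERE` clause and in every join condition, so a `node_id` reused by another tenant can never leak into the result.

```python
import sqlite3

# relationships -> nodes -> node_contents, scoped to one workspace throughout.
HYBRID_SQL = """
SELECT r.source, r.label, r.target, n.title, c.content
FROM relationships r
JOIN nodes n
  ON n.workspace = r.workspace AND n.node_id = r.node_id
JOIN node_contents c
  ON c.workspace = n.workspace AND c.node_id = n.node_id
WHERE r.workspace = :ws
  AND (r.source = :name OR r.target = :name)
"""


def build_demo_db():
    """Create an in-memory database with two tenants sharing the same node_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE nodes (workspace TEXT, node_id TEXT, parent_id TEXT, title TEXT);
    CREATE TABLE node_contents (workspace TEXT, node_id TEXT, content TEXT);
    CREATE TABLE relationships (
        workspace TEXT, source TEXT, label TEXT, target TEXT, node_id TEXT
    );
    INSERT INTO nodes VALUES
        ('ws1', 'n2', 'n1', '3.1 Revenue'),
        ('ws2', 'n2', NULL, 'Other tenant');
    INSERT INTO node_contents VALUES
        ('ws1', 'n2', 'Acme reported revenue growth.'),
        ('ws2', 'n2', 'Unrelated text.');
    INSERT INTO relationships VALUES
        ('ws1', 'Acme', 'reported', 'Q4 revenue', 'n2'),
        ('ws2', 'Acme', 'acquired', 'Beta', 'n2');
    """)
    return conn


def hybrid_lookup(conn, workspace, name):
    """Return relationship facts together with the anchoring title and text."""
    return conn.execute(HYBRID_SQL, {"ws": workspace, "name": name}).fetchall()
```

In a real PostgreSQL schema the same scoping would typically be backed by composite keys on `(workspace, node_id)`, though that is a design suggestion and not something the source specifies.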

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-31 10:16 · security scans: safe
