
The Librarian

Build and search lightweight quantized document indexes with TurboVec. Use when you need to create searchable indexes from documents for RAG applications wit...
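Build lightweight quantized document indexes with TurboVec and search them; suited to generating searchable indexes from documents for RAG applications.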


Lightweight document search with TurboVec quantization. Build semantic search indexes that run on minimal hardware.

Author: RandTrad Consulting — Document Intelligence for SMEs

License: MIT — Free for personal and commercial use with attribution

What It Does

  • Builds quantized vector indexes from Markdown/text documents
  • Supports hybrid search (vector + BM25 keyword matching)
  • Optional Flashrank reranking for improved accuracy
  • Chunk expansion for surrounding context
  • 8-16x smaller indexes than FAISS
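
Chunk expansion can be pictured as returning each hit together with its neighbours from the same source file. The Python sketch below assumes chunks are stored in document order and that each one carries `source` and `text` fields; both are assumptions, since the actual `chunks.json` schema is not documented here.

```python
def expand_hit(chunks, hit_index, expand=1):
    """Return the hit chunk plus up to `expand` neighbours on each side.

    `chunks` is a list of dicts in document order; the `source` and `text`
    keys are hypothetical and stand in for whatever chunks.json stores.
    """
    source = chunks[hit_index]["source"]
    start = max(0, hit_index - expand)
    end = min(len(chunks), hit_index + expand + 1)
    # Keep only neighbours from the same file so context never crosses documents.
    window = [c for c in chunks[start:end] if c["source"] == source]
    return " ".join(c["text"] for c in window)


chunks = [
    {"source": "a.md", "text": "Cue."},
    {"source": "a.md", "text": "Routine."},
    {"source": "a.md", "text": "Reward."},
    {"source": "b.md", "text": "Unrelated."},
]
print(expand_hit(chunks, 2, expand=1))
```

Filtering on the source file is a design choice in this sketch: it stops the last chunk of one document from pulling in the first chunk of the next.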

When to Use

| Use Case | Choose The Librarian |
|----------|----------------------|
| Resource-constrained hardware | ✅ Runs on Raspberry Pi, 512MB RAM |
| Personal knowledge base | ✅ Zero infrastructure |
| Embedded/offline deployment | ✅ No cloud, no database |
| 100K+ documents on limited hardware | ✅ Fits where FAISS doesn't |
| Medical/legal records | ❌ Use FAISS instead |
| Maximum accuracy required | ❌ Use FAISS + Flashrank |

Accuracy: ~97-98% of FAISS for 4-bit quantization. Top results may occasionally swap ranking.

Quick Start

Prerequisites

# Install BLAS library (required for TurboVec)
sudo apt install libblas3

# Create venv and install dependencies
cd /path/to/the-librarian
python3 -m venv venv
source venv/bin/activate
pip install turbovec numpy requests rank-bm25 flashrank

Build an Index

# Using the wrapper (recommended)
./scripts/librarian build /path/to/documents/ index/my_library

# With options
./scripts/librarian build /path/to/docs/ index/my_library --bits 3 --chunk-size 800

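# Direct Python (the libblas path below is for x86_64 Debian/Ubuntu; on 64-bit ARM
# such as a Raspberry Pi it is typically /usr/lib/aarch64-linux-gnu/libblas.so.3)
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libblas.so.3 \
  python scripts/build_index.py --input /path/to/docs/ --output index/my_library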
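
The `--chunk-size` option controls how documents are split before embedding. The sketch below shows one common approach: packing paragraphs into chunks of at most `chunk_size` characters. Whether `build_index.py` counts characters or tokens, and whether it overlaps chunks, is not stated here, so treat `chunk_text` as a hypothetical illustration rather than the script's actual logic.

```python
def chunk_text(text, chunk_size=800):
    """Greedily pack paragraphs into chunks of at most `chunk_size` characters.

    Illustrative only: a paragraph longer than `chunk_size` is hard-split.
    """
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        # Hard-split oversized paragraphs into fixed-width pieces.
        pieces = [para[i:i + chunk_size] for i in range(0, len(para), chunk_size)]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= chunk_size:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Smaller chunks give more precise hits but more vectors to store; larger chunks do the opposite, which is why the option sits next to `--bits` when tuning for tight memory.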

Search

# Pure vector search
./scripts/librarian search "habit formation" index/my_library

# Hybrid (vector + BM25)
./scripts/librarian search "habit formation" index/my_library --hybrid

# Hybrid + rerank (best accuracy)
./scripts/librarian search "habit formation" index/my_library --hybrid --rerank

# With context expansion
./scripts/librarian search "habit formation" index/my_library --hybrid --rerank --expand 1

# JSON output
./scripts/librarian search "habit formation" index/my_library --json
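
For use from another program, the JSON mode can be wrapped in a few lines of Python. The sketch below uses only the flags shown above and assumes that `--json` can be combined with them and that the wrapper prints nothing but JSON to stdout; the output schema is not documented here, so the parsed result is returned as-is. The function names are hypothetical.

```python
import json
import subprocess


def build_search_command(query, index_dir, hybrid=False, rerank=False, expand=None):
    """Assemble the wrapper invocation using only the flags shown above."""
    cmd = ["./scripts/librarian", "search", query, index_dir, "--json"]
    if hybrid:
        cmd.append("--hybrid")
    if rerank:
        cmd.append("--rerank")
    if expand is not None:
        cmd += ["--expand", str(expand)]
    return cmd


def run_search(query, index_dir, **options):
    """Run the search and parse stdout as JSON (assumes stdout is pure JSON)."""
    result = subprocess.run(
        build_search_command(query, index_dir, **options),
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)
```

Passing the command as a list avoids shell quoting problems with multi-word queries such as "habit formation".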

Search Modes

| Mode | Time | Accuracy | Use Case |
|------|------|----------|----------|
| Vector only | ~130ms | Good | Semantic concepts, synonyms |
| Hybrid | ~140ms | Better | Combines semantic + exact keywords |
| Hybrid + rerank | ~320ms | Best | Maximum precision |
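
How the vector and BM25 rankings are merged in hybrid mode is not described here. One widely used technique is reciprocal rank fusion (RRF), sketched below in plain Python as an illustration of the idea; it is an assumption, not necessarily what `search.py` does, and the chunk ids are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into one list.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1; k=60 is the value commonly used for RRF.
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, highest first; ties break on id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


vector_hits = ["c3", "c1", "c7"]   # semantic neighbours (hypothetical ids)
bm25_hits = ["c1", "c9", "c3"]     # exact keyword matches (hypothetical ids)
print(reciprocal_rank_fusion([vector_hits, bm25_hits]))
```

RRF works on ranks rather than raw scores, so cosine similarities and BM25 scores never have to be put on a common scale. Chunks that both retrievers rank highly rise to the top.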

Bit Width Options

| Bits | Compression | Accuracy | Use Case |
|------|-------------|----------|----------|
| 4-bit | 8x | ~97-98% | Default, best balance |
| 3-bit | 10.7x | ~95-96% | Tight memory |
| 2-bit | 16x | ~93-95% | Extreme compression |
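
The compression figures match what you get by dividing 32 bits (one float32 value per dimension) by the chosen bit width. The sketch below reproduces that arithmetic and estimates the size of the quantized codes for a corpus; the float32 baseline is inferred from the ratios, per-vector overhead is ignored, and the corpus size and embedding dimension are hypothetical example values.

```python
def compression_ratio(bits, baseline_bits=32):
    """Size reduction relative to float32 storage, ignoring per-vector overhead."""
    return baseline_bits / bits


def index_size_mb(num_vectors, dim, bits):
    """Approximate size of the quantized codes alone, in megabytes (10^6 bytes)."""
    return num_vectors * dim * bits / 8 / 1e6


for bits in (4, 3, 2):
    # 100,000 chunks and 768 dimensions are hypothetical example values.
    ratio = round(compression_ratio(bits), 1)
    size = round(index_size_mb(100_000, 768, bits), 1)
    print(f"{bits}-bit: {ratio}x smaller, about {size} MB")
```

Under the same assumptions, the unquantized float32 vectors would take `index_size_mb(100_000, 768, 32)`, roughly 307 MB.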

File Structure

the-librarian/
├── SKILL.md
├── scripts/
│   ├── librarian           # Wrapper script (handles LD_PRELOAD)
│   ├── build_index.py      # Build quantized index
│   └── search.py           # Search with hybrid + rerank
└── references/
    └── quantization.md     # How TurboVec compression works

Index Files

After building, you'll have:

index/my_library/
├── library.qindex      # TurboVec quantized index
├── chunks.json         # Document chunks with metadata
├── bm25_index.pkl      # BM25 keyword index (if rank-bm25 installed)
└── stats.json          # Build statistics
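
A quick way to confirm that a build completed is to check for these files and their sizes. The sketch below relies only on the file names listed above; the function name and the split into required and optional files are assumptions made for illustration.

```python
from pathlib import Path

EXPECTED_FILES = ("library.qindex", "chunks.json", "stats.json")
OPTIONAL_FILES = ("bm25_index.pkl",)  # only written when rank-bm25 is installed


def inspect_index(index_dir):
    """Report which index files exist and how large they are, in bytes."""
    root = Path(index_dir)
    report = {}
    for name in EXPECTED_FILES + OPTIONAL_FILES:
        path = root / name
        # None marks a file that is absent from the index directory.
        report[name] = path.stat().st_size if path.is_file() else None
    missing = [n for n in EXPECTED_FILES if report[n] is None]
    return report, missing
```

Calling `inspect_index("index/my_library")` returns the size report and a list of missing required files; an empty list suggests the build wrote everything it should have.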

Accuracy Guidance

For critical applications (medical, legal, financial):

Use FAISS instead. The ~2-3% ranking variance in TurboVec is acceptable for personal knowledge bases, parts catalogs, and general document search, but not for applications where missing a result has consequences.

For personal/team use:

TurboVec is ideal. The accuracy difference is negligible for most queries, and the size savings enable deployment on hardware that couldn't run FAISS at all.
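
To judge whether the trade-off is acceptable for your own corpus, you can compare the top results of an exact index with those of the quantized one over a set of representative queries. The sketch below computes the overlap between two top-k lists; the chunk ids are made up, and because the method behind the ~97-98% figure is not stated here, this metric may differ from the one used to produce it.

```python
def overlap_at_k(reference_ids, candidate_ids, k=10):
    """Fraction of the reference top-k that also appears in the candidate top-k."""
    reference = set(reference_ids[:k])
    if not reference:
        return 0.0
    return len(reference & set(candidate_ids[:k])) / len(reference)


# Hypothetical chunk ids returned for one query by each index.
exact_top5 = ["c1", "c2", "c3", "c4", "c5"]
quantized_top5 = ["c2", "c1", "c3", "c5", "c8"]
print(overlap_at_k(exact_top5, quantized_top5, k=5))
```

Overlap ignores ordering inside the top-k, so the occasional rank swap does not count against the quantized index; only a result that drops out of the top-k does.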

Performance Comparison

| Metric | FAISS | TurboVec 4-bit |
|--------|-------|----------------|
| Cold query | ~150-165ms | ~150-165ms |
| Warm query | ~35-40ms | ~130-135ms |
| Pure search | ~10-12ms | ~10-15ms |
| Index size | 100% | ~7-12% |
| RAM required | High | Low |

Note: Both FAISS and TurboVec spend ~120-140ms generating embeddings via Ollama, so the difference in pure search time is minimal.

References

  • references/quantization.md - Technical details on how TurboVec compression works

Author

RandTrad Consulting — Document Intelligence consultancy for SMEs

  • Website: https://www.randtradconsulting.com
  • Contact: randtradbusiness@gmail.com
  • Services: AI Training, EU AI Act Compliance, Document RAG Systems

Built by Enda Rochford — RandTrad Consulting

License

MIT License — Free for personal and commercial use with attribution.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files, to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, subject to the following condition:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

Version History

1 version in total

  • v1.0.1 (current)
    2026-05-07 13:36

