A personal knowledge base that an LLM _compiles_, not just stores. Raw documents go in, an LLM writes trilingual (EN / 中文 / 日本語) wiki articles with [[wiki-links]], backlinks, and an emergent taxonomy. The MCP server dispatches every tool through llmwiki/operations.py; the CLI exposes the same registry via llmbase ops call; individual HTTP/CLI wrappers are being migrated onto the registry over time.
pip install llmwiki
# The installed command is llmbase (the package name and the command differ)
mkdir my-kb && cd my-kb
cat > .env << 'EOF'
LLMBASE_API_KEY=sk-your-key
LLMBASE_BASE_URL=https://your-endpoint/v1
LLMBASE_MODEL=your-model
# Optional: LLMBASE_FALLBACK_MODELS=backup-1,backup-2
EOF
cat > config.yaml << 'EOF'
llm:
  max_tokens: 16384
paths:
  raw: "./raw"
  wiki: "./wiki"
EOF
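The variables written to .env above can be read into a small settings object. The following is a minimal sketch, not llmbase's own loader: `LLMSettings`, `parse_env_file`, and `load_settings` are hypothetical names, and the idea that fallback models are tried in the order listed after the primary model is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class LLMSettings:
    api_key: str
    base_url: str
    model: str
    fallback_models: list[str] = field(default_factory=list)

    def model_chain(self) -> list[str]:
        """Primary model first, then fallbacks in the order given (assumed order)."""
        return [self.model, *self.fallback_models]


def parse_env_file(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_settings(env: dict[str, str]) -> LLMSettings:
    """Build settings from the three required keys plus the optional fallback list."""
    raw_fallbacks = env.get("LLMBASE_FALLBACK_MODELS", "")
    fallbacks = [m.strip() for m in raw_fallbacks.split(",") if m.strip()]
    return LLMSettings(
        api_key=env["LLMBASE_API_KEY"],
        base_url=env["LLMBASE_BASE_URL"],
        model=env["LLMBASE_MODEL"],
        fallback_models=fallbacks,
    )
```

Calling `load_settings(parse_env_file(open(".env").read()))` raises `KeyError` if a required key is missing, which surfaces a misconfigured .env early.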
| Command | Description |
|---|---|
| llmbase ingest url | Ingest a web article |
| llmbase ingest pdf | Ingest a PDF (auto-chunks) |
| llmbase ingest file | Ingest any local file |
| llmbase ingest dir | Ingest all files from a directory |
| llmbase ingest cbeta-learn --batch 10 | Corpus plugin: Buddhist canon |
| llmbase ingest ctext-book 论语 /analects/zh | Corpus plugin: Chinese classics |
| llmbase compile new | Compile new raw docs incrementally (3-layer dedup) |
| llmbase compile all | Full rebuild |
| llmbase compile index | Rebuild index + aliases |
| llmbase query "…" | Ask a question (single-pass; add --deep for multi-step research) |
| llmbase query "…" --tone wenyan | 📜 classical Chinese voice |
| llmbase query "…" --tone … | 🎓 academic voice |
| llmbase query "…" --tone … | 👶 simple voice |
| llmbase query "…" --tone … | 🦴 primitive voice |
| llmbase query "…" --file-back | File answer back into the wiki |
| llmbase lint check | 8-category structural health check |
| llmbase lint heal | Check → fix → re-check → report |
| llmbase lint deep | LLM deep quality analysis |
| llmbase web | Web UI at :5555 |
| llmbase serve | Agent HTTP API at :5556 |
| llmbase mcp | Start MCP server (stdio) |
| llmbase stats | KB statistics |
{
  "mcpServers": {
    "llmwiki": {
      "command": "python",
      "args": ["-m", "llmwiki", "--base-dir", "/path/to/my-kb"]
    }
  }
}
Tools exposed by the MCP server:
| Tool | Purpose |
|---|---|
| kb_search | Full-text search over compiled concepts |
| kb_search_raw | Verbatim full-text fallback over raw/ sources (v0.6.2+) |
| kb_ask | Deep-research Q&A with tone modes |
| kb_get | Get article by slug or alias (空, kong, emptiness all work) |
| kb_list | List articles, filter by tag |
| kb_backlinks | Find articles citing a given article |
| kb_taxonomy | Multilingual category tree |
| kb_stats | Article count, word count |
| kb_xici | Guided reading (导读) |
| kb_ingest | Ingest a URL |
| kb_compile | Compile raw → wiki |
| kb_lint | Health check / auto-fix |
| kb_export / kb_export_article / kb_export_tag / kb_export_graph | Structured export for downstream projects |
All tools are declared in llmwiki/operations.py — downstream projects register custom ops via operations.register(...) and they become available on CLI + MCP automatically.
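The registry idea described here can be illustrated with a small stand-in. This is a sketch of the general pattern only: the real signature of `operations.register(...)` is not shown in this README, so the decorator form, the `call` and `list_ops` helpers, and the `kb_word_count` op are all hypothetical.

```python
from typing import Any, Callable

# Hypothetical stand-in for the registry; not llmwiki's actual data structure.
_REGISTRY: dict[str, Callable[..., Any]] = {}


def register(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that records a function under a tool name."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        if name in _REGISTRY:
            raise ValueError(f"operation already registered: {name}")
        _REGISTRY[name] = func
        return func
    return decorator


def call(name: str, **kwargs: Any) -> Any:
    """Single dispatch point that a CLI and an MCP server could both share."""
    try:
        func = _REGISTRY[name]
    except KeyError:
        raise KeyError(f"unknown operation: {name}") from None
    return func(**kwargs)


def list_ops() -> list[str]:
    """Names of all registered operations, sorted for stable output."""
    return sorted(_REGISTRY)


@register("kb_word_count")  # hypothetical custom op, not part of llmwiki
def kb_word_count(text: str) -> int:
    return len(text.split())
```

Because every front end goes through one `call` function, an op registered once is reachable from each of them without a separate wrapper, which is the property the paragraph above describes.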
Agents mounted on this server can answer from compiled concepts, fall back to raw sources with kb_search_raw when compilation glossed over a detail, ingest new material mid-session, and trigger healing.
llmbase ingest url https://example.com/topic
llmbase ingest pdf ./paper.pdf
llmbase compile new
llmbase query "What are the key concepts?"
llmbase lint heal
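The same sequence can be scripted when several sources are ingested at once. The sketch below only assembles the commands shown above and prints them unless `dry_run=False`; `build_pipeline` and `run_pipeline` are hypothetical helper names, and running for real assumes the llmbase command is on PATH.

```python
import shlex
import subprocess


def build_pipeline(urls, pdfs, question=None):
    """Return the argv lists for ingest -> compile -> (query) -> heal."""
    commands = [["llmbase", "ingest", "url", url] for url in urls]
    commands += [["llmbase", "ingest", "pdf", path] for path in pdfs]
    commands.append(["llmbase", "compile", "new"])
    if question:
        commands.append(["llmbase", "query", question])
    commands.append(["llmbase", "lint", "heal"])
    return commands


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print("$", shlex.join(argv))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_pipeline(
        build_pipeline(
            urls=["https://example.com/topic"],
            pdfs=["./paper.pdf"],
            question="What are the key concepts?",
        )
    )
```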
# config.yaml
worker:
  enabled: true
  learn_source: cbeta # built-in: cbeta | wikisource | both; custom via register_learn_source()
  learn_interval_hours: 6
  compile_interval_hours: 1
  health_check_interval_hours: 24
health:
  auto_fix_broken_links: true
  max_stubs_per_run: 10
The worker starts under the production WSGI entrypoint (wsgi.py → start_worker_thread). Deploy with gunicorn wsgi:app; llmbase web alone does not self-start the worker.
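How the three interval settings might interact can be sketched as a simple due-check. This is an illustration under stated assumptions, not the logic inside `start_worker_thread`: the `Job` class, `jobs_from_config`, and `tick` are hypothetical, and treating every job as due on the first tick is a guess.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    interval_hours: float
    last_run: float = float("-inf")  # never run yet, so due on the first tick

    def is_due(self, now: float) -> bool:
        """True once at least interval_hours have passed since the last run."""
        return now - self.last_run >= self.interval_hours * 3600


def jobs_from_config(worker_cfg: dict) -> list[Job]:
    """Build one job per *_interval_hours key of the worker section."""
    return [
        Job("learn", worker_cfg["learn_interval_hours"]),
        Job("compile", worker_cfg["compile_interval_hours"]),
        Job("health_check", worker_cfg["health_check_interval_hours"]),
    ]


def tick(jobs: list[Job], now: float) -> list[str]:
    """Return the names of due jobs and mark them as run at `now` (seconds)."""
    due = []
    for job in jobs:
        if job.is_due(now):
            job.last_run = now
            due.append(job.name)
    return due
```

A background thread would call `tick(jobs, time.time())` periodically and run whatever names come back; with the values above, compile would fire hourly, learn every six hours, and the health check daily.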
- kb_search for relevant concepts
- kb_search_raw for verbatim detail
- kb_ingest with the URL
- kb_compile to fold it into concepts for next session
- kb_lint heals the graph

- kb_search (concepts) + kb_search_raw (verbatim raw sources)
- [[参禅]] → can-chan.md across scripts and simplified/traditional forms
- The MCP server dispatches every tool through operations.py; the CLI exposes the same registry via llmbase ops list / llmbase ops call; direct HTTP/CLI wrappers are being migrated onto the registry

- --file-back saves Q&A answers into the wiki so future queries benefit
- --tone wenyan for Chinese users (classical Chinese responses)
- llmbase lint heal after large ingestion batches

- /health has buttons for every repair op
- /graph: density slider for large KBs
- /explore: requires entities: { enabled: true } in config

- … .env
- … (cbeta-learn, wikisource-learn, ctext-book) and the autonomous worker when enabled
- … 0.0.0.0, so LAN-accessible by default; front with a reverse proxy or bind override for public exposure
- … (PORT env) gate most mutating endpoints behind LLMBASE_API_SECRET (auto-generated if unset). Note: /api/ask is open by default and writes Q&A back via file_back; only promotion to concepts requires the secret
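The alias behaviour mentioned above ([[参禅]] resolving to can-chan.md) can be pictured as a lookup from normalized names to slugs. This is a minimal sketch, not the project's index format: `AliasIndex` and `link_targets` are hypothetical, and simplified/traditional variants are handled here simply by listing both forms as aliases, which is an assumption about how such coverage could work.

```python
import re
import unicodedata

# Matches [[label]] without nested brackets or pipes.
WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)\]\]")


def normalize(name: str) -> str:
    """Fold width/compatibility variants and case so lookups are forgiving."""
    return unicodedata.normalize("NFKC", name).strip().casefold()


class AliasIndex:
    def __init__(self) -> None:
        self._slug_by_alias: dict[str, str] = {}

    def add(self, slug: str, aliases: list[str]) -> None:
        """Register a slug under its own name and every alias."""
        for alias in (slug, *aliases):
            self._slug_by_alias[normalize(alias)] = slug

    def resolve(self, name: str):
        """Return the slug for a name or alias, or None if unknown."""
        return self._slug_by_alias.get(normalize(name))


def link_targets(markdown: str, index: AliasIndex) -> dict:
    """Map each [[wiki-link]] label to its file name, or None if unresolved."""
    targets = {}
    for match in WIKI_LINK.finditer(markdown):
        label = match.group(1)
        slug = index.resolve(label)
        targets[label] = f"{slug}.md" if slug else None
    return targets
```

Labels that map to None are the kind of broken link a health check could report or turn into a stub.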