Overview

Zotero Browse Skill

Read papers, search your library, and extract PDF content from a local Zotero database.

Database & Storage Locations

Database: E:\Refer.Hub\zotero.sqlite (4300+ items, 2112 stored PDFs)
PDF storage: E:\Refer.Hub\storage\{attachmentKey}\filename.pdf (the folder is named after the attachment item's 8-character key)
Tools: Python stdlib sqlite3 for queries; fitz (PyMuPDF) for PDF reading

Scripts

scripts/query_items.py — Search and browse the library
scripts/read_pdf.py — Read PDF text by attachment key or title search

Both scripts can be run directly. On Windows, always invoke them with py -3.

Common Workflows

1. Search library by keyword

py -3 scripts/query_items.py --search "FGF15"
py -3 scripts/query_items.py --search "fatty liver"

Returns: matching items with key, title, authors, date, attachment count.
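The internals of query_items.py are not shown here, so the following is only a minimal sketch of how a title keyword search could be written against Zotero's item-data tables. The function name search_titles and the SEARCH_SQL constant are hypothetical, and the sketch returns only key and title rather than the full output listed above.

```python
import sqlite3

# Titles live in an entity-attribute-value layout:
# items -> itemData -> fields (attribute name) and itemDataValues (value).
SEARCH_SQL = """
    SELECT items.key, itemDataValues.value AS title
    FROM items
    JOIN itemData ON itemData.itemID = items.itemID
    JOIN fields ON fields.fieldID = itemData.fieldID
    JOIN itemDataValues ON itemDataValues.valueID = itemData.valueID
    WHERE fields.fieldName = 'title'
      AND itemDataValues.value LIKE ? ESCAPE '\\'
    ORDER BY itemDataValues.value
"""


def search_titles(conn: sqlite3.Connection, keyword: str) -> list:
    """Return (key, title) rows whose title contains keyword.

    Hypothetical helper; SQLite's LIKE is case-insensitive for ASCII only.
    """
    # Escape LIKE wildcards so a keyword such as "50%" is matched literally.
    escaped = (
        keyword.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    return conn.execute(SEARCH_SQL, (f"%{escaped}%",)).fetchall()
```

Passing the keyword as a bound parameter, rather than formatting it into the SQL string, keeps quotes in a search term from breaking the query.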

2. Find attachment key for a paper, then read PDF

# Step 1: search to get the attachment key
py -3 scripts/query_items.py --search "Silibinin"

# Step 2: read the PDF (pass the attachment key shown in output)
py -3 scripts/read_pdf.py ZL42EGES

3. Read PDF by title search

py -3 scripts/read_pdf.py --search "Silibinin" --pages 5

The script prompts for which matching attachment to open, then extracts text from the first 5 pages.

4. Library summary

py -3 scripts/query_items.py --summary

Shows total items and breakdown by type (journalArticle, book, etc.).
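As a rough illustration of what such a summary involves, the sketch below counts items per type with a single GROUP BY. The function name summarize_by_type is hypothetical, and excluding rows listed in deletedItems (Zotero's trash) is an assumption about what the real script does.

```python
import sqlite3


def summarize_by_type(conn: sqlite3.Connection) -> dict:
    """Return {typeName: count} for items that are not in the trash.

    Hypothetical helper; the real --summary output may differ.
    """
    rows = conn.execute(
        """
        SELECT itemTypes.typeName, COUNT(*) AS n
        FROM items
        JOIN itemTypes ON itemTypes.itemTypeID = items.itemTypeID
        WHERE items.itemID NOT IN (SELECT itemID FROM deletedItems)
        GROUP BY itemTypes.typeName
        ORDER BY n DESC, itemTypes.typeName
        """
    ).fetchall()
    return dict(rows)
```

The total item count is then sum(summarize_by_type(conn).values()). Attachments and notes are items too, so they appear as their own types in the breakdown.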

5. Recent additions

py -3 scripts/query_items.py --recent 10

6. Get item details by key

py -3 scripts/query_items.py --key ZL42EGES

7. Extract full PDF text to file

py -3 scripts/read_pdf.py ZL42EGES --output extracted.txt

Database Schema

For SQL query reference (schema, table structure, query examples), see:

📄 references/schema.md

Key tables: items, itemData, fields, itemDataValues, itemAttachments, itemTypes, creators, itemCreators, tags
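To show how these tables relate, here is a small sketch that lists an item's creators in citation order by joining items, itemCreators, and creators. The function name authors_for_item is hypothetical; column names follow the usual Zotero layout and should be checked against references/schema.md.

```python
import sqlite3


def authors_for_item(conn: sqlite3.Connection, item_key: str) -> list:
    """Return creator display names for the item with the given key.

    Hypothetical helper; names come back in itemCreators.orderIndex order.
    """
    rows = conn.execute(
        """
        SELECT creators.firstName, creators.lastName
        FROM items
        JOIN itemCreators ON itemCreators.itemID = items.itemID
        JOIN creators ON creators.creatorID = itemCreators.creatorID
        WHERE items.key = ?
        ORDER BY itemCreators.orderIndex
        """,
        (item_key,),
    ).fetchall()
    # Institutional creators may have only one name part, so skip empty parts.
    return [" ".join(p for p in (first, last) if p) for first, last in rows]
```

Field values such as title or date follow a different path (items to itemData to fields and itemDataValues), because Zotero stores them as attribute-value pairs rather than as columns on items.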

PDF Resolution Logic

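Zotero stores each stored PDF under E:\Refer.Hub\storage\{attachmentKey}\, where attachmentKey is the 8-character items.key of the attachment item. The itemAttachments.path field holds the filename with a storage: prefix (for example, storage:filename.pdf). The itemAttachments.storageHash column is a hash of the file contents used for sync, not a folder name.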

Direct open by key:

import os
import sqlite3

import fitz  # PyMuPDF

DB = r"E:\Refer.Hub\zotero.sqlite"
STORAGE = r"E:\Refer.Hub\storage"
key = "ZL42EGES"  # attachment item key, as shown in search results

conn = sqlite3.connect(DB, timeout=30)
conn.execute("PRAGMA query_only=ON")  # block accidental writes on this connection
cur = conn.cursor()

# linkMode 0 = imported file, 1 = imported URL; both are kept under storage\<key>\
cur.execute("""
    SELECT itemAttachments.path
    FROM itemAttachments JOIN items ON itemAttachments.itemID = items.itemID
    WHERE items.key = ? AND itemAttachments.linkMode IN (0, 1)
""", (key,))
row = cur.fetchone()
conn.close()

text = None
if row and row[0]:
    # path looks like "storage:filename.pdf"; the folder is named after the key
    pdf_path = os.path.join(STORAGE, key, row[0].removeprefix("storage:"))
    if os.path.isfile(pdf_path):
        with fitz.open(pdf_path) as doc:
            text = "\n".join(page.get_text() for page in doc)

Tips

If the database is locked, close the Zotero application first
For long PDFs, use --pages N to extract only the first N pages
PDF text extraction works for text-based PDFs; scanned PDFs need OCR
Use the --info flag to get PDF metadata without extracting the full text
Attachment keys are the 8-character Zotero item keys shown in search results
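If closing Zotero is inconvenient, one common workaround for the locked-database tip is to query a temporary copy of the file instead. The sketch below is an assumption rather than something the bundled scripts are known to do, and the function name open_snapshot is hypothetical. A copy taken while Zotero is in the middle of a write may be inconsistent, so treat the results as best effort.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def open_snapshot(db_path) -> sqlite3.Connection:
    """Copy the database to a temp folder and open the copy for reading.

    Hypothetical helper; the caller should close the returned connection.
    """
    tmp_dir = Path(tempfile.mkdtemp(prefix="zotero_snapshot_"))
    snapshot = tmp_dir / Path(db_path).name
    shutil.copy2(db_path, snapshot)  # the original file is never opened by SQLite
    conn = sqlite3.connect(snapshot, timeout=30)
    conn.execute("PRAGMA query_only=ON")  # refuse writes on this connection
    return conn
```

For example, conn = open_snapshot(r"E:\Refer.Hub\zotero.sqlite") gives a connection that can be passed to any read-only query. The temporary folder is not cleaned up automatically.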

Version History

1 version in total

v1.0.1 (current)

2026-05-03 08:20