The user has an OMem index of their work artifacts — emails, calendar
events, documents, and collaboration notes — abstracted into a unified
local wiki. The specific sources depend on the user's setup (any mail
app, any calendar source, any local folder, Teams Loop, etc.), but you
query the wiki, not the individual sources.
Call omem query first if the question references ANY of:
Don't call it for: weather, general coding, translation, math,
public-knowledge facts, or anything explicitly unrelated to the user's
own work history.
When in doubt, call it. A 0-result query is cheap; missing a hit the
user needed is expensive.
OMem returns work memory at four cost tiers. Stop as soon as you have
enough to answer.
| Level | Command | Token cost | Use when |
|---|---|---|---|
| --- | --- | --- | --- |
| L0 abstracts | omem query " | ~2k total | Always first |
| L1 wiki page | omem page get | ~500–2000 | After reranking abstracts, fetch the few you judge most relevant |
| L2 parsed.md | omem raw get → Read | ~2k–10k | L1 wiki missing a specific detail (table row, slide quote, contract clause) |
| L3 original file | omem raw get → Read | varies | Only when binary format matters; usually unnecessary |
Default path: L0 (limit 20) → read all abstracts → rerank → fetch
the 1–3 you judge relevant at L1 → stop, unless the body lacks a needed
detail (drop to L2) or references something you must resolve first
(recurse — see "Follow threads depth-first" below).
score is a coarse signal, not the answeromem query orders hits by a backend score (BM25, or qmd's hybrid
embedding+BM25). That score is good at **finding plausibly-related
pages** — it is NOT a final relevance judgment. It doesn't know what
the user means by "the document", who they care about, or which of
several similar pages is the right version. You do.
Treat the score as candidate generation and the abstracts as the
actual signal. The default workflow:
omem query --limit 20 — a wide candidate pool, not a narrow one.abstracts ≈ 2k tokens total — cheap relative to the cost of missing
the right page and having to re-query.
asking. This often means picking page #5 over #1, or skipping the
top hit entirely because its abstract is on the wrong topic — the
score got it into the pool, your judgment decides if it stays.
omem page get for the ones you chose.Shortcut allowed: if the top 2–3 scores are all >0.7 AND their
abstracts cleanly match the question's intent, fetch them directly
without reading all 20. Don't force-read 20 abstracts when 3 are
obviously right — but when in doubt, read more, not fewer.
If the first 20 look thin — top scores all <0.5, OR all abstracts
clearly off-topic — don't give up after one shot. Do one round of
broadening first:
quarter financial review")
--kind / --since filters you set--limit 50 and read further downA second round costs ~2k tokens and often surfaces what the first
round's tokenizer missed. Give up only after one honest retry.
If still nothing relevant after a retry: 0 hits is a valid answer.
Say "OMem didn't surface anything specific about this" and answer from
general knowledge if appropriate.
Most questions are answered by the first page you open — read it, cite
it, done. That's the common case, and you should not go looking for
reasons to query again.
But sometimes a page doesn't contain the answer so much as point at
it: its body references a project, person, document, or decision that
you'd have to understand to actually answer, and that thing lives on
another page. When — and only when — that happens, treat memory as a
graph you can recurse through: query for the missing piece, read
it, and come back up to assemble the answer. It's how a colleague who
knew the history would work — recall the meeting, realize it hinged on
an earlier contract, go pull that contract — each step prompted by a
real gap, not by habit.
So after reading a page, ask one question: *can I now answer what the
user actually asked, with specifics?*
and good outcome — don't second-guess it.
and I haven't resolved it** → query for that one thing, then re-ask
the same question. This is the only trigger for another round.
What counts as a load-bearing gap worth one more query:
Atlas migration", "per the Q2 risk review")
the signed MSA") — fetch that page
after the March incident")
A passing mention you don't need to resolve is not a gap — if the
user's question is answered without it, ignore it and stop. Depth-first
means following the one thread that leads to the answer, not crawling
the graph. Most chains are zero or one extra hop; if you're several
hops deep and still digging, you've likely drifted — summarize what you
found and tell the user where the trail went cold. (Flow F shows a case
that genuinely needed two hops.)
omem query "" --format json --limit 20 Returns ranked hits as JSON:
[
{"page_id": "a1b2c3d4...", "score": 0.92, "kind": "file",
"title": "Q3 budget review", "abstract": "Finance team weekly sync on Q3 budget; decisions on headcount and capex…",
"source_uri": "file:///Users/.../Q3-budget.pptx"}
]
score is the backend's coarse relevance estimate (higher=better,
already sorted). Use it to bound the candidate pool, not to pick
the answer — see the "Rerank the abstracts yourself" section above.
Do NOT compare scores across backends; rank order within one query
is what's meaningful.
Optional filters:
--kind file|mail|calendar|loop — narrow to one kind--since 2026-04-01 (or 7d_ago / 2w_ago) — only items modified after that date--source NAME — narrow to one source plugin (local-files, mail-app, outlook-web, …)--account ACCT — narrow to one mail/calendar account--include-deleted — surface tombstoned pages too (audit / recovery; default off)--limit N — default 20; bump to 50 on a retry round if 20 looks thinIf query syntax surprises you (mixed Chinese/English, special
characters, dates that don't match), read references/query-syntax.md.
omem page get Prints the full wiki page (frontmatter + LLM-curated body). This is the
canonical answer source — written by OMem's curator from the parsed
artifact. Most queries end here.
omem raw get --parsed Prints the absolute path to parsed.md — the parser's full output
(every table row, every slide, every email body, OCR / VLM image
descriptions). Use Read on this path when the wiki body lacks a
specific detail.
omem raw get Prints the absolute path to the original file. Only useful when the
binary format matters (e.g. you need to preserve a .pptx for the
user). Most agents never need this.
omem wiki ls [filters] --format jsonDeterministic metadata-only list — no scoring, no tokenization. Use
for "a slice" or "every row matching X" (as opposed to "the pages
most relevant to X" — that's omem query).
Filters combine AND:
--kind file|mail|calendar|loop--source NAME (local-files / mail-app / outlook-web / …)--account ACCT--since 7d_ago (or ISO date)--search "" — SQL LIKE substring on title OR abstract, case-insensitive--limit N — default 50; --limit 0 = unlimited (use sparingly)--include-deleted — surface tombstoned rows (audit / recovery)omem query vs omem wiki ls --search: query is fuzzy / semantic
and returns top-N by relevance; wiki ls --search is literal substring
and returns every matching row by recency. Question → query; listing
→ wiki ls.
All example queries below use fictional names — substitute the real
context the user actually mentions.
Flow A — "What did Alice and I decide about the vendor renewal?" (rerank in action)
omem query "Alice vendor renewal decision" --format json --limit 20for a meeting whose abstract just says "discuss vendor renewal" —
no decision content. Hits #4 (score 0.71) and #7 (score 0.58)
have abstracts that explicitly say "decided to extend by 12 months
at unchanged rate" and "Alice agreed to revisit terms in Q4".
Those are the real matches.
omem page get <#4> and omem page get <#7>. [source: …] paths. Stop at L1.
Why this matters: the top-score hit was a plausible match (right
people, right topic word) but not a useful one — only an abstract
read revealed which pages actually carry decision content.
Flow B — "What's the exact SLA penalty in the Acme service agreement?" (drill to L2)
omem query "Acme service agreement SLA penalty" --format json --limit 20V2 markup, V5 final). Their abstracts mention a penalty exists
but don't quote the number.
omem page get .
but doesn't quote the verbatim number.
omem raw get --parsed → Read the parsed.md → grep for"penalty / SLA / per day / %".
Flow C — "List my meetings from last week" (metadata slice)
omem wiki ls --kind calendar --since 7d_ago --limit 50 --format jsonomem page get only if user asks about a specific meeting'sdetails.
Flow C2 — "List every meeting whose title mentions Atlas" (literal slice)
omem wiki ls --kind calendar --search "Atlas" --limit 0 --format json —
unlimited rows, title/abstract LIKE match. Use this instead of
omem query when you want every matching row, not BM25 top-N.
Flow D — "Tell me about Python list comprehensions"
Don't call this skill. Answer from general knowledge.
Flow E — "Did we ever talk about the API rate-limit issue?" (retry round)
omem query "API rate limit" --format json --limit 20public API rate card, a generic throttling reference. No real match.
phrasing: omem query "throttling 429 too many requests" --limit 20.
specific about this — could be a recent discussion not yet ingested,
or it lived in a source OMem isn't configured to read."
Flow F — "Why did we end up switching the Atlas project to the new vendor?" (recurse only because the answer needs it)
omem query "Atlas project vendor switch" --format json --limit 20omem page get .covered in the Q2 incident review, and per Bob's recommendation" —
two unknowns that the answer hinges on. Don't answer yet; recurse.
omem query "Q2 incident review SLA breach" --limit 20 → omem page get : the incident review quantifies the breach
(repeated downtime past the contractual cap). That's the why.
omem query "Bob vendor recommendation Atlas" --limit 20 →the abstract confirms Bob proposed the specific replacement vendor
and the rationale. Enough.
by the SLA breach (from the incident review) plus Bob's replacement
proposal (from his recommendation), formalized in the decision log.
Cite all three [source: …] pages.
Why this needed the hops: the first page mentioned the decision but
didn't explain it, and the explanation was exactly what the user
asked for — so the two references were load-bearing, not incidental.
Note the contrast with Flow A, where the first page fully answered the
question and stopping there was correct. The test is always the same:
does answering require this? — not *is there something else I could
go read?*
When OMem provided the answer, always cite:
[source: ] [source: ] This lets the user verify and rebuilds trust each turn.
omem setup, omem install, omem ingest, omem lint, `omem index rebuild, omem plugin enable/disable, omem config set`. These modify
the user's environment. If the user clearly needs them, tell them in
plain text: "Looks like OMem isn't set up — run omem setup in your
terminal."
parsed.md or original files before trying omem page get.L1 wiki is almost always enough; L2/L3 are last-resort.
only; the absolute number isn't normalized.
Just answer from general knowledge.
facts). Bad precision degrades user trust in the skill.
Relay errors to the user — don't try to fix them yourself. Common
cases (no config found → user runs omem setup; index empty → user
runs omem ingest; slow query → qmd cold-loading models, wait).
See references/troubleshooting.md for the full table.
references/query-syntax.md — tokenization, date filters, score semantics, retry strategyreferences/output-schemas.md — full JSON schemas for omem query and omem wiki lsreferences/troubleshooting.md — error messages and what to relay to the user共 1 个版本