WeChat Chat Analyzer (wechat-analyzer)

A Flask web app at ~/Desktop/wechat-analyzer/ that pulls WeChat data via wx-cli, analyzes private/group chats, and renders a dark-theme dashboard with identity detection, relationship metrics, and summary views.

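Prerequisites

Prepare the following environment before using this tool:

1. System and WeChat

macOS (macOS only; the tool depends on WeChat.app's local database)
WeChat installed and logged in (macOS version; it must stay logged in)

2. Install wx-cli

wx-cli is a Rust binary that reads WeChat's local database. Two installation methods are recommended:

Install via npm (recommended, all platforms)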

npm install -g @jackwener/wx-cli

Or a one-line curl install

curl -fsSL https://raw.githubusercontent.com/jackwener/wx-cli/main/install.sh | bash

Verify the installation:

wx --version

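3. macOS initialization (one-time)

WeChat encrypts its local database with SQLCipher, and wx-cli has to extract the key from the WeChat process memory. On macOS, WeChat.app must first be given an ad-hoc signature:

# 1. Re-sign WeChat (repeat after every WeChat update)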
codesign --force --deep --sign - /Applications/WeChat.app

# 2. Reset old TCC grants (required after re-signing, otherwise permissions may stop working)
for s in ScreenCapture Camera Microphone AppleEvents \
         SystemPolicyDocumentsFolder SystemPolicyDownloadsFolder SystemPolicyDesktopFolder; do
  tccutil reset "$s" com.tencent.xinWeChat
done

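# 3. Restart WeChat
killall WeChat && open /Applications/WeChat.app
# Wait until WeChat has fully logged in

# 4. Initialize the key
sudo wx init

> Known side effect: after re-signing, macOS may repeatedly prompt that "WeChat" wants to access data from other apps. This happens because the ad-hoc signature changes the app's code identity. Click "Allow" to let it through.

4. Verify wx-cli

After initialization, check that the session list can be read:

wx sessions

If recent sessions are listed, everything is working. The daemon starts automatically on the first call; there is no need to start it manually.

5. Configure config.json

The config.json in the project directory needs the following content: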

{
  "db_dir": "/Users/<your-username>/Library/Containers/com.tencent.xinWeChat/Data/Documents/xwechat_files/<your-username>_<hash>/db_storage",
  "llm": {
    "api_key": "***",
    "base_url": "https://api.deepseek.com/v1",
    "enabled": true,
    "model": "deepseek-chat",
    "provider": "deepseek"
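  },
  "user_nickname": "your WeChat nickname"
}

db_dir: path to the WeChat local database (the actual path is shown in WeChat → Settings → File Management)
llm.api_key: LLM API key (DeepSeek is recommended; OpenAI-compatible endpoints are also supported)
user_nickname: your WeChat nickname (the AI uses it to identify "me" in messages)

6. Python environment

Python 3.8+ is required. Managing dependencies with uv or venv is recommended:

# Using uv (recommended)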
uv venv
uv pip install flask requests

# Or using pip
python3 -m venv venv
source venv/bin/activate
pip install flask requests

7. Verify readiness

After completing the steps above, confirm the environment is ready:

# Check that wx works
wx sessions -n 3 --json

# Check Python dependencies
python3 -c "import flask; print('Flask OK')"

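wx-cli reference

wx-cli is the data layer for this tool. The commands and data formats below are the ones commonly used with wechat-analyzer.

Core commands

wx sessions                                       # 20 most recent sessions (with chat_type)
wx unread --filter private,group                  # Real-person chats with unread messages
wx new-messages                                   # New messages since the last check (incremental)
wx history "NAME" -n 200 --json                   # Fetch chat history
wx history "NAME" --since YYYY-MM-DD --until YYYY-MM-DD -n N --json
wx search "KEYWORD"                               # Search the whole database
wx contacts                                       # Contact list
wx contacts --query "KEYWORD"                     # Search by name
wx stats "GROUP_NAME" --json                      # Group chat statistics
wx daemon status / stop / logs --follow           # Daemon management

Data-fetching anti-patterns

A large -n (500+) frequently times out → split the range into half-month batches with --since/--until and use -n 5000 per batch
Piping into python3 in the same shell has a high timeout rate → do it in two steps: redirect to a file first (> file), then read the file with python3
-n 0 does not return "everything" → it returns 0 messages; use -n 99999 to fetch the full history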
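
The half-month batching advice above can be sketched in Python. This is a minimal sketch, not the project's code: half_month_windows and history_commands are hypothetical helper names, a 15-day window stands in for "half a month", and it assumes --until is inclusive, which the source does not state. Only the wx history flags listed above are used.

```python
import shlex
from datetime import date, timedelta


def half_month_windows(start: date, end: date):
    """Yield (since, until) pairs covering [start, end] in chunks of at most 15 days."""
    cursor = start
    while cursor <= end:
        until = min(cursor + timedelta(days=14), end)
        yield cursor, until
        # Assumption: --until is inclusive, so the next window starts a day later.
        cursor = until + timedelta(days=1)


def history_commands(name: str, start: date, end: date, batch_size: int = 5000):
    """Build one `wx history` command string per window (hypothetical helper)."""
    return [
        f"wx history {shlex.quote(name)} --since {since.isoformat()} "
        f"--until {until.isoformat()} -n {batch_size} --json"
        for since, until in half_month_windows(start, end)
    ]
```

Each command can then be run with its output redirected to a file and parsed in a second step, which matches the "file first, then python3" advice above.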

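JSON format

wx history --json returns a flat array (not {messages: [...]})
Private chats: the contact's messages have sender: "", the user's messages have sender: "{{用户昵称}}"
Group chats: sender prefers the group nickname (group card) and comes with three stable identity fields:
sender_username: stable wxid
sender_contact_display: display name from the address book
sender_group_nickname: group nickname
Time format: YYYY-MM-DD HH:MM
type: 文本 / 图片 / 语音 / 链接 / 文件 (text / image / voice / link / file)
Every session and message record carries chat_type: private / group / official_account / folded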

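Real-time chat coaching rules

When the user wants turn-by-turn chat analysis and reply suggestions, follow these rules:

Empathy > analysis: when they vent, vent with them; don't reason with them. Don't reply with advice, products, or solutions; acknowledge the emotion first
Don't push food: once they have said "not eating" 3 times, change the subject immediately
Don't chase late at night: if they said they are going to sleep, stop messaging; if they said they are home, stop asking
Don't nag while they are socializing: when they are out, don't ask "are you home yet?"
No lecturing mode: when they share an experience, say "if it's uncomfortable, just stop using it" instead of giving a lesson
Answer the question first, then back it up with an example. Output format: one-sentence conclusion → concrete wording → why

Reply suggestion format: give 3 concrete replies with reasons, not generic advice. When a detailed psychological analysis mode is needed, see the related documents under references/.

Relationship diagnostic dimensions

Message ratio, daily average volume, monthly trend
Late-night share (>25% = intimacy signal, >35% = over-dependence)
Response speed (>70% instant replies = very high investment)
Who speaks first each day (100% one side = imbalance between pursuer and pursued)
Topic heat distribution, density of emotional/caring keywords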
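
Three of the dimensions above (message ratio, late-night share, who speaks first each day) can be computed directly from the wx history --json array. The following is a minimal sketch under stated assumptions: relationship_metrics is a hypothetical function, "late night" is assumed to mean 23:00 to 05:00 (the source does not define the window), and in a private chat the user's messages are assumed to carry sender equal to the user's nickname, as described in the JSON format notes.

```python
from collections import Counter
from datetime import datetime


def relationship_metrics(messages, user_name, night_start=23, night_end=5):
    """Compute share of my messages, late-night share, and daily opener shares."""
    total = len(messages)
    if total == 0:
        return {"total": 0, "mine_share": 0.0,
                "late_night_share": 0.0, "first_speaker_share": {}}
    mine = 0
    late = 0
    first_by_day = {}
    # "YYYY-MM-DD HH:MM" strings sort chronologically as plain text.
    for m in sorted(messages, key=lambda m: m["time"]):
        ts = datetime.strptime(m["time"], "%Y-%m-%d %H:%M")
        who = "me" if m["sender"] == user_name else "contact"
        if who == "me":
            mine += 1
        # Assumed late-night window: 23:00 up to (not including) 05:00.
        if ts.hour >= night_start or ts.hour < night_end:
            late += 1
        # Only the first message of each day is recorded as the opener.
        first_by_day.setdefault(ts.date(), who)
    openers = Counter(first_by_day.values())
    days = len(first_by_day)
    return {
        "total": total,
        "mine_share": mine / total,
        "late_night_share": late / total,
        "first_speaker_share": {who: n / days for who, n in openers.items()},
    }
```

The returned shares can be compared against the thresholds listed above (for example late_night_share > 0.25).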

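Daemon troubleshooting

| Symptom | Cause | Fix |
|---------|-------|-----|
| `wx-daemon 启动超时（>15s）` (daemon startup timed out) | WeChat holds the database file lock | Extend the timeout with `WX_DAEMON_TIMEOUT=60 wx sessions`, or restart WeChat |
| 「预热完成，联系人 0 个」 (warm-up finished, 0 contacts) | Key expired | `sudo wx init --force` |
| 「无法解密 session.db」 (cannot decrypt session.db) | Key expired | `sudo wx init --force` |
| 「读取密钥文件失败: No such file」 (failed to read the key file) | CWD differs from the init directory | `ln -s ~/.wx-cli/all_keys.json /all_keys.json` |

Root-cause fix sequence:

codesign --force --deep --sign - /Applications/WeChat.app
killall WeChat && open /Applications/WeChat.app
# After WeChat has logged in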
sudo wx init --force

Quick Start

cd ~/Desktop/wechat-analyzer
python3 server.py
# → http://localhost:8899 (no password by default; password protection can be enabled in Settings)
# → LAN: http://192.168.31.121:8899

Architecture

~/Desktop/wechat-analyzer/
├── server.py          # Flask app, all API routes (LLM + settings + auth)
├── analyzer.py        # Core engine: wx-cli wrappers, analysis, identity, indices, signals
├── llm.py             # LLM integration — 12 AI analysis functions + _sample_convo helper
├── config.json        # LLM config + password settings (gitignored — keep local only)
├── config.example.json# Public config template (tracked by git; copy it to config.json and fill in your real settings)
├── README.md          # Intro shown on the GitHub repo home page
├── .gitignore         # Excludes config.json / all_keys.json / build artifacts
├── usage_stats.json   # LLM usage stats (total_calls, tokens, by_function, recent calls)
├── custom_tags.json   # Manual identity tag overrides
├── docs/
│   └── index.html     # GitHub Pages showcase page (dark ice-blue theme, WebGL background, matching the app's style)
└── templates/
    └── index.html     # Single-page dark-theme frontend (~2800 lines after 2026-05 refactor)

2026-05-23 Refactor: The summary dashboard has been removed. The page now opens with a header + search panel + 3 capability preview cards (no auto-loading, no overview stats, no topic leaderboard). Flow: search → select contact → click "开始分析" (Start Analysis) → see results.

2026-05-28 Startup Page Redesign: Header de-gradiented, search panel refined, capability preview cards added. See references/startup-redesign.md.

API routes in server.py:

GET /api/contacts?q= — search contacts
POST /api/analyze — full analysis (private/group)
GET /api/summary?range=today|3d|7d — summary (backend still exists but frontend no longer calls it on page load)
GET/POST/DELETE /api/tags — custom identity tags
GET/POST /api/config — LLM settings
GET/POST /api/usage — LLM usage stats (GET returns stats, POST with {action:"reset"} clears)
GET/POST /api/settings — password settings
POST /api/llm/signal|reply|insight — original 3 AI dimensions
POST /api/llm/topics|todos|emotion-track — private chat extended AI dimensions
POST /api/llm/group-topics|group-members|group-vibe|group-signals|group-trace|group-roles — group chat AI dimensions

Removed features (2026-05-23)

The following were deleted from templates/index.html (~500 lines removed):

Summary dashboard HTML (#summaryDashboard with time selector, stats, tabs, cards, leaderboard)
Fixed loading overlay (#loadingOverlay with spinner + timer + skip button)
"← 返回摘要看板" button
All summary JS functions: loadSummary, showSummary, switchTab, applyFilter, setFilter, resetSummaryFilter, renderSummary, renderPrivateCards, renderGroupCards, renderGlobalTopics, renderGlobalTopicList, switchGlobalTab, showTopicDetail, switchTopicTab, jumpToAnalysis, startLoadingTimer, stopLoadingTimer, updateLoadingTimer
Auto-load calls: DOMContentLoaded + document.readyState check both removed
Summary event delegation for card clicks
Summary CSS blocks: .summary-dashboard, .time-selector, .summary-overview, .summary-stat, .summary-filter-bar, .summary-tabs, .summary-tab, .summary-section-title, .summary-grid, .summary-card, .summary-unread, .summary-row, .todo-list, .todo-item, .todo-priority, .todo-snippet, .topic-tags, .topic-tag-sm, .signal-badge, .reply-hint-box, .global-topics, .global-topic-*, .recent-preview, .summary-loading, .loading-overlay, .loading-spinner, .loading-text, .loading-dots, .loading-timer, .back-to-summary

Kept (still used by analysis results):

toggleTopics(), editIdentityTag(), deleteIdentityTag()
Topic rank CSS (.topic-rank-*, .topic-hidden, .topic-expand-btn)
Screenshot button (.screenshot-btn)
Analysis spinner (#spinner)

AI Analysis Cards (redesigned 2026-05-23)

After analysis completes, autoRunLLM() runs all 6 private-chat dimensions sequentially and renders each as a styled card with a per-dimension color scheme. Cards appear below the stats/charts section. No manual buttons: the button bar was removed.

Private Chat Card Layout

Each dimension renders as an .ai-card with a title bar and styled body:

| Dimension | Title Color | Card Class | Visual Style |
|-----------|-------------|------------|--------------|
| 🔍 信号解读 (signal reading) | pink | `.ai-card.signal` | 2×2 grid: emotion/needs/risk(tinted)/tips + reply bar (green/red) |
| 💬 回复建议 (reply suggestions) | green | `.ai-card.reply` | Context note + 3 numbered replies with gradient circles, style tags, italic reasons |
| 📊 关系洞察 (relationship insight) | purple | `.ai-card.insight` | Left-accented quote block, lavender text |
| 🗣️ 话题挖掘 (topic mining) | blue | `.ai-card.topics` | Bullet lines as gradient cards with hover → pink border |
| 📌 待办提取 (todo extraction) | orange | `.ai-card.todos` | Priority badges (red/orange/grey, weight 800) + who + item + context |
| 📈 情绪追踪 (emotion tracking) | fuchsia | `.ai-card.emotion` | Same quote style as insight, different border color |

Typography: titles have letter-spacing, values weight 500, secondary info weight 400, priority badges weight 800 uppercase.

Loading Animation

During analysis, a pulsing blue dot + animated trailing dots show:

<div class="ai-loading-header">
  <div class="ai-loading-pulse"></div> AI 深度分析中<span class="ai-loading-dots"></span>
</div>

The pulse dot scales 0.8→1.3 with opacity 0.2→1 over 1.2s. After all dimensions complete, the loading header is removed from the DOM.

Group Chat (6 dims — JSON-structured, graphically rendered)

Group dimensions return structured JSON (not free text). The frontend renders graphical components (.gv-* CSS classes, 70+ lines of visualization styles).

🔥 Topic leaderboard → POST /api/llm/group-topics

```json
[{"rank": 1, "topic": "topic name", "pct": 30, "trend": "↑", "color": "#58a6ff",
  "keywords": ["kw1","kw2"], "sample": "one original quote"}]
```

Rendered: rank number + colored progress bar + ↑↓→ trend + keyword tags + italic sample quote.

👥 Member profile → POST /api/llm/group-members

```json
[{"name": "member name", "tier": "核心", "msg_share": 35, "style": "话痨",
  "icon": "🔥", "color": "#f85149", "desc": "one-sentence style description"}]
```

Rendered: circular avatar card (colored bg + emoji) + tier badge + share progress bar + %.

🎭 Vibe assessment → POST /api/llm/group-vibe

```json
{"mood": "轻松", "mood_emoji": "😄", "score": 8, "conflict": false,
 "conflict_detail": "", "description": "one sentence", "color": "#3fb950"}
```

Rendered: tinted card + large emoji + big score number + conflict warning if true.

📡 Signal radar → POST /api/llm/group-signals

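```json
[{"type": "通知", "icon": "📢", "priority": "高", "content": "event/gathering summary", "color": "#f85149"},
 {"type": "分享", "icon": "🔗", "priority": "中", "content": "recommended link/article summary", "color": "#d2991d"},
 {"type": "求助", "icon": "❓", "priority": "高", "content": "question or request for help", "color": "#3fb950"},
 {"type": "亮点", "icon": "💡", "priority": "低", "content": "fun discussion/joke", "color": "#a371f7"},
 {"type": "待办", "icon": "📌", "priority": "中", "content": "item not yet followed up", "color": "#58a6ff"}]
```

Categories are designed for casual WeChat groups: 通知/分享/求助/亮点/待办 (notice / share / help request / highlight / todo), not formal ones such as "公告/决策" (announcement / decision). Priorities are 高/中/低 (high / medium / low). Max 6 items, 300 messages sampled. The LLM is instructed: "宁缺毋滥，没发现就返回[]" (prefer nothing over filler; return [] when nothing is found).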

👤 My trace → POST /api/llm/group-trace

```json
[{"type": "@我", "icon": "📌", "content": "who said what", "time": "12:30", "color": "#f85149"}]
```

Rendered: colored circle icon + timestamp + content. Types: @我/参与/错过 (mentioned me / participated / missed), each with a distinct color.

🧩 Role map → POST /api/llm/group-roles

```json
[{"role": "意见领袖", "icon": "👑", "members": ["name"], "color": "#d2991d", "desc": "why"}]
```

Rendered: tinted role card + icon + role name (colored) + member names + description.

Fallback: If the LLM returns non-JSON, try/except passes the raw string through and the frontend renders it as plain text (.replace(/\n/g, " ")).
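
The server-side half of this fallback can be sketched as follows. This is a hedged illustration rather than the code in llm.py: parse_group_result is a hypothetical name, and treating anything other than a JSON array or object as "non-JSON" is an assumption based on the shapes shown above.

```python
import json


def parse_group_result(raw):
    """Return parsed JSON (list or dict) when possible, else the raw string."""
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        # json.JSONDecodeError is a ValueError; TypeError covers a None reply.
        return raw
    # Group dimensions are documented as arrays or objects; a bare number or
    # string is treated as free text and passed through unchanged (assumption).
    if isinstance(parsed, (list, dict)):
        return parsed
    return raw
```

A caller can then branch on isinstance(result, str) to decide between graphical rendering and the plain-text path.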

Group Panel Layout (no tab buttons)

The group chat AI panel has NO horizontal tab buttons. Unlike the private chat panel (which keeps 6 row buttons for manual re-query), the group panel is pure vertical stacking: all 6 dimensions auto-load and display top-to-bottom.

Design decisions:

renderLLMPanel() for groups skips the llm-btns div entirely.
.llm-result has NO max-height; it grows naturally with content. white-space: pre-wrap removed.
Dimension section headers compact for groups: margin-top:10px;padding-top:6px vs private's 14px/10px.
CSS compactification: all .gv-* styles reduced ~25-30% (see table below).
After auto-run, panel._fullResult is saved. Manual re-query via runLLMDim() works if needed.

| Component | Before | After |
|-----------|--------|-------|
| Topic row pad | 10px | 6px |
| Topic bar h | 8px | 6px |
| Member avatar | 40px | 32px |
| Vibe emoji | 2.2rem | 1.6rem |
| Signal padding | 8px | 5px |
| Trace icon | 28px | 24px |
| Role pad | 10px 14px | 6px 10px |

Frontend rendering: in autoRunLLM(), each group dim has a dedicated else if branch that parses the JSON array/object and builds HTML with the .gv-* CSS classes. The rendering handles empty arrays gracefully (shows "未发现" / "暂无足迹" / "无数据", i.e. "nothing found" / "no traces yet" / "no data"). All dim- divs are populated independently, so failures don't cascade.

`runLLMDim()` — single-dimension manual re-query

When a user clicks an individual dimension button, runLLMDim(dim, chatName) fires. It delegates rendering to renderDimSlot(dim, data, result), the same shared function used by autoRunLLM. All 6 group dimensions get full graphical rendering (topic bars, member avatars, vibe cards, signal badges, trace icons, role cards).

After showing the single dimension, a "← 显示全部" (show all) button is prepended. Clicking it calls restoreFullResult(), which restores panel._fullResult (saved at the end of autoRunLLM()).

`renderDimSlot()` — shared rendering function

Centralizes all dimension-specific HTML rendering. Called by both autoRunLLM (via renderDimSlot(dim, d.data, slot)) and runLLMDim. It is the one place to add new visual components, with no duplication between the auto-run and manual-click paths.

Group dimensions: group-topics, group-members, group-vibe, group-signals, group-trace, group-roles, each with dedicated .gv-* CSS classes.

Generic fallback: for unknown dims or non-JSON responses, renders as plain text or a compact JSON string.

Adding a new AI dimension

Add LLM function in llm.py — use _sample_convo(messages, max_n=200) helper, return None if not is_available()
Add endpoint in server.py — standard POST pattern
Add entry to the DIMS array in autoRunLLM() (index.html)
Add rendering case in autoRunLLM's per-dimension router (the if (dim === ...) chain)

`_sample_convo()` helper (llm.py)

Uniform message sampling for group or generic use. Returns (convo_str, total_count). Sender="" → "?", no identity mapping. For private chats, prefer _format_private_convo().

def _sample_convo(messages, max_n=200):
    # Evenly sampled, formatted as "[HH:MM] sender: content[:120]"
    # Non-text types mapped to [语音]/[图片]/[表情]/[通话]
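
The stub above only documents the behavior, so here is one way it could be filled in. This is a minimal sketch, not the body of the real _sample_convo(): the name sample_convo_sketch, the even-stride sampling, and the mapping from type values to placeholders are all assumptions.

```python
# Assumed mapping from the message "type" field to the placeholders named above.
PLACEHOLDERS = {"语音": "[语音]", "图片": "[图片]", "表情": "[表情]", "通话": "[通话]"}


def sample_convo_sketch(messages, max_n=200):
    """Evenly sample up to max_n messages and format them for an LLM prompt."""
    total = len(messages)
    if total > max_n:
        # Pick max_n indices spread evenly across the whole history.
        step = total / max_n
        picked = [messages[int(i * step)] for i in range(max_n)]
    else:
        picked = list(messages)
    lines = []
    for m in picked:
        hhmm = (m.get("time") or "")[-5:]      # "YYYY-MM-DD HH:MM" -> "HH:MM"
        sender = m.get("sender") or "?"        # empty sender becomes "?"
        body = PLACEHOLDERS.get(m.get("type")) or (m.get("content") or "")[:120]
        lines.append(f"[{hhmm}] {sender}: {body}")
    return "\n".join(lines), total
```

The second return value is the size of the full history, not the number of sampled lines, which matches the (convo_str, total_count) contract described above.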

`_format_private_convo()` — USE FOR ALL PRIVATE CHAT AI PROMPTS (llm.py)

The canonical way to format private chat conversations for LLM input. Returns (convo_str, total_count, preamble_str).

Critical — identity labeling: sender="" or sender=contact_name → labeled as the contact's actual name (e.g. "{{联系人姓名}}"). sender=user_name → labeled as "我". The preamble explains who is who:

对话身份：「{{联系人姓名}}」是联系人，「我」是用户（后台使用人 {{用户昵称}}）。
[22:15] {{联系人姓名}}: 今天好累啊
[22:16] 我: 怎么了

This replaces the old ambiguous format where empty sender → "对方" and AI had no idea who "对方" was.
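
The labeling rule can be illustrated with a short sketch. It is not the implementation in llm.py: format_private_convo_sketch is a hypothetical name, and the preamble wording is copied from the example above; everything else (field access, no truncation or sampling) is assumed for brevity.

```python
def format_private_convo_sketch(messages, contact_name, user_name):
    """Label each message as the contact's name or "我" and build the preamble."""
    preamble = (
        f"对话身份：「{contact_name}」是联系人，「我」是用户"
        f"（后台使用人 {user_name}）。"
    )
    lines = []
    for m in messages:
        sender = m.get("sender") or ""
        if user_name and sender == user_name:
            label = "我"                 # the user's own messages
        elif sender in ("", contact_name):
            label = contact_name         # empty sender means the contact
        else:
            label = sender               # unexpected sender: keep as-is
        hhmm = (m.get("time") or "")[-5:]
        lines.append(f"[{hhmm}] {label}: {m.get('content') or ''}")
    return "\n".join(lines), len(messages), preamble
```

Checking user_name first means an unconfigured (empty) nickname never causes the contact's empty-sender messages to be labeled as "我".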

Every private-chat AI function in llm.py must use this helper — the preamble goes directly into the system prompt. All 8 private-chat functions now use it: analyze_signal, suggest_reply, analyze_chat_insight, analyze_topics_ai, extract_todos_ai, track_emotion_ai, generate_advice_ai, generate_phase_insight.

Each function accepts contact_name and user_name parameters. Server endpoints pass them via chat_name and user_name fields in the JSON body. Frontend stores _lastUserName from data.user_name and passes it in all API calls.

Settings Page

Four-tab layout in the ⚙️ settings modal (the 👤 个人信息 / Profile tab opens first by default):

| Tab | Content |
|-----|---------|
| 👤 个人信息 (Profile) | User's WeChat nickname (e.g., "{{用户昵称}}"). AI uses this to identify "who am I" in chats. Stored as `user_nickname` in config.json. |
| 🤖 大模型 API (LLM API) | Provider/Base URL/Model/API Key + enable toggle + test button |
| 🔐 安全设置 (Security) | Password enable toggle + change password (min 3 chars) + confirm |
| 📊 用量统计 (Usage stats) | LLM call stats: total calls, tokens (input/output), cost estimate, per-function breakdown table, refresh + reset buttons |

Settings stored in config.json:

{
  "llm": { ... },
  "password": "",
  "password_enabled": false,
  "user_nickname": ""
}

When password_enabled: false, the login_required decorator skips all auth checks. Login page redirects to index. User can still access settings to re-enable.

Endpoints: GET/POST /api/settings (password is always masked as * in responses).
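
The pass-through behavior of login_required can be sketched without any web framework. This is an illustrative sketch only: make_login_required and its three callables are hypothetical stand-ins for whatever server.py uses to load config.json, check the session, and redirect to the login page.

```python
from functools import wraps


def make_login_required(load_config, is_logged_in, on_denied):
    """Build a decorator that enforces auth only when password_enabled is true."""
    def login_required(view):
        @wraps(view)
        def wrapper(*args, **kwargs):
            cfg = load_config()
            # With password protection off, every request passes straight through.
            if not cfg.get("password_enabled", False):
                return view(*args, **kwargs)
            if is_logged_in():
                return view(*args, **kwargs)
            return on_denied()
        return wrapper
    return login_required
```

Because the config is re-read on every call, toggling password_enabled in the settings page takes effect on the next request in this sketch.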

Nickname Flow Through the System

User sets nickname in ⚙️ → saved via POST /api/settings with {user_nickname: "your_nickname"}
server.py _save_server_config(user_nickname=...) persists to config.json
Frontend saveProfile() stores in window._configuredNickname
renderResults(): window._lastUserName = window._configuredNickname || data.user_name || ""
renderLLMPanel() stores panel._userName = window._lastUserName
autoRunLLM() passes uname in all private-chat API calls: {..., user_name: uname}
Server endpoints extract user_name and pass to LLM functions
_format_private_convo() uses it to label messages: sender=user_name → "我"

Priority: configured nickname > auto-detected from messages (data.user_name). If the user hasn't configured a nickname, the system falls back to the first non-empty sender found in the chat history. Without this setting, group chat dimensions like "我的足迹" (My trace) and "信息雷达" (Signal radar) cannot accurately identify which messages are from the user.
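
The priority rule above amounts to a two-step lookup. A minimal sketch, assuming a hypothetical helper name resolve_user_name and the message shape described in the JSON format notes:

```python
def resolve_user_name(configured_nickname, messages):
    """Prefer the configured nickname; else use the first non-empty sender."""
    if configured_nickname and configured_nickname.strip():
        return configured_nickname.strip()
    for m in messages:
        sender = (m.get("sender") or "").strip()
        if sender:
            return sender
    return ""  # nothing configured and no identifiable sender
```

In a group chat the first non-empty sender may well be someone else, which is why the fallback is unreliable there and the configured nickname matters.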

API Contract: username field

The /api/analyze endpoint accepts an optional username field alongside contact. When provided, analyze_group() uses username (the wxid or xxx@chatroom ID) for wx history queries instead of the display name.

Why this matters

wx history matches by username (the WeChat internal ID like 57931515500@chatroom), not by display name. Display names can contain special characters or may have been renamed, causing wx history to fail with "找不到...的消息记录" ("no message records found for ...").

Data flow

Frontend: selectedContact.username → POST /api/analyze {contact, username, ...}
Server:   analyze_group(contact, since, until, username=username)
          query = username or contact_name  # username wins if provided
          cmd = f'wx history {_q(query)} --since {since}...'
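
The "username wins" rule and the quoting step can be sketched as a small standalone function. This is a hedged sketch: build_history_cmd is a hypothetical name, and only the --since, --until and --json flags already shown in this document are used. shlex.quote is the standard-library call that the _q helper wraps.

```python
import shlex


def build_history_cmd(contact_name, since, until, username=None):
    """Build a wx history command, preferring the stable username when given."""
    query = username or contact_name  # username wins if provided
    return (
        f"wx history {shlex.quote(query)} "
        f"--since {shlex.quote(since)} --until {shlex.quote(until)} --json"
    )
```

Quoting every dynamic part keeps a display name such as "a b; rm" from being interpreted by the shell when the command runs with shell=True.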

Adding username to a new feature

When making a new API call that uses wx history, always pass the username

... [OUTPUT TRUNCATED - 9289 chars omitted out of 59289 total] ...

vy → near-black, #09090c base)

Mouse radial ripple: sin(md*12 - t*3.5) * exp(-md*3.5) (distance-based decay)
Domain warp: 3-iteration sin/cos feedback loop for slow fluid drift
Highlight sparkle: pow(sin(...), 7.0) for occasional bright specks

CSS setup:

#bg-fluid {
  position: fixed; inset: 0; z-index: 0;
  pointer-events: none; opacity: 0.55;
}
.app { position: relative; z-index: 1; }

Graceful fallback: if WebGL context is unavailable, the canvas is hidden via canvas.style.display='none'. No visual breakage.

Tuning: adjust opacity 0.35–0.7 in CSS. Shader is ~70 lines of JS (inline . The matched old_string may include the original ending, leaving it doubled.

tags self-close, so count("") in grep checks.

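🔴 Don't float the settings gear in the top-right corner with position: fixed: the settings gear (⚙️) is a visual element of .header-top. It must live inside the .header-top flex container and be pushed to the right with margin-left: auto so that it stays on the same line as the h1 title. ❌ Wrong: position: fixed; top: 20px; right: 24px; as a standalone span outside .header. ✅ Right: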

```html

```

```css
.settings-gear {
  margin-left: auto;
  display: inline-flex;
  ...
}
```

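This rule applies to every header-level action icon: if it belongs in the row, place it in the row with flex layout; don't detach it from its container with fixed/absolute positioning.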

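🔴 HTML changed but the browser doesn't pick it up: although server.py sends a Cache-Control: no-store response header, the browser may still keep the JavaScript execution context of an already loaded page. After editing templates/index.html and restarting the server, the user still sees the old frontend behavior (the old JS logic is still in memory). The user must do a hard refresh (Cmd+Shift+R) or "Empty Cache and Hard Reload". To verify: run allContactsCache.length in the browser console to check the cached count (it should be ~650, not the old number), or check whether the body of renderSuggestions contains the if (!query) guard.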
Password auth toggle edge case — when password_enabled=false, settings page is still accessible (login_required passes through). User can re-enable auth from settings.
Identity dimension overlap — when adding identity-specific dimensions, verify no specific dim overlaps with a universal dim (e.g., family's original "情感表达" overlapped with universal "关系温度").
🔴 replace_all=true on overlapping function boundaries — when two consecutive functions share similar code patterns (e.g., group_signal_radar and group_my_trace both start with def ... and end with return {"...": content.strip()}), patch(mode='replace', replace_all=True) with context that spans function boundaries can match the wrong function, duplicating or garbling code. The diff may show except: pass lines removed, duplicate variable assignments, or garbled string fragments. Fix: after such a patch, read_file the entire affected region and verify every function has correct structure. For cross-function patches, use execute_code with exact line-number targeting instead of replace_all.
Signal radar category design — for casual WeChat groups, use categories the LLM can actually find: "通知/分享/求助/亮点/待办". Avoid formal categories like "公告/链接/决策" that don't exist in casual chat. Also increase message sample size (200→300) and add explicit rules: "只提取真的有用的信息，不要硬凑", "content要具体（包含谁、什么事）", "宁缺毋滥，没发现就返回[]". Signal radar being empty is often a prompt-design problem, not a data problem.
🔴 Shell injection in _run_wx() — calling subprocess.run(cmd, shell=True) with user-controlled input (e.g. search query) is a shell injection vulnerability.
❌ WRONG fix: shlex.split(cmd) + shell=False. This breaks all commands that use double-quote wrapping (e.g. f'wx history "{name}" --since ...') because shlex.split() fails with ValueError: No closing quotation on complex shell strings.
✅ CORRECT fix: Keep shell=True, but wrap ALL dynamic parameters with shlex.quote(). Replace f'..."{variable}"...' with f'...{_q(variable)}...' where _q = shlex.quote. This keeps shell quoting intact while safely escaping metacharacters.
Helper pattern (add near imports in analyzer.py): def _q(s: str) -> str: return shlex.quote(s)
Affected patterns: name, contact_name, group_name, c_name, search_word, query — 13 call sites across the codebase.
Imports needed: import shlex, import time (for sessions cache TTL).
🔴 Settings modal double-active — if both panelProfile and panelLlm have class="settings-panel active" in HTML, both display simultaneously when modal opens. switchSettingsTab("profile") in toggleSettings() fixes it at runtime, but the markup should only have active on the default tab (panelProfile). Remove active from panelLlm.
Summary loading failure is silent — loadSummary() catch block only did console.error. Must also call showError() so the user sees a visible error message, not just a blank dashboard.
Enter on direct input forced private chat — pressing Enter without selecting a suggestion hardcoded chat_type: "private" and auto-ran analysis. Fix: don't auto-run. Set selectedContact with chat_type: "" and enable the button, let the user click manually.
No :focus-visible styling — keyboard navigation had no visible focus indicator. Fix: add global :focus-visible { outline: 2px solid var(--accent); outline-offset: 2px; } with specific overrides for elements that use border-color instead (use box-shadow: 0 0 0 2px var(--accent) to avoid double ring).
🔴 scores["friend"] not assigned: _detect_identity() computes friend_score and sets features_map["friend"], but must also set scores["friend"] = friend_score. Forgetting this means the "朋友" identity never appears in the frontend's "备选身份评分" list. Check ALL 6 identities (lover/family/business/colleague/service/friend) have BOTH scores[key] and features_map[key] assigned.
🔴 Group LLM analysis only gets 15 messages: analyze_group() returns recent_messages (last 15, content truncated to 80 chars) but the frontend's _lastMessages falls back data.messages || data.recent_messages. For group analysis, data.messages was missing, so LLM only saw 15 truncated messages. Signal radar and my-trace need more context to produce results. ✅ Fix: analyze_group() now returns a messages field with up to 100 messages (full content, not truncated to 80 chars), formatted as [{sender, content, time}]. The frontend picks this up via data.messages and passes 100 messages to all LLM dimensions.
🔴 innerHTML += destroys DOM references in async loops: When building cards in a loop and making async fetch calls per card, each result.innerHTML += cardHtml destroys and recreates ALL previous DOM nodes. Any slot variables obtained via getElementById before the next += become detached — setting their innerHTML later has no visible effect. ✅ Fix: build all HTML in one pass (accumulate cardsHtml string), append once with result.innerHTML += cardsHtml, THEN iterate dimensions to populate each card.
🔴 Ternary (fn(...), "") in assignment clears innerHTML: When a rendering function like renderDimSlot() sets slot.innerHTML internally, wrapping it in a ternary produces a value that gets assigned to slot.innerHTML. E.g. slot.innerHTML = dim.startsWith("group-") ? (renderDimSlot(dim, data, slot), "") : ... — the comma operator evaluates renderDimSlot() then returns "", which is assigned to slot.innerHTML, clearing the content just set. ✅ Fix: use if/else branches instead of a ternary for assignment — call renderDimSlot() standalone without assigning its return to innerHTML.
🔴 chat_type empty in wx sessions: wx sessions may return entries without chat_type. The analyze_summary() sessions loop fetches messages for these but then silently skips them (falls through both if chat_type == "private" and elif chat_type == "group"). Fix: after enrichment, infer chat_type from sender count (>1 distinct non-empty senders = group, else if name in known_groups = group, else private). This is applied BEFORE the if/elif block.
🔴 /api/summary traceback exposure: The error handler returned traceback.format_exc() to the client. Fixed: only print_exc() to console, return just str(e) in the response.
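🔴 LLM endpoint false positives (the connection test always reports "✅ 成功", i.e. success): every /api/llm/* endpoint used to return jsonify({"ok": True, "data": result}) even when the LLM returned {"error": "..."}. Fix: server.py gained a _check_llm_result(result) helper that returns {"ok": False} when result is None or contains an "error" key. All 12+ endpoints have been updated. Remember to use this helper in new endpoints.
🔴 API key in config.json may be truncated or corrupted: a key entered through the settings panel can get truncated (it shows up as a literal ..... or wrong characters in the middle). Symptom: LLM calls return a 401 auth error. Fix: edit config.json directly, write the correct key, then restart the server.
🔴 extract_todos_ai prompt priority: todo extraction had no priority ordering by default. At the user's request it now extracts the tasks of 「我」 (the user) first, then the todos of 「对方」 (the contact) as reminders, then agreements between 「双方」 (both sides). Ordering instructions and a three-level priority explanation were added to the system prompt at line 604 of llm.py. The who field now takes the values 我/对方/双方, with "我" first.
🔴 list_contacts() probe loop ignored _run_wx exceptions: probing a contact's history calls _run_wx(f'wx history {_q(name)} -n 1 --json'). If wx-cli cannot find the contact it raises RuntimeError, which took down the whole /api/contacts endpoint. Fix: wrap the probe in try/except and skip contacts that cannot be found. In addition, every place that reads name (sessions loop, contacts supplement loop, probe loop) now calls .strip() to remove leading/trailing whitespace, so names like " 河马,🐸" no longer break the query.
🔴 list_contacts() must filter raw IDs instead of showing them: the chat/display fields returned by wx sessions and wx contacts can contain IDs that are not human-readable: WeCom (@qy_g/@qy_u), OpenIM (@openim), system hard-coded entries (@hardcode), nameless numeric group chats (45287246329@chatroom), and WeCom raw IDs (ww197...). Fix: list_contacts() defines an inner _raw_id() function that detects these patterns and applies the filter in all 3 loops (sessions, contacts supplement, probe). Real nicknames that contain @, such as 张东民@养老管家, are unaffected (they start with Chinese characters).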
🔴 Identity picker overrides identity_key, not just label: The old editIdentityTag() used prompt() and only set a custom label — the auto-detected identity_key was kept, so analysis dimensions (index, signals, AI) didn't change. The new showIdentityPicker() (2026-05-26) opens a modal with 6 predefined identities (lover/family/business/colleague/service/friend) plus a "新建身份" option. Selecting an identity saves {identity_key, label} to custom_tags.json and re-runs analysis. In analyze_private(), if custom_tag has identity_key, it overrides identity["identity"] — switching to that identity's index computation, signal keywords, and AI dimension prompts. Custom identities default to friend analysis base. See references/identity-picker.md.
🔴 Custom tag storage format changed: custom_tags.json now stores {"contact": {"identity_key": "lover", "label": "恋人"}} instead of {"contact": "恋人"}. Backward compat: _load_custom_tag() detects old string format via isinstance(raw, str) and converts. When setting a tag via /api/tags POST, pass {contact, tag, identity_key} — if identity_key is omitted, only the label is overridden (legacy mode). DELETE still works the same way (removes the entry entirely).
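🔴 analyze_private() / analyze_group() must catch _run_wx exceptions: _run_wx() raises RuntimeError when wx-cli exits non-zero with empty stdout (e.g. "找不到常燕的消息记录", "no message records found for 常燕"). Without try/except the exception propagates to api_analyze() and the raw error is returned to the user. Fix: both functions wrap the wx history call in try/except and return {"error": f"无法读取... {e}"}. See references/identity-picker.md.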
🔴 wx contacts returns 20,000+ entries including non-friend group members: list_contacts() must NEVER use wx contacts as the primary source for private (friend) contacts. wx contacts queries the entire local WeChat contact DB, which includes every group member you've ever seen (19,000+ non-friends). Use wx sessions -n 9999 --json instead — sessions with chat_type=private (~388 entries) are the authoritative friend list (you can't have private chats with non-friends in WeChat). For groups, supplement sessions results with wx contacts --query filtered to @chatroom entries only. This keeps total results at ~650 (not 20,000+) and ensures group members who aren't friends never appear as private chat options.
🔴 Canvas animation NaN kills rAF silently: Math.pow(negative, nonIntegerExponent) returns NaN. When this propagates into Canvas method arguments (e.g. createRadialGradient(..., NaN)), it throws an uncaught error that silently terminates the requestAnimationFrame loop. Always clamp bases before fractional exponents: Math.max(0, 1 - progress). See references/animated-background.md and references/intro-animation.md.
🔴 Silent analysis failure triad: When runAnalysis() fetch succeeds but results stay empty with no error shown, check three signals: spinner.classList.contains('active') (false=done), results.innerHTML.length (0=empty), errorBox.classList.contains('active') (false=no error). This means renderResults() threw uncaught — wrap it in the same try-catch as the fetch. See references/bug-fix-reference.md.
🔴 API key round-trip truncation (three-part bug): (1) Server masked API key on GET via get_config() — frontend displayed sk-e36...9c2c in the input field. (2) User clicked save without re-entering the full key. (3) Frontend sent the masked value from the input as the API key. The POST handler correctly returned the raw key from _load_config(), but the frontend's newKey || d.config.api_key branch used the input value (masked) over the response (raw). Fix: get_config() now returns the raw (unmasked) config. The input field shows the full key, so any save round-trip preserves it. Also: frontend should detect ... in the entered key and fall back to the server response. See references/bug-fix-reference.md.
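
The backward-compatible tag format described in the "Custom tag storage format changed" item above can be sketched as a small normalizer. It is an illustration, not the body of _load_custom_tag(): normalize_custom_tag is a hypothetical name, and representing a missing identity_key as None is an assumption.

```python
def normalize_custom_tag(raw):
    """Convert either storage format into {"identity_key": ..., "label": ...}."""
    if raw is None:
        return None                       # no custom tag stored for this contact
    if isinstance(raw, str):
        # Legacy format: only a label was stored, so the identity is not overridden.
        return {"identity_key": None, "label": raw}
    return {
        "identity_key": raw.get("identity_key"),
        "label": raw.get("label", ""),
    }
```

A caller can then override the detected identity only when identity_key is not None, which corresponds to the legacy label-only mode described above.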

Additional Reference Files

The following reference files from consolidated sibling skills provide deeper detail on specific topics:

references/animated-background.md — Persistent canvas ambient background (6-blob wander steering system)
references/architecture.md — System architecture overview (project structure, API routes, data flow)
references/bug-fix-reference.md — Specific bug root causes and fixes (DOM destruction, API key, friend score)
references/bug-fixes-and-patterns.md — IME, pinyin, screenshot, Chart.js, contacts data source patterns
references/desktop-packaging.md — PyInstaller + Tauri v2 native app packaging pipeline
references/github-pages-setup.md — GitHub Pages showcase deployment steps and style-alignment rules
references/intro-animation.md — Canvas-based intro animation design spec (particles, colors, timing)
references/liquid-glass-styling.md — Glass morphism CSS design system (tokens, components, blur hierarchy)
references/performance-benchmarks.md — Performance test results and optimization history
references/quick-nav-redesign.md — Floating navigation widget design history
references/skill-hub-publishing.md — Skill Hub pre-publication checklist (personal data cleanup, consolidation, installation notes)

LLM Usage Tracking

All LLM calls go through chat() in llm.py, which tracks usage automatically.

Architecture

chat(system_prompt, user_prompt, ..., caller="信号分析")
  ├─ Sends request to configured LLM provider
  ├─ Reads usage.prompt_tokens / usage.completion_tokens from API response
  ├─ _track_usage(caller, in_tok, out_tok)
  │   ├─ Increments total_calls, total_input/output_tokens
  │   ├─ Estimates cost: $0.28/M input + $1.10/M output (deepseek-v4-flash)
  │   ├─ Updates by_function[caller] breakdown
  │   └─ Appends to calls log (last 100 entries)
  └─ _save_usage() → usage_stats.json

GET /api/usage  → get_usage_stats() → returns full usage dict
POST /api/usage → reset_usage_stats() → clears everything
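
The bookkeeping step in the diagram above can be sketched as follows. This is an illustration under stated assumptions, not the code in llm.py: the prices and the 100-entry cap come from the diagram, while the `estimated_cost` key, the per-call log fields, and the function names `empty_stats` and `track_usage` are hypothetical.

```python
from datetime import datetime

# Prices quoted in the diagram above, in USD per million tokens.
INPUT_PRICE_PER_M = 0.28
OUTPUT_PRICE_PER_M = 1.10
MAX_CALL_LOG = 100  # only the most recent entries are kept


def empty_stats() -> dict:
    """Empty usage dict (assumed shape)."""
    return {
        "total_calls": 0,
        "total_input_tokens": 0,
        "total_output_tokens": 0,
        "estimated_cost": 0.0,
        "by_function": {},
        "calls": [],
    }


def track_usage(stats: dict, caller: str, in_tok: int, out_tok: int) -> dict:
    """Update totals, the per-caller breakdown, and the capped call log."""
    cost = (in_tok * INPUT_PRICE_PER_M + out_tok * OUTPUT_PRICE_PER_M) / 1_000_000

    stats["total_calls"] += 1
    stats["total_input_tokens"] += in_tok
    stats["total_output_tokens"] += out_tok
    stats["estimated_cost"] += cost

    fn = stats["by_function"].setdefault(
        caller, {"calls": 0, "input_tokens": 0, "output_tokens": 0}
    )
    fn["calls"] += 1
    fn["input_tokens"] += in_tok
    fn["output_tokens"] += out_tok

    stats["calls"].append(
        {"caller": caller, "input_tokens": in_tok, "output_tokens": out_tok,
         "cost": cost, "time": datetime.now().isoformat(timespec="seconds")}
    )
    # Drop the oldest entries so the log never exceeds the cap.
    stats["calls"] = stats["calls"][-MAX_CALL_LOG:]
    return stats
```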

Caller Names

All 14 chat() call sites are tagged with human-readable caller names:

Caller	Function	Type
--------	----------	------
信号分析	analyze_signal	private
回复建议	suggest_reply	private
关系洞察	analyze_chat_insight	private
阶段洞察	generate_phase_insight	private
话题分析	analyze_topics_ai	private
趋势追踪	extract_todos_ai (topics)	private
待办提取	extract_todos_ai (todos)	private
情绪追踪	track_emotion_ai	private
群话题榜	group_topic_leaderboard	group
群成员画像	group_member_profile	group
群氛围评估	group_vibe_check	group
群信号雷达	group_signal_radar	group
群内足迹	group_my_trace	group
群角色地图	group_role_map	group

Frontend Panel

"📊 用量统计" tab in the ⚙️ settings modal:

4 overview cards: total calls, total tokens, input tokens, estimated cost
Per-function breakdown table: name, calls, input/output tokens, percentage
Refresh + Reset buttons
Auto-loads when tab is selected
Server restart required for host binding change

Hallmark audit results and CSS fixes: references/hallmark-audit.md.

Token Counting

Token counts are read from usage.prompt_tokens and usage.completion_tokens in the API response (the OpenAI-compatible usage fields that DeepSeek returns). If they are unavailable (older providers), the tracker falls back to a character-based estimate: 1 token ≈ 3 chars for Chinese text.
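
A minimal sketch of that fallback, assuming the response has already been parsed into a dict. `estimate_tokens` and `token_counts` are hypothetical names, and rounding the estimate up is an assumption; the source only gives the 3-characters-per-token ratio.

```python
from typing import Tuple


def estimate_tokens(text: str) -> int:
    """Character-based estimate: roughly 1 token per 3 characters (rounded up)."""
    return (len(text) + 2) // 3


def token_counts(response: dict, prompt_text: str, reply_text: str) -> Tuple[int, int]:
    """Prefer the provider-reported usage; estimate from text when it is missing."""
    usage = response.get("usage") or {}
    in_tok = usage.get("prompt_tokens")
    out_tok = usage.get("completion_tokens")
    if in_tok is None:
        in_tok = estimate_tokens(prompt_text)
    if out_tok is None:
        out_tok = estimate_tokens(reply_text)
    return in_tok, out_tok
```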

Persistence

Stats survive restarts — written to usage_stats.json after every LLM call. File is created on first tracked call. Reset clears the file to empty defaults.
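
The load, save, and reset behaviour described above could look roughly like this. It is a sketch only: the shape of the empty defaults and the helper names `load_usage`, `save_usage`, and `reset_usage` are assumptions, not the project's actual functions.

```python
import copy
import json
from pathlib import Path

# Assumed shape of the empty defaults; the real field set may differ.
DEFAULT_STATS = {
    "total_calls": 0,
    "total_input_tokens": 0,
    "total_output_tokens": 0,
    "by_function": {},
    "calls": [],
}


def load_usage(path: Path) -> dict:
    """Load saved stats, or return empty defaults if nothing has been tracked yet."""
    if not path.exists():
        return copy.deepcopy(DEFAULT_STATS)
    return json.loads(path.read_text(encoding="utf-8"))


def save_usage(path: Path, stats: dict) -> None:
    """Write stats to disk; ensure_ascii=False keeps Chinese caller names readable."""
    path.write_text(json.dumps(stats, ensure_ascii=False, indent=2), encoding="utf-8")


def reset_usage(path: Path) -> dict:
    """Overwrite the file with empty defaults and return them."""
    stats = copy.deepcopy(DEFAULT_STATS)
    save_usage(path, stats)
    return stats
```

Calling `save_usage` after every tracked call matches the "written after every LLM call" behaviour, and the file only appears once the first call has been saved.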

Related reference files:

Full dimension tables, phase labels, index titles, and signal keywords: references/identity-dimensions.md
Debugging recipes for identity scoring bugs and chat_type inference: references/identity-debugging.md
Extension patterns for adding new AI dimensions, identity types, and summary fields: references/extension-patterns.md
Performance optimization techniques (parallel wx, caching, batch analysis): references/performance-optimization.md
Frontend pitfalls (DOM destruction, ternary overwrite, html2canvas, IME, pinyin): references/frontend-pitfalls.md

Six auto-detected identities with scoring rules:

Identity	Key signals	Dedicated index
----------	------------	-----------------
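lover 💕	affection-word density, late-night share, >20 messages/day, multimodal >8%	intimacy (7-dim weighted)
family 👨‍👩‍👧	family words >10, care words > 2× affection words, calls >5%	family_index (care/frequency/calls)
business 💼	business words >5, links >20%, formal tone	biz_index (response/professionalism/duration)
colleague 🏢	work words >5, weekdays >70%, office hours >50%	colleague_index (work hours/density/responsiveness)
service 📞	service words >3, initiated by the other party >60%	service_index (response/resolution/politeness/care)
friend 🤝	fallback; casual short messages	friend_index (frequency/variety/casualness/responsiveness)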

Priority: lover wins once its score reaches 40, unless another identity's score exceeds the lover score by 25 or more.
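
The priority rule can be sketched as a small selection function. Only the two thresholds come from the rule above. Everything else is an assumption made for illustration: the hypothetical name `pick_identity`, treating the margin as inclusive, and picking the highest-scoring other identity (with friend as the fallback) when lover is below its threshold.

```python
LOVER_MIN_SCORE = 40   # from the priority rule above
OVERRIDE_MARGIN = 25   # from the priority rule above


def pick_identity(scores: dict) -> str:
    """Pick an identity from a mapping of identity name -> score."""
    lover = scores.get("lover", 0)
    others = {name: s for name, s in scores.items() if name != "lover"}
    best_other = max(others, key=others.get) if others else None

    if lover >= LOVER_MIN_SCORE:
        # Another identity only overrides lover with a clear margin.
        if best_other is not None and others[best_other] >= lover + OVERRIDE_MARGIN:
            return best_other
        return "lover"

    # Assumption: below the lover threshold the best other identity wins,
    # and friend is the fallback when nothing scored.
    if best_other is not None and others[best_other] > 0:
        return best_other
    return "friend"
```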

Identity-Aware Phase Labels

_detect_phases(enriched, user_name, identity) uses the same volume-trend algorithm for all identities but applies different labels per identity type. The labels are defined in the IDENTITY_PHASES dict (~80 lines). The algorithm compares the most recent 3 months against the earlier average and returns one of escalation/growth/cooling/stable/building/initial.

Phase	lover	friend	colleague	family	business	service
-------	-------	--------	-----------	--------	----------	---------
escalation	爆发期🔥	热络期🔥	紧密协作⚡	亲密期💕	深度合作🔥	高频互动🔥
growth	上升期📈	升温期📈	协作增多📈	回暖期📈	合作推进📈	服务增多📈
stable	稳定期💚	老铁🤝	稳定协作🤝	稳定联系🤝	稳定合作🤝	稳定服务✓
cooling	冷淡期❄️	疏远期🌥️	沟通减少📉	疏远期🌥️	合作放缓📉	减少咨询📉
building	构建期🌱	发展期🌱	建立默契🌱	重建联系🌱	建立关系🌱	建立信任🌱
initial	初识期🌱	点头之交👋	初识🏢	疏于联系📞	初步接触💼	新客户🆕
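
The label lookup implied by the table above can be sketched like this. Only two columns are copied here for brevity, and the nested-dict shape, the `phase_label` helper, and the fallback to the friend labels are assumptions rather than the actual layout of IDENTITY_PHASES.

```python
# Two columns copied from the table above; the real dict covers all six identities.
IDENTITY_PHASES = {
    "lover": {
        "escalation": "爆发期🔥", "growth": "上升期📈", "stable": "稳定期💚",
        "cooling": "冷淡期❄️", "building": "构建期🌱", "initial": "初识期🌱",
    },
    "friend": {
        "escalation": "热络期🔥", "growth": "升温期📈", "stable": "老铁🤝",
        "cooling": "疏远期🌥️", "building": "发展期🌱", "initial": "点头之交👋",
    },
}


def phase_label(phase: str, identity: str) -> str:
    """Map a phase key to its identity-specific label.

    Assumption: unknown identities use the friend labels, and an unknown
    phase key is returned unchanged.
    """
    labels = IDENTITY_PHASES.get(identity, IDENTITY_PHASES["friend"])
    return labels.get(phase, phase)
```

Because the trend algorithm only produces the phase key, adding an identity means adding one more column of labels without touching the detection logic.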

Call order in analyze_private(): detect identity first (or use placeholder "lover"), then re-detect phases with correct identity key. Both branches (full_enriched and else) follow the same pattern.

Identity-Aware AI Insight Prompts

analyze_chat_insight(messages, chat_name, identity) in llm.py uses a systematic 6-dimension framework (3 universal + 3 identity-specific) per identity, defined in the DIMS dict. This replaced the old ad-hoc PROMPTS dict.

Universal dimensions (all identities, semantics adapted):

互动节奏 — frequency, response speed, time patterns
关系温度 — warmth, closeness, emotional tone
发展趋势 — trajectory, growth/decline signals

Identity-specific dimensions:

Identity	Dim 4	Dim 5	Dim 6
----------	-------	-------	-------
💕 lover	情感深度 — direct vs indirect expression, emotional resonance	权力动态 — who initiates, who leads, who compromises	未来信号 — commitment hints, shared plans
🤝 friend	兴趣共鸣 — shared interests, recommendations	互惠平衡 — give/take ratio, help symmetry	圈子融合 — mutual friends, group interactions
🏢 colleague	信息同步 — clarity, missed messages, confirmations	边界感 — after-hours comms, work/personal separation	依赖模式 — one-way asks vs mutual collaboration
👨‍👩‍👧 family	责任分担 — chores/care/childcare topics	代际动态 — elder/peer/younger interaction patterns	生活参与 — daily life sharing, major decision consultation
💼 business	利益对齐 — win-win vs zero-sum signals	专业匹配 — capability complement, resource fit	风险评估 — breach/friction signals, uncertainty
📞 service	问题解决 — first-contact resolution, repeat issues	主动服务 — reminders, check-ins, proactive care	客户粘性 — recommendation intent, loyalty signals

Pitfall: when adding a new identity, verify that no specific dimension overlaps with a universal dimension. Family originally had "情感表达" as dim 6, which overlapped with universal "关系温度" (both measured care/warmth). Fixed by changing to "生活参与".

LLM prompt format: lists all 6 dims, asks for 1-2 sentences per dim + 2-3 actionable suggestions. Output uses • bullets for analysis and 💡 for suggestions.
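
To make the 3 + 3 structure concrete, here is a minimal sketch of how such a prompt could be assembled. The dimension names are copied from the lists above, but the two-constant layout, the `build_insight_prompt` helper, the fallback to the friend dimensions, and the English instruction wording are all assumptions; the real DIMS dict and prompt text in llm.py may be organized differently.

```python
UNIVERSAL_DIMS = ["互动节奏", "关系温度", "发展趋势"]

# Identity-specific dimension names, copied from the table above.
SPECIFIC_DIMS = {
    "lover": ["情感深度", "权力动态", "未来信号"],
    "friend": ["兴趣共鸣", "互惠平衡", "圈子融合"],
    "colleague": ["信息同步", "边界感", "依赖模式"],
    "family": ["责任分担", "代际动态", "生活参与"],
    "business": ["利益对齐", "专业匹配", "风险评估"],
    "service": ["问题解决", "主动服务", "客户粘性"],
}


def build_insight_prompt(chat_name: str, identity: str) -> str:
    """Assemble a 6-dimension prompt (3 universal + 3 identity-specific)."""
    # Assumption: unknown identities fall back to the friend dimensions.
    dims = UNIVERSAL_DIMS + SPECIFIC_DIMS.get(identity, SPECIFIC_DIMS["friend"])
    lines = [f"Analyze the chat with {chat_name} along these 6 dimensions:"]
    lines += [f"{i}. {dim}" for i, dim in enumerate(dims, start=1)]
    lines.append("Write 1-2 sentences per dimension, each starting with '•'.")
    lines.append("Then give 2-3 actionable suggestions, each starting with '💡'.")
    return "\n".join(lines)
```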

Identity-Specific Signal Dictionaries

IDENTITY_SIGNAL_KEYWORDS in analyzer.py defines 5 signal dimensions per non-lover identity. Each signal has a keyword list + icon + color.

Identity	Signal 1	Signal 2	Signal 3	Signal 4	Signal 5
----------	----------	----------	----------	----------	----------
friend	轻松😄	分享📤	吐槽😤	关心🤗	邀约📅
colleague	专业📋	同步🔄	协作🤝	效率⚡	反馈💬
family	关心🤗	生活🏠	经济💰	叮嘱📢	团聚👨‍👩‍👧
business	专业📋	效率⚡	信任🤝	推进📈	风险⚠️
service	效率⚡	礼貌🙏	解决✅	投诉😟	满意👍

Lover uses the default _analyze_signals() with AFFECTION_WORDS/CARE_WORDS etc.

_compute_identity_signals(enriched, identity) returns the same format as _analyze_signals: {"signals": [...], "summary": "..."}. Returns None for lover/unknown (caller falls back to _analyze_signals).
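
A rough sketch of that contract follows. The signal names and icons come from the table above, but the keyword lists, colors, per-signal fields, summary wording, and the use of plain strings for messages are hypothetical placeholders; the real function works on the enriched message structure.

```python
from typing import List, Optional

# Hypothetical excerpt: two of the five friend signals with made-up keywords and colors.
IDENTITY_SIGNAL_KEYWORDS = {
    "friend": [
        {"name": "轻松", "icon": "😄", "color": "#f5c542", "keywords": ["哈哈", "笑死"]},
        {"name": "邀约", "icon": "📅", "color": "#4287f5", "keywords": ["一起", "约"]},
    ],
}


def compute_identity_signals(messages: List[str], identity: str) -> Optional[dict]:
    """Count keyword hits per signal; return None when the identity has no dictionary."""
    spec = IDENTITY_SIGNAL_KEYWORDS.get(identity)
    if spec is None:
        return None  # lover/unknown: the caller falls back to the default analysis

    signals = []
    for sig in spec:
        # A message counts once per signal, however many keywords it contains.
        count = sum(1 for msg in messages if any(k in msg for k in sig["keywords"]))
        signals.append({"name": sig["name"], "icon": sig["icon"],
                        "color": sig["color"], "count": count})

    top = max(signals, key=lambda s: s["count"])
    if top["count"]:
        summary = f"Strongest signal: {top['name']}{top['icon']} ({top['count']} messages)"
    else:
        summary = "No signal keywords matched"
    return {"signals": signals, "summary": summary}
```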

In analyze_private(), after identity detection, if identity != lover:

intimacy is replaced with the identity-specific index result
signals is replaced via _compute_identity_signals()

Frontend: the index card title is dynamic via the INDEX_TITLES[idKey] map in renderPrivateAnalysis(), which shows "友谊指数"/"同事指数"/"协作指数" etc. instead of the hardcoded "亲密度指数".
