抓取 x.com 用户推文,生成中英双语日报,发布到飞书云文档,并将链接推送到微信。
完整流程 / Full Workflow:
支持两种时间范围模式:
TIME_START / TIME_END 指定任意起止时间(ISO 8601 格式)当用户提出以下请求时使用本技能:
The scraping script requires a logged-in Chrome instance with DevTools Protocol
enabled. The user's normal Chrome cannot be reused directly (sandbox
restrictions); a temporary profile must be created.
```bash
pkill -9 -f "Google Chrome"
```
profile — it is tens of GB and will hang):
```bash
rm -rf /tmp/chrome-debug-profile
mkdir -p /tmp/chrome-debug-profile/Default
for f in "Cookies" "Cookies-journal" "Login Data" "Login Data-journal" \
"Network" "Preferences" "Web Data" "Web Data-journal"; do
src="$HOME/Library/Application Support/Google/Chrome/Default/$f"
[ -e "$src" ] && cp -r "$src" /tmp/chrome-debug-profile/Default/ 2>/dev/null
done
# Also copy top-level files
for f in "Local State" "Last Version"; do
src="$HOME/Library/Application Support/Google/Chrome/$f"
[ -e "$src" ] && cp "$src" /tmp/chrome-debug-profile/ 2>/dev/null
done
```
```bash
nohup /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
--remote-debugging-port=9222 \
--user-data-dir=/tmp/chrome-debug-profile \
--no-first-run \
--no-default-browser-check \
> /tmp/chrome-cdp.log 2>&1 &
sleep 6
```
```bash
curl -s --connect-timeout 5 http://127.0.0.1:9222/json/version
```
If this returns JSON with a Browser field, Chrome is ready.
All of the above MUST use dangerouslyDisableSandbox: true — macOS sandbox
blocks process management of system Chrome otherwise.
Run the bundled scraping script.
Default time range (前一天 00:00 ~ 23:59 北京时间):
cd <workspace> && \
TARGET_USERNAME="DeItaone" \
OUTPUT_DIR="<workspace>" \
NODE_OPTIONS="" \
NODE_PATH=<workspace_node_modules> \
<node_path> <skill_dir>/scripts/scrape_tweets.js
Custom time range (自定义时间范围):
cd <workspace> && \
TARGET_USERNAME="DeItaone" \
TIME_START="2026-05-24T09:00:00+08:00" \
TIME_END="2026-05-26T09:00:00+08:00" \
OUTPUT_DIR="<workspace>" \
NODE_OPTIONS="" \
NODE_PATH=<workspace_node_modules> \
<node_path> <skill_dir>/scripts/scrape_tweets.js
Environment variables / 环境变量:
| 变量 / Variable | 说明 / Description | 默认值 / Default |
|---|---|---|
| --- | --- | --- |
TARGET_USERNAME | x.com 用户名(不含 @) | DeItaone |
TIME_START | 时间窗口起点(ISO 8601),如 2026-05-24T09:00:00+08:00 | 自动计算(前一天 00:00 北京时间) |
TIME_END | 时间窗口终点(ISO 8601),如 2026-05-26T09:00:00+08:00 | 自动计算(前一天 23:59 北京时间) |
OUTPUT_DIR | tweets_raw.json 输出目录 | 当前工作目录 |
CDP_URL | Chrome DevTools Protocol 地址 | http://127.0.0.1:9222 |
时间格式说明 / Time Format:
2026-05-24T09:00:00+08:00(北京时间)、2026-05-24T01:00:00Z(UTC)The script:
If the script reports "Not logged in", the temp profile did not retain the
session. In that case, ask the user to log into x.com in the debug Chrome
window and re-run.
tweets_raw.json and translate each tweet into Chinese. Keep ticker tags ($NVDA, $TSLA) in the translation. Use financial-news terminology.
translations.json:```json
{
"ENGLISH PREFIX TEXT...": "中文翻译...",
...
}
```
Use the first 60–80 characters of each English tweet as the key.
```bash
TRANSLATIONS_PATH=
python3
--title "Title 日报" \
--author @username
```
references/feishu_format.md. It matches translations by longest prefix
match and inserts placeholders for any unmatched tweets.
(翻译待补充) placeholders. For anyremaining, manually add the translations by editing the markdown file.
Use lark-cli to create a Feishu cloud document from the markdown:
LARK_CLI="<path-to-lark-cli>"
NODE_OPTIONS="" "$LARK_CLI" docs +create \
--api-version v2 \
--as user \
--doc-format markdown \
--content "@<path-to-report.md>" \
--title "Title 日报"
--api-version v2 and --doc-format markdown.@ prefix on --content signals a file path.in Phase 5 to send to WeChat. Also display it to the user.
After the Feishu document is created, send the link to the user's WeChat via the
WorkBuddy Mini Program (微信小程序) using deliver_attachments.
```bash
cat >
# 📊 推文日报已生成
飞书文档链接 / Feishu Doc Link:
博主 / Author: @
时间范围 / Time Range:
推文数量 / Tweet Count:
---
点击上方链接查看完整日报 / Click the link above to view the full report.
EOF
```
deliver_attachments to push the summary to WeChat:```
deliver_attachments({
attachments: ["
explanation: "推送飞书日报链接到微信小程序"
})
```
> Note / 注意: This requires the user to have the "产物回传到小程序"
> (Deliver Artifacts to Mini Program) toggle enabled in WorkBuddy Mini Program
> connection settings.
pkill -f "chrome-debug-profile"tweets_raw.json, translations.json, report.md, and feishu_link.md files in the workspace are intermediate artifacts.
Keep them for traceability.
| Problem | Cause | Fix |
|---|---|---|
| --------- | ------- | ----- |
| CDP connection refused | Chrome not running with --remote-debugging-port | Re-launch Chrome per Phase 1 |
| "Not logged in" in script output | Temp profile missing cookies | Ask user to log in via the debug Chrome window |
| Full profile copy hangs | Chrome profile is 10–50 GB | Only copy the files listed in Phase 1 step 2 |
| xcancel.com pagination blocked | Anti-bot verification | Never use xcancel/Nitter — always use x.com via CDP |
lark-cli auth expired | Token TTL | Re-run — the CLI auto-refreshes |
scripts/scrape_tweets.js — Playwright CDP scraper for x.com timelinesscripts/format_for_feishu.py — Generates bilingual markdown from raw tweetsreferences/feishu_format.md — Document structure and Feishu CLI reference共 1 个版本