Overview

WeChat Article Reader

Extract full article content from mp.weixin.qq.com URLs.

When to Use

User shares a WeChat article link (mp.weixin.qq.com/s/xxx)
Need to read/summarize/analyze/archive a WeChat article
ContentPipe Scout node receives a WeChat URL for reference
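
A caller might first check whether a link is a WeChat article before invoking the skill. The sketch below is one way to do that with the standard library; the function name `is_wechat_article_url` is hypothetical, and accepting the long `/s?__biz=...` link form alongside `/s/xxx` is an assumption, not something the skill documents.

```python
from urllib.parse import urlparse


def is_wechat_article_url(url: str) -> bool:
    """Return True if url looks like an article link on mp.weixin.qq.com."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https"):
        return False
    if parsed.hostname != "mp.weixin.qq.com":
        return False
    # Short links look like /s/<id>; the long form (assumed here) is /s?__biz=...
    return parsed.path == "/s" or parsed.path.startswith("/s/")
```

A link that fails this check can be passed to a generic web reader instead.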

Quick Start

# First-time setup (installs headless Chromium ~200MB)
python3 SKILL_DIR/scripts/setup.py

# Extract article
python3 SKILL_DIR/scripts/fetch_article.py "https://mp.weixin.qq.com/s/xxx"

# Output: JSON with title, author, publish_time, content, word_count
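
To call the command-line script from another Python program, something like the following may work. It is a sketch that assumes the script prints only the JSON document on stdout and exits with a non-zero status on failure; neither is confirmed by the text above, and `run_fetch_cli` is a hypothetical helper name.

```python
import json
import subprocess
import sys


def run_fetch_cli(script_path: str, url: str, timeout: float = 60.0) -> dict:
    """Run the fetch script on url and parse the JSON it prints."""
    proc = subprocess.run(
        [sys.executable, script_path, url],
        capture_output=True,
        text=True,
        encoding="utf-8",  # article text is Chinese, so decode stdout as UTF-8
        timeout=timeout,
        check=True,  # raises CalledProcessError on a non-zero exit status
    )
    return json.loads(proc.stdout)
```

Here `script_path` would be the path to `fetch_article.py` under SKILL_DIR/scripts.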

How It Works

WeChat articles are JS-rendered, so a plain HTTP request returns only an empty shell. This skill uses Playwright with headless Chromium to:

1. Launch a headless browser with anti-detection flags
2. Navigate to the WeChat URL and wait for networkidle
3. Wait for #js_content (the article body container)
4. Extract the title (h1#activity-name), author, time, and body text
5. Convert the HTML to plain text (strip scripts/styles, compress whitespace)
6. Return structured JSON
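
The steps above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code. The function names, the single Chromium flag, the regex-based cleaner, and counting characters as `word_count` are all assumptions. Author and publish-time extraction are omitted because their selectors are not given here.

```python
import re
from html import unescape


def html_to_text(html: str) -> str:
    """Strip scripts, styles and tags, then compress whitespace."""
    # Drop <script> and <style> elements together with their contents.
    html = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", html)
    # Turn block-level boundaries into newlines so paragraphs survive.
    html = re.sub(r"(?i)<br\s*/?>|</(p|div|section|h[1-6]|li)\s*>", "\n", html)
    # Remove every remaining tag and decode entities such as &nbsp;.
    text = unescape(re.sub(r"(?s)<[^>]+>", "", html))
    # Compress runs of spaces, then collapse blank lines.
    text = re.sub(r"[ \t\u00a0]+", " ", text)
    text = re.sub(r"\s*\n\s*", "\n", text)
    return text.strip()


def fetch_with_playwright(url: str, timeout_ms: int = 30_000) -> dict:
    """Render the article in headless Chromium and return structured data."""
    # Imported lazily so html_to_text stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(
            headless=True,
            # One commonly used anti-detection flag (assumed, not the skill's list).
            args=["--disable-blink-features=AutomationControlled"],
        )
        try:
            page = browser.new_page()
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            page.wait_for_selector("#js_content", timeout=timeout_ms)
            title = page.locator("h1#activity-name").inner_text().strip()
            body_html = page.locator("#js_content").inner_html()
        finally:
            browser.close()

    content = html_to_text(body_html)
    return {
        "success": True,
        "title": title,
        "content": content,
        "word_count": len(content),  # character count, an assumption for Chinese text
        "source": "playwright",
        "url": url,
    }
```

A regex cleaner keeps the sketch dependency-free; a real implementation might prefer an HTML parser for robustness against malformed markup.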

Fallback: Mirror Search

If Playwright is unavailable, the skill searches Chinese content aggregators (53ai.com, 36kr.com, juejin.cn, woshipm.com) for mirror copies of the article.
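
How the skill queries these aggregators is not described. One plausible approach, shown purely as an illustration, is to build a site-restricted search query per aggregator from a known article title. The helper `build_mirror_queries` and the `site:` query syntax are assumptions.

```python
MIRROR_SITES = ("53ai.com", "36kr.com", "juejin.cn", "woshipm.com")


def build_mirror_queries(title: str, sites: tuple = MIRROR_SITES) -> list:
    """Build one site-restricted, exact-phrase search query per aggregator."""
    # Normalize internal whitespace so the quoted phrase matches cleanly.
    cleaned = " ".join(title.split())
    return [f'site:{site} "{cleaned}"' for site in sites]
```

Each query string could then be passed to whatever web search tool is available, and the first hit whose text matches the title treated as the mirror copy.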

Python API

from fetch_article import fetch_wechat_article

result = fetch_wechat_article("https://mp.weixin.qq.com/s/xxx")
# result = {
#   "success": True,
#   "title": "文章标题",
#   "author": "作者名",
#   "publish_time": "2026-03-10",
#   "content": "正文全文...",
#   "word_count": 2500,
#   "source": "playwright",  # or "mirror"
#   "url": "https://mp.weixin.qq.com/s/xxx"
# }
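
A caller will usually want to check `success` before using the other fields. The sketch below shows one defensive way to consume the result. The shape of a failed result is not documented, so the `error` key is a hypothetical placeholder, and `summarize_result` is an illustrative helper rather than part of the skill.

```python
def summarize_result(result: dict, preview_chars: int = 80) -> str:
    """Format a fetch result as a one-line header plus a content preview."""
    if not result.get("success"):
        # The failure shape is undocumented; "error" is a hypothetical key.
        return f"fetch failed: {result.get('error', 'unknown error')}"
    header = (
        f"{result.get('title', '')} | {result.get('author', '')} | "
        f"{result.get('publish_time', '')} | {result.get('word_count', 0)} words "
        f"via {result.get('source', '?')}"
    )
    preview = result.get("content", "")[:preview_chars]
    return f"{header}\n{preview}"
```

Checking `source` in the same way lets downstream code treat mirror copies with more caution than content fetched directly.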

Limitations

Requires a one-time Chromium install (python3 SKILL_DIR/scripts/setup.py)
First fetch takes ~5-10s (browser startup); subsequent fetches ~3-5s (browser reuse)
Cannot bypass WeChat login walls (paid content, follower-only articles)
Mirror fallback only works for popular/widely-shared articles

Version History

1 version in total

v1.0.0 (current)

2026-03-31 03:01 Safe

Security Scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)