
PKU Info Spider

WeChat Official Account article crawler (微信公众号爬虫) CLI tool built in Rust. Use this skill when working on the info-spider crate, debugging spider commands, and in similar scenarios.
wjsoj
Uncategorized · clawhub · v1.0.0 · 1 version · 100000 · Key: not required
★ 0 stars · 📥 426 downloads · 💾 0 installs · 1 version
#crawler #latest #pku #rust #wechat

Overview

Info-Spider - WeChat Official Account Crawler CLI

A CLI crawler for WeChat Official Account (公众号) articles via the MP backend.

Architecture

  • Crate location: crates/info-spider/
  • Auth flow: WeChat QR code login (completely separate from IAAA, does NOT use info-common)
  • API: mp.weixin.qq.com backend API
  • Config: ~/.config/info-spider/ (separate from info-common Store)
  • Flow docs: docs/wechat-mp-flow.md
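
The config location above can be resolved with the standard library alone. The following is a minimal sketch, assuming the directory is derived from the HOME environment variable; the function names config_dir_under and config_dir are hypothetical and are not taken from the crate.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Joins the documented config location (~/.config/info-spider/)
/// onto a given home directory. Hypothetical helper name.
fn config_dir_under(home: &Path) -> PathBuf {
    home.join(".config").join("info-spider")
}

/// Resolves the config directory from $HOME, or None if HOME is unset.
/// Assumption: the real crate may resolve this path differently.
fn config_dir() -> Option<PathBuf> {
    env::var_os("HOME").map(|home| config_dir_under(Path::new(&home)))
}
```

Keeping the join logic separate from the environment lookup makes the path rule testable without touching the real home directory.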

Key Source Files

  • src/main.rs — Entry point
  • src/cli.rs — Clap CLI definition
  • src/commands.rs — Command implementations
  • src/api.rs — WeChat MP API client
  • src/session.rs — Own session persistence (token, fingerprint, bizuin)
  • src/client.rs — reqwest client builders
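
To make the role of src/session.rs more concrete, here is a minimal sketch of a session record carrying the three fields listed above (token, fingerprint, bizuin). The struct layout, the method names, and the key=value text format are all assumptions made for illustration; the crate's actual on-disk format is not described here.

```rust
use std::collections::HashMap;

/// Hypothetical session record holding the three documented fields.
#[derive(Debug, Clone, PartialEq)]
struct Session {
    token: String,
    fingerprint: String,
    bizuin: String,
}

impl Session {
    /// Serializes to `key=value` lines (an assumed format, not the crate's).
    /// Values must not contain newlines for the round trip to hold.
    fn to_text(&self) -> String {
        format!(
            "token={}\nfingerprint={}\nbizuin={}\n",
            self.token, self.fingerprint, self.bizuin
        )
    }

    /// Parses `key=value` lines; returns None if any field is missing.
    fn from_text(text: &str) -> Option<Session> {
        // Split each line at the first '=' so values may contain '='.
        let fields: HashMap<&str, &str> = text
            .lines()
            .filter_map(|line| line.split_once('='))
            .collect();
        Some(Session {
            token: fields.get("token")?.to_string(),
            fingerprint: fields.get("fingerprint")?.to_string(),
            bizuin: fields.get("bizuin")?.to_string(),
        })
    }
}
```

Returning None for an incomplete record lets a status-style command treat a damaged session file the same way as a missing one.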

CLI Commands

Command          Function
---------------  -------------------------------------------------------
login            WeChat QR code scan login to mp.weixin.qq.com
logout / status  Session management
search           Find Official Accounts by name/ID (returns fakeid list)
articles         Fetch articles from an OA (--name or --fakeid)
scrape           Convert single article URL to Markdown

Articles Command Options

  • --begin — Start offset for pagination
  • --count — Articles per page
  • --limit — Maximum total articles to fetch
  • --delay-ms — Random delay between requests (anti-crawler)
  • --format {table|json|jsonl} — Output format
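
How --begin, --count, --limit, and --delay-ms could interact is sketched below. This is an illustration under stated assumptions, not the crate's implementation: it assumes --begin is an article offset, --limit caps the total across pages, the last page is trimmed to the remainder, and the delay is drawn from the range 0 to --delay-ms. Page, plan_pages, pick_delay, and clock_seed are hypothetical names.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// One page request: the offset to start at and how many articles to ask for.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Page {
    begin: usize,
    count: usize,
}

/// Plans page requests starting at `begin`, asking for `count` per page,
/// until `limit` articles are covered (assumed semantics).
fn plan_pages(begin: usize, count: usize, limit: usize) -> Vec<Page> {
    let mut pages = Vec::new();
    if count == 0 {
        return pages; // a zero page size would never make progress
    }
    let mut fetched = 0;
    while fetched < limit {
        let take = count.min(limit - fetched);
        pages.push(Page { begin: begin + fetched, count: take });
        fetched += take;
    }
    pages
}

/// Maps a random number onto a delay in the range 0..=max_ms milliseconds.
fn pick_delay(max_ms: u64, random: u64) -> Duration {
    Duration::from_millis(random % max_ms.saturating_add(1))
}

/// Crude seed from the clock; a real tool would more likely use an RNG crate.
fn clock_seed() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| u64::from(d.subsec_nanos()))
        .unwrap_or(0)
}
```

For example, plan_pages(0, 5, 12) yields three pages with offsets 0, 5, and 10, the last one asking for only 2 articles. A real crawler would also stop early when the backend returns fewer articles than requested, and would sleep for pick_delay(delay_ms, clock_seed()) between requests.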

Development Notes

  • Standalone auth: Uses its own WeChat QR login, NOT the IAAA flow from info-common
  • Own session.rs: Stores token, fingerprint, bizuin (different from info-common session format)
  • Mimics real user behavior with configurable delays to bypass risk controls
  • Article scraping extracts content to clean Markdown
  • Multiple output formats: table (default), JSON, JSONL
  • All user-facing strings in Chinese
  • Error handling: anyhow::Result with .context("中文描述")
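
The difference between the three output formats can be illustrated with a small std-only sketch. The Article fields, the tab-separated table layout, and the hand-written JSON escaping are assumptions for illustration; the crate itself may use a serialization library and a different record shape. Only the format names and the table default come from the notes above.

```rust
use std::str::FromStr;

/// Output formats named in the notes above; table is the documented default.
#[derive(Debug, PartialEq, Clone, Copy, Default)]
enum Format {
    #[default]
    Table,
    Json,
    Jsonl,
}

impl FromStr for Format {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "table" => Ok(Format::Table),
            "json" => Ok(Format::Json),
            "jsonl" => Ok(Format::Jsonl),
            other => Err(format!("unknown format: {}", other)),
        }
    }
}

/// Hypothetical article record; the real fields are not documented here.
struct Article {
    title: String,
    url: String,
}

/// Minimal JSON string escaping for quotes, backslashes, and control chars.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

fn to_json_object(a: &Article) -> String {
    format!(
        "{{\"title\":\"{}\",\"url\":\"{}\"}}",
        json_escape(&a.title),
        json_escape(&a.url)
    )
}

/// JSON is one array, JSONL is one object per line, table is tab-separated.
fn render(articles: &[Article], format: Format) -> String {
    let objects: Vec<String> = articles.iter().map(to_json_object).collect();
    match format {
        Format::Json => format!("[{}]", objects.join(",")),
        Format::Jsonl => objects.iter().map(|o| format!("{}\n", o)).collect(),
        Format::Table => {
            let mut out = String::from("TITLE\tURL\n");
            for a in articles {
                out.push_str(&format!("{}\t{}\n", a.title, a.url));
            }
            out
        }
    }
}
```

JSONL suits long crawls because each article can be written as soon as it is fetched, whereas a JSON array is only valid once the closing bracket has been emitted.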

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-03 08:55 Safe

Security Scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
