
Web scraping skill using Chrome + WebMCP

Web scraping using Chrome + WebMCP. Primary method for all web crawling tasks.
Author: sweihub
Category: Uncategorized
Source: clawhub
Version: v1.0.0 (1 version, #latest)
API key: not required
Stars: 0
Downloads: 1,293
Installs: 5

Overview

Spider — Web Scraping Tool

This is the default web scraping method, replacing older approaches like web_fetch.


Trigger Conditions

Use this skill when the user says:

| Keywords | Action |
|----------|--------|
| 抓取 / crawl / scrape / fetch | Use Chrome + WebMCP to scrape web pages |
| 采集 (collect) | Same as above |
| 获取...新闻 (get ... news) | Scrape news pages |
| 从...网站 (from ... website) | Specify website to scrape |
| 同花顺 (Tonghuashun) | Scrape Tonghuashun (10jqka) data |
| 东方财富 (East Money) | Scrape East Money data |
| 雪球 (Xueqiu) | Scrape Xueqiu data |
| 百度 (Baidu) | Search or scrape Baidu content |
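
As a rough illustration of how the trigger keywords might be matched, here is a minimal sketch. The regular expressions and the `match_triggers` helper are assumptions made for illustration; the skill itself does not specify how matching is implemented.

```python
import re

# Each rule pairs a regex with the action text from the trigger table.
# The regexes are an assumption about how matching might work.
TRIGGER_RULES = [
    (r"抓取|采集|crawl|scrape|fetch", "Use Chrome + WebMCP to scrape web pages"),
    (r"获取.*新闻", "Scrape news pages"),
    (r"从.*网站", "Specify website to scrape"),
    (r"同花顺", "Scrape Tonghuashun (10jqka) data"),
    (r"东方财富", "Scrape East Money data"),
    (r"雪球", "Scrape Xueqiu data"),
    (r"百度", "Search or scrape Baidu content"),
]


def match_triggers(user_input: str) -> list[str]:
    """Return every action whose keyword pattern occurs in the input."""
    return [
        action
        for pattern, action in TRIGGER_RULES
        if re.search(pattern, user_input, re.IGNORECASE)
    ]


print(match_triggers("从同花顺抓取xxx"))
```

Returning all matching actions, rather than only the first, avoids inventing a priority order that the table does not state.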

Usage Examples

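| User Input | Execution |
|------------|-----------|
| "抓取光库科技的新闻" (scrape news about 光库科技) | Open Tonghuashun in Chrome, extract news |
| "抓取宁德时代的股吧" (scrape the 宁德时代 / CATL stock forum) | Open East Money guba in Chrome |
| "从同花顺抓取xxx" (scrape xxx from Tonghuashun) | Open Tonghuashun page in Chrome |
| "search xxx" | Open Google search in Chrome |
| "查一下xxx" (look up xxx) | Search or scrape in Chrome |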

Operation Flow

1. Check Chrome Status

{ action: "status" }

If not running, start it:

{ action: "start" }

2. Open Target Page

{ action: "open", targetUrl: "https://stockpage.10jqka.com.cn/300620/news/", target: "host" }

3. Get Page Snapshot

{ action: "snapshot", targetId: "xxx", maxChars: 20000 }

4. Page Interaction (click, type, etc.)

{ action: "act", targetId: "xxx", request: { kind: "click", ref: "e33" } }

5. Cleanup: Return to about:blank

{ action: "navigate", targetId: "xxx", url: "about:blank" }
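
The five steps above can be assembled into an ordered list of action payloads. The sketch below only builds the payloads as Python dictionaries; it does not send them anywhere. The `build_scrape_plan` helper is hypothetical, and the `target_id` value is a placeholder, since the source shows it only as "xxx" (presumably it comes from the response to the open step).

```python
import json


def build_scrape_plan(url, target_id="xxx", max_chars=20000, chrome_running=True):
    """Build the status/start/open/snapshot/cleanup payloads in order."""
    plan = [{"action": "status"}]
    if not chrome_running:
        # Step 1: start Chrome only when the status check says it is not running.
        plan.append({"action": "start"})
    plan.append({"action": "open", "targetUrl": url, "target": "host"})
    plan.append({"action": "snapshot", "targetId": target_id, "maxChars": max_chars})
    # Step 5: always finish by returning the tab to about:blank.
    plan.append({"action": "navigate", "targetId": target_id, "url": "about:blank"})
    return plan


for step in build_scrape_plan("https://stockpage.10jqka.com.cn/300620/news/"):
    print(json.dumps(step))
```

Step 4 (page interaction) is left out because it depends on element refs found in the snapshot.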

Common Website Templates

Tonghuashun Stock News

URL: https://stockpage.10jqka.com.cn/{stock_code}/news/
Example: https://stockpage.10jqka.com.cn/300620/news/

East Money Guba (Stock Forum)

URL: https://guba.eastmoney.com/list,{stock_code}.html
Example: https://guba.eastmoney.com/list,300620.html

Xueqiu (Snowball)

URL: https://xueqiu.com/S/SZ{stock_code} (the SZ prefix is for Shenzhen-listed stocks; Shanghai-listed stocks use SH)
Example: https://xueqiu.com/S/SZ300620

Baidu News Search

URL: https://www.baidu.com/s?wd={keyword}&tn=news
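
The templates above can be filled in programmatically. This is a minimal sketch: the dictionary keys and the `build_url` helper are hypothetical names, and percent-encoding the Baidu keyword with `urllib.parse.quote` is an added assumption about how non-ASCII keywords should be passed.

```python
from urllib.parse import quote

# URL templates copied from the section above; the keys are illustrative names.
TEMPLATES = {
    "tonghuashun_news": "https://stockpage.10jqka.com.cn/{stock_code}/news/",
    "eastmoney_guba": "https://guba.eastmoney.com/list,{stock_code}.html",
    "xueqiu": "https://xueqiu.com/S/SZ{stock_code}",
    "baidu_news": "https://www.baidu.com/s?wd={keyword}&tn=news",
}


def build_url(site: str, value: str) -> str:
    """Fill a template with a stock code, or a URL-encoded keyword for Baidu."""
    template = TEMPLATES[site]
    if site == "baidu_news":
        return template.format(keyword=quote(value))
    return template.format(stock_code=value)


print(build_url("tonghuashun_news", "300620"))
print(build_url("baidu_news", "宁德时代"))
```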

Chrome Setup (One-time)

  1. Open Chrome Flags:
    • chrome://flags/#enable-experimental-web-platform-features → Enabled
    • chrome://flags/#enable-webmcp-testing → Enabled
  2. Fully quit Chrome (Cmd+Q on macOS) and restart it

Important Rules

  1. Use target="host" instead of "sandbox"
  2. Must cleanup after each task:
    • If multiple tabs exist, keep only one, close others
    • The remaining tab must navigate to about:blank
    • If multiple about:blank tabs exist, keep only the latest one, close others
    • Use the browser "tabs" action to check the current tab status
    • After cleanup, ensure only one about:blank tab remains
  3. Reuse existing tabs, avoid opening new tabs frequently
  4. Handle anti-scraping sites: Tonghuashun and East Money require JavaScript to finish loading before content is extracted
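
The cleanup rule (rule 2) can be expressed as a small planning function. This is a sketch under stated assumptions: the tab list shape (`targetId`, `url`, ordered oldest to newest) and the `close` action name are hypothetical and do not appear in the source; only the `navigate` payload matches the documented format.

```python
def plan_cleanup(tabs):
    """Return the actions needed to end with exactly one about:blank tab.

    `tabs` is assumed to be a list of {"targetId", "url"} dicts, oldest first.
    """
    if not tabs:
        return []
    blank_tabs = [t for t in tabs if t["url"] == "about:blank"]
    # Prefer the latest about:blank tab; otherwise keep the newest tab.
    keeper = blank_tabs[-1] if blank_tabs else tabs[-1]
    actions = [
        {"action": "close", "targetId": t["targetId"]}  # "close" is a hypothetical action name
        for t in tabs
        if t["targetId"] != keeper["targetId"]
    ]
    if keeper["url"] != "about:blank":
        actions.append(
            {"action": "navigate", "targetId": keeper["targetId"], "url": "about:blank"}
        )
    return actions


tabs = [
    {"targetId": "a", "url": "https://guba.eastmoney.com/list,300620.html"},
    {"targetId": "b", "url": "about:blank"},
]
print(plan_cleanup(tabs))
```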

Error Handling

| Error | Solution |
|-------|----------|
| Sandbox unavailable | Use target="host" |
| Slow page load | Wait for the snapshot to return before interacting with the page |
| Content extraction failed | Increase the snapshot's maxChars to get more content |
| Anti-scraping blocked | Try other finance sites, or wait and retry |
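
The "wait and retry" and "increase maxChars" remedies can be combined in one retry loop. This is only a sketch: `take_snapshot` and `is_complete` are hypothetical caller-supplied callables standing in for the real snapshot call and a content check, and doubling `max_chars` on each attempt is an illustrative choice, not something the skill prescribes.

```python
import time


def snapshot_with_retry(take_snapshot, is_complete, max_chars=20000,
                        attempts=3, delay_s=2.0, sleep=time.sleep):
    """Retry a snapshot, waiting and doubling max_chars between attempts."""
    text = ""
    for attempt in range(attempts):
        text = take_snapshot(max_chars)
        if is_complete(text):
            return text
        if attempt < attempts - 1:
            sleep(delay_s)   # give the page (or the rate limiter) time to recover
            max_chars *= 2   # ask for more content on the next try
    return text  # best effort after the final attempt
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays.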

Default Scraping Priority

  1. Spider (Chrome + WebMCP) ← Primary method
    • Suitable for: Finance websites, stock news, forums
    • Advantages: Full JavaScript rendering, interactive
  2. web_fetch ← Backup method
    • Suitable for: Simple static pages
    • Disadvantage: Cannot handle JavaScript-rendered pages
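
The priority order above amounts to a fallback chain: try Spider first, then web_fetch. In the sketch below, `scrape_with_fallback` and the two stand-in functions are hypothetical; real implementations of the Spider and web_fetch calls would be supplied by the caller.

```python
def scrape_with_fallback(url, methods):
    """Try each (name, function) pair in priority order; return the first success."""
    errors = []
    for name, fn in methods:
        try:
            return name, fn(url)
        except Exception as exc:  # record the failure and move to the next method
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all scraping methods failed: " + "; ".join(errors))


# Stand-in functions for illustration only.
def fake_spider(url):
    raise TimeoutError("Chrome not reachable")


def fake_web_fetch(url):
    return "<html>static page</html>"


print(scrape_with_fallback(
    "https://example.com/",
    [("spider", fake_spider), ("web_fetch", fake_web_fetch)],
))
```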

Version History

1 version in total

  • v1.0.0 (current)
    2026-04-30 16:54 (safe, safe)

Security Check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
