Scrapling Web Scraping

Zero-bot-detection web scraping for OpenClaw. Bypass Cloudflare, handle JavaScript-heavy sites, and adapt to website changes automatically. Use when you need...
Author: zhengxinjipai
Category: Developer Tools · Registry: clawhub · Version: v1.0.0 (1 version, #latest) · API key: not required
Stars: 3 · Downloads: 4,675 · Installs: 48

Quick Start

# Install Scrapling
pip install "scrapling[all]"
scrapling install

# Basic usage
python3 /root/.openclaw/skills/scrapling-web-scraping/scrapling_tool.py https://example.com

# Bypass Cloudflare
python3 /root/.openclaw/skills/scrapling-web-scraping/scrapling_tool.py https://protected-site.com --mode stealth --cloudflare

# Extract specific data
python3 /root/.openclaw/skills/scrapling-web-scraping/scrapling_tool.py https://example.com --selector ".product-title"

# JavaScript-heavy sites
python3 /root/.openclaw/skills/scrapling-web-scraping/scrapling_tool.py https://spa-app.com --mode dynamic --wait ".content-loaded"

Usage with OpenClaw

Natural Language Commands

Basic scraping:

> "Use Scrapling to scrape the title and all links from https://example.com"

Bypass protection:

> "Scrape https://protected-site.com in stealth mode and bypass Cloudflare"

Extract data:

> "Scrape the product names and prices from https://shop.com; the CSS selector is .product"

Dynamic content:

> "Scrape https://spa-app.com and wait for the .data-loaded element to finish loading"

Python Code

# Basic scraping
from scrapling.fetchers import Fetcher
page = Fetcher.get('https://example.com')
title = page.css('title::text').get()

# Bypass Cloudflare
from scrapling.fetchers import StealthyFetcher
page = StealthyFetcher.fetch('https://protected.com', 
                              headless=True, 
                              solve_cloudflare=True)

# JavaScript sites
from scrapling.fetchers import DynamicFetcher
page = DynamicFetcher.fetch('https://spa-app.com', 
                             headless=True, 
                             network_idle=True)

Features

| Feature | Command | Description |
|---------|---------|-------------|
| Basic Scrape | --mode basic | Fast HTTP requests |
| Stealth Mode | --mode stealth | Bypass Cloudflare/anti-bot |
| Dynamic Mode | --mode dynamic | Handle JavaScript sites |
| CSS Selectors | --selector ".class" | Extract specific elements |
| JSON Output | --json | Machine-readable output |
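
The three modes in the table line up with the three fetcher classes used in the Python Code section. Below is a rough sketch of how a wrapper might route a mode name to the matching fetcher. `build_fetch_plan` and `fetch` are hypothetical names, not part of Scrapling or of scrapling_tool.py, and `wait_selector` is assumed to be the keyword the browser-based fetcher accepts for a CSS selector. Flags that do not apply to the chosen mode are simply ignored here.

```python
from importlib import import_module

# Mode name -> (fetcher class, method), following the "Python Code" section.
FETCHERS = {
    "basic": ("Fetcher", "get"),
    "stealth": ("StealthyFetcher", "fetch"),
    "dynamic": ("DynamicFetcher", "fetch"),
}


def build_fetch_plan(mode="basic", cloudflare=False, wait=None):
    """Return (class_name, method_name, kwargs) for one of the documented modes."""
    if mode not in FETCHERS:
        raise ValueError(f"unknown mode: {mode!r}")

    class_name, method_name = FETCHERS[mode]
    kwargs = {}
    if mode in ("stealth", "dynamic"):
        kwargs["headless"] = True
    if cloudflare and mode == "stealth":
        kwargs["solve_cloudflare"] = True
    if wait and mode == "dynamic":
        # Assumption: the fetcher takes the CSS selector as `wait_selector`.
        kwargs["wait_selector"] = wait
    return class_name, method_name, kwargs


def fetch(url, mode="basic", cloudflare=False, wait=None):
    """Fetch `url` with the fetcher that matches `mode` (hypothetical helper)."""
    class_name, method_name, kwargs = build_fetch_plan(mode, cloudflare, wait)
    # Imported lazily so the planning logic works without Scrapling installed.
    fetchers = import_module("scrapling.fetchers")
    method = getattr(getattr(fetchers, class_name), method_name)
    return method(url, **kwargs)
```

Keeping the plan separate from the network call means the mode logic can be checked without launching a browser.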

Examples

1. Scrape with CSS Selector

python3 scrapling_tool.py https://quotes.toscrape.com --selector ".quote .text" --json

2. Bypass Cloudflare

python3 scrapling_tool.py https://nopecha.com/demo/cloudflare --mode stealth --cloudflare

3. Wait for Dynamic Content

python3 scrapling_tool.py https://spa-app.com --mode dynamic --wait ".loaded" --json
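
The same commands can be driven from another Python program. The sketch below builds the argument list from the documented flags and decodes the result. It assumes that `--json` prints a single JSON document to standard output, which the source does not specify. `build_command` and `run_tool` are hypothetical helper names.

```python
import json
import subprocess
import sys

TOOL = "scrapling_tool.py"  # relative path, as in the examples above


def build_command(url, mode="basic", selector=None, cloudflare=False, wait=None):
    """Build the argv list for scrapling_tool.py using only documented flags."""
    cmd = [sys.executable, TOOL, url, "--mode", mode, "--json"]
    if selector:
        cmd += ["--selector", selector]
    if cloudflare:
        cmd.append("--cloudflare")
    if wait:
        cmd += ["--wait", wait]
    return cmd


def run_tool(url, **options):
    """Run the tool and parse stdout as JSON (assumed output format)."""
    result = subprocess.run(
        build_command(url, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

For instance, `run_tool("https://quotes.toscrape.com", selector=".quote .text")` would mirror the first example.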

CLI Reference

python3 scrapling_tool.py URL [options]

Options:
  --mode {basic,stealth,dynamic}  Scraping mode (default: basic)
  --selector, -s CSS_SELECTOR     Extract specific elements
  --cloudflare                    Solve Cloudflare (stealth mode only)
  --wait SELECTOR                 Wait for element (dynamic mode only)
  --json, -j                      Output as JSON
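
For readers writing their own wrapper, the options above could be declared with `argparse` roughly as follows. This is a hypothetical reconstruction from the reference text, not the actual source of scrapling_tool.py. Rejecting `--cloudflare` and `--wait` outside their modes is an assumption about what "only" means.

```python
import argparse


def build_parser():
    """Declare the options listed in the CLI reference."""
    parser = argparse.ArgumentParser(prog="scrapling_tool.py")
    parser.add_argument("url", metavar="URL", help="Page to scrape")
    parser.add_argument("--mode", choices=["basic", "stealth", "dynamic"],
                        default="basic", help="Scraping mode (default: basic)")
    parser.add_argument("--selector", "-s", metavar="CSS_SELECTOR",
                        help="Extract specific elements")
    parser.add_argument("--cloudflare", action="store_true",
                        help="Solve Cloudflare (stealth mode only)")
    parser.add_argument("--wait", metavar="SELECTOR",
                        help="Wait for element (dynamic mode only)")
    parser.add_argument("--json", "-j", action="store_true",
                        help="Output as JSON")
    return parser


def parse_args(argv=None):
    """Parse argv and reject flags used with the wrong mode."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.cloudflare and args.mode != "stealth":
        parser.error("--cloudflare requires --mode stealth")
    if args.wait and args.mode != "dynamic":
        parser.error("--wait requires --mode dynamic")
    return args
```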

Advanced: Custom Scripts

Create custom scraping scripts in /root/.openclaw/skills/scrapling-web-scraping/:

from scrapling.fetchers import StealthyFetcher

# Your custom scraper
def scrape_products(url):
    page = StealthyFetcher.fetch(url, headless=True)
    products = []
    for item in page.css('.product'):
        products.append({
            'name': item.css('.name::text').get(),
            'price': item.css('.price::text').get(),
            'link': item.css('a::attr(href)').get()
        })
    return products
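
The records returned by `scrape_products` may contain stray whitespace, missing fields, or relative links. A possible post-processing step is sketched below. It assumes an unmatched selector yields `None`, and `normalize_product` and `save_products` are hypothetical helpers, not part of the skill.

```python
import json
from urllib.parse import urljoin


def normalize_product(raw, base_url):
    """Strip whitespace, map empty values to None, and make the link absolute."""
    def clean(value):
        if isinstance(value, str) and value.strip():
            return value.strip()
        return None

    link = clean(raw.get("link"))
    return {
        "name": clean(raw.get("name")),
        "price": clean(raw.get("price")),
        "link": urljoin(base_url, link) if link else None,
    }


def save_products(products, base_url, path):
    """Write the normalized products to `path` as UTF-8 JSON and return them."""
    cleaned = [normalize_product(product, base_url) for product in products]
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(cleaned, handle, ensure_ascii=False, indent=2)
    return cleaned
```

A call such as `save_products(scrape_products(url), url, "products.json")` would chain the two steps.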

Notes

  • Requires Python 3.10+
  • Before first use, run scrapling install to download the browsers
  • Respect website Terms of Service
  • Use responsibly
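
Two of these notes can be checked in code before a scrape starts. The sketch below tests the interpreter version and consults a site's robots.txt rules using only the standard library. robots.txt is not the same thing as a site's Terms of Service, so this is one machine-readable signal and not a complete compliance check. Both function names are hypothetical.

```python
import sys
from urllib.robotparser import RobotFileParser


def python_is_supported(version_info=sys.version_info):
    """Return True when the interpreter meets the Python 3.10+ requirement."""
    return tuple(version_info[:2]) >= (3, 10)


def allowed_by_robots(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

The robots.txt text is passed in as a string, so the caller decides how to download it and how often to refresh it.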

Created: 2026-03-05 by 老二

Source: https://github.com/D4Vinci/Scrapling

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-29 17:01, security status: safe

Security Checks

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
