Product Pricing Scraper

Scrape and compare product prices across e-commerce sites. Use when the user needs to extract pricing data, track price changes, or compare prices across retailers.
Author: terrycarter1985
Uncategorized · clawhub · v1.1.0 (#latest) · 1 version · 100000 · Key: not required
★ 0 stars · 📥 367 downloads · 💾 0 installs

Overview

Scrape, normalize, and compare product prices across e-commerce sites.

Quick Start

"Compare prices for [product name] across Amazon, eBay, and Walmart"
"Scrape pricing from [URL]"
"Track price changes for [product]"

Workflow

  1. Identify target — accept product name (search across sites) or specific URL(s)
  2. Scrape — use scripts/scrape_prices.py for each target
  3. Normalize — strip currency symbols, unify units, convert currencies if needed
  4. Compare — rank by price, flag outliers, compute average/median
  5. Output — JSON or CSV via --format flag
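
The Compare step (rank by price, flag outliers, compute average and median) could look roughly like the sketch below. The function name `compare` and the `outlier_ratio` threshold are assumptions made for illustration; the source does not say how scripts/scrape_prices.py decides what counts as an outlier.

```python
from statistics import mean, median


def compare(results, outlier_ratio=0.5):
    """Rank results by price and flag outliers relative to the median.

    `outlier_ratio` is an assumed threshold: a price is flagged when it
    differs from the median by more than this fraction of the median.
    """
    # Skip entries where no price could be extracted.
    priced = [r for r in results if r.get("price") is not None]
    if not priced:
        return {"ranked": [], "summary": None}

    mid = median(r["price"] for r in priced)
    # Build new dicts so the caller's input is not mutated.
    ranked = [
        {**r, "outlier": abs(r["price"] - mid) > outlier_ratio * mid}
        for r in sorted(priced, key=lambda r: r["price"])
    ]
    prices = [r["price"] for r in ranked]
    summary = {
        "lowest": {"source": ranked[0]["source"], "price": prices[0]},
        "highest": {"source": ranked[-1]["source"], "price": prices[-1]},
        "average": round(mean(prices), 2),
        "median": round(mid, 2),
    }
    return {"ranked": ranked, "summary": summary}


if __name__ == "__main__":
    sample = [
        {"source": "walmart", "price": 34.99},
        {"source": "amazon", "price": 29.99},
        {"source": "ebay", "price": 31.99},
    ]
    print(compare(sample)["summary"])
```

A median-based rule is used here because a single mispriced listing distorts the average far more than the median; other rules would be equally reasonable.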

Scraping Strategies

Static pages (most product listings)

Use web_fetch + cheerio-style parsing. No browser needed.
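
As a rough illustration of static-page extraction, the sketch below pulls the text of price elements out of already-fetched HTML. It uses Python's standard-library `html.parser` as a stand-in for cheerio-style parsing, and the `price` class name is a hypothetical selector; the real site-specific selectors live in references/selectors.md.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; ignored when tracking depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PriceTextExtractor(HTMLParser):
    """Collect the text inside elements that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self._depth = 0  # greater than 0 while inside a matching element
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1  # nested element inside a match
        elif self.css_class in classes:
            self._depth = 1
            self.matches.append("")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.matches[-1] += data


def extract_price_texts(html, css_class="price"):
    """Return the stripped text of every element with `css_class`."""
    parser = PriceTextExtractor(css_class)
    parser.feed(html)
    parser.close()
    return [m.strip() for m in parser.matches if m.strip()]


if __name__ == "__main__":
    page = '<div><h2>Widget</h2><span class="price now">$29.99</span></div>'
    print(extract_price_texts(page))
```

The function returns raw strings such as "$29.99"; turning them into numbers is left to the normalization step.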

Dynamic pages (JS-rendered prices)

Use the browser tool with snapshot + act to navigate and extract.

Anti-bot considerations

  • Randomize delays (2-5s between requests)
  • Rotate User-Agent strings
  • Respect robots.txt — check before scraping
  • Limit concurrent requests per domain
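
A minimal sketch of the first three points, using only the Python standard library. The helper names and the User-Agent strings are placeholders invented for this example, and per-domain concurrency limiting is not covered here.

```python
import random
import time
from urllib.robotparser import RobotFileParser

# Placeholder User-Agent strings; substitute values appropriate to your use.
USER_AGENTS = [
    "ExampleScraper/1.0 (+https://example.com/bot)",
    "ExampleScraper/1.1 (+https://example.com/bot)",
]


def polite_delay(min_s=2.0, max_s=5.0, sleep=time.sleep):
    """Sleep for a random 2-5 s interval and return the chosen delay."""
    delay = random.uniform(min_s, max_s)
    sleep(delay)
    return delay


def pick_user_agent(agents=USER_AGENTS):
    """Choose a User-Agent string at random for the next request."""
    return random.choice(agents)


def is_allowed(robots_txt, user_agent, url):
    """Check a URL against the text of an already-fetched robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\n"
    print(is_allowed(robots, pick_user_agent(), "https://example.com/product/1"))
```

Passing the robots.txt body in as text keeps the check independent of how the file was fetched; `RobotFileParser` can also fetch it itself via `set_url()` and `read()`.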

Price Normalization

  • Strip currency symbols and whitespace
  • Convert per-unit pricing (e.g., "$2.99/oz" → unit price)
  • Handle price ranges (take lowest)
  • Flag "was/now" discounted prices — record both
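
The rules above might be implemented along these lines. This is a sketch under several assumptions: prices use US-style separators ("1,299.00"), discounts are marked with the literal words "was" and "now", and the `unit` key is a hypothetical addition that is not part of the output schema. Currency conversion is not handled.

```python
import re

# A number with optional thousands separators and decimals, e.g. 1,299.00
_NUM = r"\d[\d,]*(?:\.\d+)?"


def _to_float(text):
    return float(text.replace(",", ""))


def normalize_price(raw):
    """Parse a raw price string into price, unit_price and original_price."""
    text = " ".join(raw.split())  # collapse and trim whitespace
    out = {"price": None, "unit_price": None, "unit": None,
           "original_price": None}

    # Per-unit pricing such as "$2.99/oz".
    per_unit = re.search(rf"({_NUM})\s*/\s*([A-Za-z]+)", text)
    if per_unit:
        out["unit_price"] = _to_float(per_unit.group(1))
        out["unit"] = per_unit.group(2).lower()
        return out

    # Discounted prices such as "Was $39.99 Now $29.99": record both.
    was = re.search(rf"was\D*({_NUM})", text, re.IGNORECASE)
    now = re.search(rf"now\D*({_NUM})", text, re.IGNORECASE)
    if was and now:
        out["original_price"] = _to_float(was.group(1))
        out["price"] = _to_float(now.group(1))
        return out

    # Single prices and ranges such as "$10.00 - $15.00": take the lowest.
    numbers = [_to_float(n) for n in re.findall(_NUM, text)]
    if numbers:
        out["price"] = min(numbers)
    return out


if __name__ == "__main__":
    for raw in ["$29.99", "$2.99/oz", "$10.00 - $15.00", "Was $39.99 Now $29.99"]:
        print(raw, "->", normalize_price(raw))
```

Symbols are ignored rather than interpreted, so the currency code has to come from elsewhere, for example from the site being scraped.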

Output Schema

{
  "query": "product name or url",
  "scraped_at": "ISO-8601",
  "results": [
    {
      "source": "amazon",
      "url": "https://...",
      "title": "Product Title",
      "price": 29.99,
      "currency": "USD",
      "unit_price": null,
      "original_price": 39.99,
      "in_stock": true,
      "rating": 4.5,
      "review_count": 1234
    }
  ],
  "summary": {
    "lowest": { "source": "amazon", "price": 29.99 },
    "highest": { "source": "walmart", "price": 34.99 },
    "average": 32.49,
    "median": 31.99
  }
}
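
For the Output step, a report in this shape can be serialized to either format with the standard library. The `render` function and the CSV column order are assumptions; how scripts/scrape_prices.py actually wires its --format flag is not shown in the source.

```python
import csv
import io
import json

# Column order for CSV output, taken from the result fields in the schema.
FIELDS = [
    "source", "url", "title", "price", "currency", "unit_price",
    "original_price", "in_stock", "rating", "review_count",
]


def render(report, fmt="json"):
    """Serialize a report dict as JSON, or its results list as CSV."""
    if fmt == "json":
        return json.dumps(report, indent=2)
    if fmt == "csv":
        buf = io.StringIO()
        # Unknown keys are ignored; None values become empty cells.
        writer = csv.DictWriter(buf, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(report["results"])
        return buf.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


if __name__ == "__main__":
    report = {
        "query": "example product",
        "scraped_at": "2026-05-07T05:02:00Z",
        "results": [{"source": "amazon", "url": "https://example.com/item",
                     "title": "Product Title", "price": 29.99,
                     "currency": "USD", "unit_price": None,
                     "original_price": 39.99, "in_stock": True,
                     "rating": 4.5, "review_count": 1234}],
    }
    print(render(report, "csv"))
```

CSV output carries only the per-result rows; the `query`, `scraped_at`, and `summary` fields have no natural place in a flat table and are dropped in this sketch.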

Scripts

  • scripts/scrape_prices.py — main scraper: accepts product name or URL list, outputs structured JSON
  • See references/selectors.md for site-specific CSS selectors

Ethics & Compliance

  • Respect robots.txt and rate limits
  • Do not scrape behind login walls without explicit permission
  • Do not collect personal data
  • Check site ToS before bulk scraping

Version History

1 version in total

  • v1.1.0 (current)
    2026-05-07 05:02, scan status: safe

Security Scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
