Brightdata

Google search results and web page scraping via Bright Data APIs. Use when the agent needs structured search results, paginated SERP retrieval, or clean mark...
Author: zhao-weijie
Uncategorized clawhub v1.0.0 1 version 100000 Key: required
★ 1 star · 📥 422 downloads · 💾 0 installs · 1 version · #latest

Overview

Bright Data — Web Search & Scraping

Two tools: search (Google SERP results as JSON) and scrape (any URL to clean markdown). Both bypass bot detection and CAPTCHAs.

Tools

search.sh — Google Search

bash scripts/search.sh "<query>" [cursor]
  • query: Search terms (required).
  • cursor: Page number, 0-indexed (optional, default 0). Each page returns ~10 results.

Returns JSON with an organic array:

{
  "organic": [
    {"link": "https://...", "title": "...", "description": "..."}
  ]
}
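
A minimal sketch of reading that response from Python. It assumes search.sh prints the JSON document to stdout and exits non-zero on failure; neither is stated above. `parse_organic` and `run_search` are hypothetical helper names, not part of the skill.

```python
import json
import subprocess
from typing import List


def parse_organic(raw: str) -> List[dict]:
    """Return the entries of the "organic" array, or [] if it is absent."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        return []
    organic = data.get("organic", [])
    # Keep only entries that carry a link, since the link is what gets scraped later.
    return [item for item in organic if isinstance(item, dict) and item.get("link")]


def run_search(query: str, cursor: int = 0) -> List[dict]:
    """Call scripts/search.sh and parse its output (assumes JSON on stdout)."""
    proc = subprocess.run(
        ["bash", "scripts/search.sh", query, str(cursor)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return parse_organic(proc.stdout)
```

Keeping the parsing separate from the subprocess call means the parsing logic can be exercised without spending an API call.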

scrape.sh — Web Page Scraping

bash scripts/scrape.sh "<url>"
  • url: Any public URL (required).

Returns the page content as clean markdown.
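
A hedged sketch of calling the scraper from Python. It assumes the markdown arrives on stdout and that a failure shows up as a non-zero exit status or empty output. The `scrape` wrapper, its `script` parameter, and the 60-second timeout are illustrative choices, not documented behavior.

```python
import subprocess
from typing import Optional


def scrape(
    url: str,
    script: str = "scripts/scrape.sh",
    timeout_s: float = 60.0,  # assumed limit; the skill does not document one
) -> Optional[str]:
    """Return the page as markdown, or None if the script fails or prints nothing."""
    try:
        proc = subprocess.run(
            ["bash", script, url],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return None
    markdown = proc.stdout.strip()
    if proc.returncode != 0 or not markdown:
        return None
    return markdown
```

Returning `None` instead of raising lets a caller skip a bad URL and continue with the next one.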

Agent Strategy Guide

When to search

  • You need to discover URLs, find information across the web, or locate specific pages.
  • Always craft specific, targeted queries. Include key terms, dates, or domain constraints. Vague single-word queries waste API calls and return noise.
  • Good: "site:github.com openai whisper python library"
  • Bad: "AI"
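
One way to keep queries specific is to build them from parts. The sketch below is illustrative only: `build_query` is a hypothetical helper, and the rule that a query needs at least two terms or a `site:` constraint is an assumed threshold, not something the skill enforces.

```python
from typing import Optional, Sequence


def build_query(terms: Sequence[str], site: Optional[str] = None) -> str:
    """Join search terms and optionally restrict results to one domain."""
    cleaned = [t.strip() for t in terms if t.strip()]
    if len(cleaned) < 2 and site is None:
        # Assumed threshold: a lone keyword such as "AI" is treated as too vague.
        raise ValueError("query is too vague; add more terms or a site constraint")
    prefix = [f"site:{site}"] if site else []
    return " ".join(prefix + cleaned)
```

For example, `build_query(["openai", "whisper", "python", "library"], site="github.com")` produces the "Good" query shown above.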

When to scrape

  • You already have a URL and need its full content (not just the snippet from search).
  • The search result snippet is insufficient to answer the user's question.
  • You need to extract structured data, read documentation, or parse a specific page.

When to paginate

  • The current page of search results has relevant hits but you need more. Increment cursor (0 → 1 → 2 → ...).
  • Stop paginating when results become irrelevant to the query or you have enough information to proceed.
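
The pagination rule can be sketched as a loop over cursor values. The `search` callable stands in for a wrapper around search.sh, and `is_relevant`, `enough`, and `max_pages` are hypothetical parameters chosen for illustration.

```python
from typing import Callable, List


def collect_results(
    query: str,
    search: Callable[[str, int], List[dict]],
    is_relevant: Callable[[dict], bool],
    enough: int = 15,
    max_pages: int = 3,
) -> List[dict]:
    """Walk cursor 0, 1, 2, ... and stop on the conditions listed above."""
    kept: List[dict] = []
    for cursor in range(max_pages):
        page = search(query, cursor)
        relevant = [r for r in page if is_relevant(r)]
        if not relevant:
            break  # results have become irrelevant (or the page is empty)
        kept.extend(relevant)
        if len(kept) >= enough:
            break  # enough information to proceed
    return kept
```

Passing the search function in as an argument keeps the stopping logic independent of how the script is invoked.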

When to stop

  • You have gathered enough information to answer the user's question — do not over-fetch.
  • Search results have become irrelevant (diminishing returns after 2-3 pages is typical).
  • If a scrape returns an error or empty content, skip that URL and move on; do not retry.

General principles

  • Search first, scrape second. Use search to find the right URLs, then scrape only the promising ones.
  • Be specific in queries. The more precise your search query, the fewer API calls you need.
  • Summarize as you go. After each search or scrape, extract what you need immediately rather than batching all processing to the end.
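
Taken together, these principles suggest a small pipeline. The sketch below is one possible shape under stated assumptions: `search` and `scrape` are callables wrapping the two scripts, while `looks_promising`, `summarize`, and `max_scrapes` are hypothetical hooks an agent would supply.

```python
from typing import Callable, Dict, List, Optional


def research(
    query: str,
    search: Callable[[str, int], List[dict]],
    scrape: Callable[[str], Optional[str]],
    looks_promising: Callable[[dict], bool],
    summarize: Callable[[str], str],
    max_scrapes: int = 3,
) -> Dict[str, str]:
    """Search first, scrape only promising hits, and summarize each page at once."""
    notes: Dict[str, str] = {}
    for hit in search(query, 0):
        if len(notes) >= max_scrapes:
            break  # enough material; do not over-fetch
        if not looks_promising(hit):
            continue  # the snippet shows this page is not worth a scrape
        page = scrape(hit["link"])
        if not page:
            continue  # error or empty content: skip the URL, no retry
        notes[hit["link"]] = summarize(page)  # extract now, not in a final batch
    return notes
```

The returned dictionary maps each scraped URL to its summary, so the full page text does not need to be held until the end.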

Version history

1 version in total

  • v1.0.0 (current)
    2026-05-07 06:13, marked safe

Security checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
