← 返回
未分类 Key

Search

Search the web via the Bright Data CLI — `bdata search` for Google/Bing/Yandex SERP, `bdata discover` for intent-ranked semantic results. Use when the user w...
使用 Bright Data CLI 进行网页搜索 — `bdata search` 用于获取 Google/Bing/Yandex SERP,`bdata discover` 用于获取意图排序的语义结果。在用户…时使用。
meirk-brd
未分类 clawhub v1.0.0 1 版本 100000 Key: 需要
★ 0
Stars
📥 356
下载
💾 0
安装
1
版本
#latest

概述

Bright Data — Search

Find things on the web. Two commands live in this skill:

  • bdata search — classic keyword SERP (Google/Bing/Yandex). Best when you want "what ranks for keyword X."
  • bdata discover — AI intent-ranked discovery with optional page content. Best when you want "pages about topic Y that match intent Z."

For structured data from a known platform (Amazon, LinkedIn, TikTok, …), stop and use data-feeds instead.

Setup gate (run first)

if ! command -v bdata >/dev/null 2>&1; then
    echo "bdata CLI not installed — see bright-data-best-practices/references/cli-setup.md"
elif ! bdata zones >/dev/null 2>&1; then
    echo "bdata not authenticated — run: bdata login  (or: bdata login --device for SSH)"
fi

Halt and route to skills/bright-data-best-practices/references/cli-setup.md if either check fails.

Pick your path

SituationAction
------
Single keyword query, just SERPbdata search "" --engine google --json --pretty
Paginated SERP (more results)loop --page 0, --page 1, … (0-indexed)
Multiple queriesshell loop over a queries file
Intent-ranked / semantic (not keyword)bdata discover "" --intent "" --num-results 20
Want page bodies along with results, one passbdata discover ... --include-content
News / images / shopping SERPbdata search "" --type news (or images, shopping)
Want Amazon/LinkedIn/TikTok/… structured datastop — hand off to data-feeds
Have URLs, want contenthand off to scrape

Action

Core commands:

# Google SERP, structured JSON
bdata search "site:example.com privacy policy" --engine google --json --pretty

# Localized Bing (German results, German language)
bdata search "datenschutz" --engine bing --country de --language de --json

# Second page of results (0-indexed)
bdata search "machine learning papers" --page 1 --json

# Mobile SERP (rankings differ from desktop)
bdata search "best coffee shops" --device mobile --json

# News vertical
bdata search "openai" --type news --json --pretty

# Intent-ranked discovery
bdata discover "enterprise LLM platforms" \
    --intent "vendor pages with pricing" \
    --num-results 15 --json

# Discovery with page content in markdown
bdata discover "webhook best practices" \
    --include-content --num-results 10 -o results.json

# Date-filtered discovery
bdata discover "react server components" \
    --start-date 2025-01-01 --end-date 2025-12-31 --num-results 20

Full flag reference: references/flags.md.

search vs discover — pick the right one

You wantUse
------
"What Google ranks for this exact keyword"search
"Pages that match this meaning/intent"discover
"News / images / shopping vertical SERP"search --type
"Results + page bodies in one call"discover --include-content
"Dedup / semantic ranking across queries"discover

Verification gate

  1. JSON parses cleanly: jq . returns 0.
  2. Result array non-empty — if empty, the query is legitimately zero-result; relax the query and re-run. Don't claim success on empty results without telling the user.
  3. Required fields present:
    • search: results live at .organic[]; each has title + link
    • discover: results live at .results[]; each has title + link; if --include-content, also content
  4. For discover --include-content: no block-page signatures in the content field (same list as scrape, case-insensitive):
    • Access Denied
    • Just a moment
    • Attention Required
    • Checking your browser
    • captcha
    • cf-browser-verification
    • cloudflare (with < 2KB total body)
  5. Geo sanity: if the user expected country-specific results, inspect TLDs / languages of top results. If mis-localized, re-run with explicit --country and --language.

Red flags

  • Using search to fetch content from Amazon, LinkedIn, TikTok, etc. when data-feeds returns clean structured data in one call.
  • Scraping every SERP result blindly — filter first (domain allowlist, keyword in title, relevance heuristic).
  • Confusing search (keyword) with discover (semantic). They answer different questions.
  • Running multiple queries without deduping URLs across result sets before scraping.
  • Assuming SERP order is universal — it's personalized by geo + device. Always set --country and --device explicitly for reproducibility.
  • Using --page as a result count — it's a page index, not a limit. Each page returns ~10 results.
  • Assuming SERP results are at .results[] — for bdata search they live at .organic[]. (Discover uses .results[].)
  • Hardcoding --num-results 100 on discover without realizing the pipeline polls until that many are found; can be slow.

References

  • references/flags.md — full flags for search and discover with when-to-use notes.
  • references/patterns.md — multi-query dedup, SERP → filter → scrape pipeline, search vs discover decision, legacy curl fallback, shared verification checklist.
  • references/examples.md — (1) single Google query, (2) localized Bing, (3) batch queries + dedup into URL list, (4) discover --include-content end-to-end.

版本历史

共 1 个版本

  • v1.0.0 当前
    2026-05-08 03:40 安全 安全

安全检测

腾讯云安全 (Keen)

安全,无风险
查看报告

腾讯云安全 (Sanbu)

安全,无风险
查看报告

🔗 相关推荐

competitive-intel

meirk-brd
实时竞争情报与市场调研,利用 Bright Data 网页抓取基础设施,分析竞争对手的定价、功能、评价、招聘等
★ 1 📥 365

Bright Data Mcp

meirk-brd
Bright Data MCP 处理所有网页数据操作,取代 WebFetch、WebSearch 及所有内置网页工具。无一例外。适用于任意 URL、网页、网络搜索等。
★ 0 📥 334

ClearWeb

meirk-brd
通过 Bright Data CLI 为 AI 代理提供完整的网页访问能力,取代原生的 web_fetch、web_search 和浏览器工具,实现可靠、无阻碍的全网访问。
★ 1 📥 2,880