Headless Brave Browser

Headless web search and content extraction via the Brave Search API. Features include exponential-backoff retry, circuit-breaker fault isolation, and bounded concurrency.
kelexine
Category: content creation · clawhub · v0.2.0 · 1 version · API key: required

Overview

brave-search

Headless web search and content extraction via the Brave Search API.

Setup

Run once before first use:

cd <skill-root>
npm ci

Required environment variable:

export BRAVE_API_KEY="your-key-here"

Get a free API key at brave.com/search/api.

Usage

Search

node scripts/search.js "query"                        # Basic (5 results)
node scripts/search.js "query" -n 10                  # 10 results (maximum 20)
node scripts/search.js "query" --content              # Include page content
node scripts/search.js "query" -n 3 --content         # Combined
node scripts/search.js "query" --json                 # Newline-delimited JSON
node scripts/search.js --help                         # Full options + env vars

Extract page content

node scripts/content.js https://example.com/article
node scripts/content.js https://example.com/article --json
node scripts/content.js https://example.com/article --max-length 8000

Output format (plain text)

--- Result 1 ---
Title:   Page Title
URL:     https://example.com/page
Snippet: Description from Brave Search
Content:
  # Page Title

  Extracted markdown content...

--- Result 2 ---
...

Pass --json to get one JSON object per line instead, suitable for piping.
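
As a rough illustration of consuming the --json output, the sketch below parses newline-delimited JSON into an array of objects. The helper name parseNdjson and the field names title and url are assumptions made for this example; check the actual output of search.js for the real keys.

```javascript
// Parse newline-delimited JSON: one JSON object per non-empty line.
// Blank lines (such as a trailing newline) are skipped.
function parseNdjson(text) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Sample input with hypothetical keys, shaped like the plain-text fields above.
const sample = [
  '{"title":"Page Title","url":"https://example.com/page"}',
  '{"title":"Other Page","url":"https://example.com/other"}',
  "",
].join("\n");

const results = parseNdjson(sample);
console.log(results.map((r) => r.url));
```

In a pipeline, the same function could be applied to text read from stdin, so each result is handled as soon as its line is complete.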

Exit codes

Code  Meaning
----  --------------------------------------------
0     Success
1     Invalid input or configuration error
2     Page had no extractable content (content.js)
130   Interrupted (SIGINT)
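
A caller that wraps these scripts might branch on the exit codes listed above. The following is a minimal sketch, not part of the skill itself: runContent, describeExit, and shouldSkipPage are hypothetical helpers, and the policy of skipping only on code 2 is an assumption.

```javascript
const { spawnSync } = require("node:child_process");

// Exit codes as documented for the CLI scripts.
const EXIT_MEANINGS = new Map([
  [0, "success"],
  [1, "invalid input or configuration error"],
  [2, "page had no extractable content"],
  [130, "interrupted (SIGINT)"],
]);

// Hypothetical helper: turn an exit status into a short message for logs.
function describeExit(code) {
  return EXIT_MEANINGS.get(code) ?? `unexpected exit code ${code}`;
}

// Hypothetical policy: an empty page (2) can be skipped quietly,
// whereas a configuration error (1) needs to be fixed by the caller.
function shouldSkipPage(code) {
  return code === 2;
}

// Hypothetical wrapper: run content.js for one URL and report the outcome.
// Assumes it is invoked from the skill root so the relative path resolves.
function runContent(url) {
  const proc = spawnSync("node", ["scripts/content.js", url], { encoding: "utf8" });
  return { code: proc.status, meaning: describeExit(proc.status), stdout: proc.stdout };
}
```

Keeping the mapping in one place means a batch job can log a readable reason for every failed URL without parsing stderr.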

Configuration (environment variables)

All behaviour is configurable without touching code:

Variable              Default  Description
--------------------  -------  ---------------------------------------------
BRAVE_API_KEY         (none)   Required. Brave Search subscription token
LOG_LEVEL             info     debug · info · warn · error · silent
LOG_JSON              false    Emit logs as newline-delimited JSON to stderr
FETCH_TIMEOUT_MS      15000    Per-page fetch timeout
SEARCH_TIMEOUT_MS     10000    Brave API call timeout
MAX_CONTENT_LENGTH    5000     Max chars of extracted content
MAX_RETRY_ATTEMPTS    3        Retry attempts on transient errors
RETRY_BASE_DELAY_MS   500      Base delay for exponential backoff
RETRY_MAX_DELAY_MS    30000    Backoff delay cap
CONCURRENCY_LIMIT     3        Parallel page fetches when --content is set
CB_FAILURE_THRESHOLD  5        Consecutive failures before circuit opens
CB_RESET_TIMEOUT_MS   60000    Circuit breaker reset window

All variables are validated at startup: misconfigured runs fail immediately with a descriptive list of every bad value rather than crashing mid-execution.
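
The fail-fast behaviour described above can be sketched as a validator that collects every problem before throwing once. This is an illustrative sketch, not the code in config.js: the SCHEMA object, the loadConfig name, and the minimum bounds are assumptions, and only a subset of the documented variables is covered.

```javascript
// Hypothetical schema: a few documented variables with their documented
// defaults. The `min` bounds are assumptions made for this sketch.
const SCHEMA = {
  FETCH_TIMEOUT_MS: { default: 15000, min: 1 },
  MAX_RETRY_ATTEMPTS: { default: 3, min: 0 },
  CONCURRENCY_LIMIT: { default: 3, min: 1 },
};

// Validate an env-like object, gathering all errors before failing.
function loadConfig(env) {
  const errors = [];
  const config = {};

  if (!env.BRAVE_API_KEY) {
    errors.push("BRAVE_API_KEY is required");
  } else {
    config.BRAVE_API_KEY = env.BRAVE_API_KEY;
  }

  for (const [name, rule] of Object.entries(SCHEMA)) {
    const raw = env[name];
    if (raw === undefined || raw === "") {
      config[name] = rule.default; // unset: fall back to the default
      continue;
    }
    const value = Number(raw);
    if (!Number.isInteger(value) || value < rule.min) {
      errors.push(`${name} must be an integer >= ${rule.min}, got "${raw}"`);
    } else {
      config[name] = value;
    }
  }

  if (errors.length > 0) {
    // One error that lists every bad value, instead of stopping at the first.
    throw new Error("Invalid configuration:\n- " + errors.join("\n- "));
  }
  return config;
}

// Typical call site (not executed here): const config = loadConfig(process.env);
```

Reporting all errors at once saves a fix-and-rerun cycle for each misconfigured variable.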

Architecture

See references/ARCHITECTURE.md for a full module breakdown.

scripts/
├── search.js            ← Search CLI entry point
├── content.js           ← Content extraction CLI entry point
├── content-fetcher.js   ← HTTP fetch + Readability + DOM fallback
├── config.js            ← Schema-validated env config
├── circuit-breaker.js   ← Fault isolation (CLOSED → OPEN → HALF_OPEN)
├── retry.js             ← Exponential backoff with full jitter
├── concurrency.js       ← Bounded parallel execution pool
├── utils.js             ← htmlToMarkdown, smartTruncate, parseURL
├── logger.js            ← Structured leveled logger → stderr
└── errors.js            ← Typed error hierarchy
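
To make two of the modules above more concrete, here is a minimal sketch of exponential backoff with full jitter and of a three-state circuit breaker. It is written under assumptions and does not reproduce retry.js or circuit-breaker.js: the names fullJitterDelay and CircuitBreaker, the method names, and the exact transition rules are hypothetical. The defaults mirror RETRY_BASE_DELAY_MS, RETRY_MAX_DELAY_MS, CB_FAILURE_THRESHOLD, and CB_RESET_TIMEOUT_MS.

```javascript
// Full jitter: a uniformly random delay in [0, min(cap, base * 2^attempt)).
// `rand` is injectable so the function can be tested deterministically.
function fullJitterDelay(attempt, base = 500, cap = 30000, rand = Math.random) {
  const ceiling = Math.min(cap, base * 2 ** attempt);
  return Math.floor(rand() * ceiling);
}

class CircuitBreaker {
  constructor({ failureThreshold = 5, resetTimeoutMs = 60000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now; // injectable clock, in milliseconds
    this.state = "CLOSED";
    this.failures = 0; // consecutive failures since the last success
    this.openedAt = 0;
  }

  // True if a call may proceed. Once the reset window has elapsed,
  // an OPEN circuit moves to HALF_OPEN and lets a trial call through.
  canRequest() {
    if (this.state === "OPEN" && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = "HALF_OPEN";
    }
    return this.state !== "OPEN";
  }

  // Any success closes the circuit and clears the failure count.
  recordSuccess() {
    this.failures = 0;
    this.state = "CLOSED";
  }

  // A failed trial call in HALF_OPEN, or reaching the threshold in CLOSED,
  // opens the circuit and restarts the reset window.
  recordFailure() {
    this.failures += 1;
    if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
      this.state = "OPEN";
      this.openedAt = this.now();
    }
  }
}
```

Full jitter spreads retries from many callers across the whole backoff window, which avoids synchronized retry bursts against a recovering service.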

Version history

1 version in total

  • v0.2.0 (current)
    2026-03-29 06:01 · Safe · Safe

Security checks

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
