Gbrow

Full-featured headless browser for OpenClaw agents. Navigate, snapshot with accessibility tree (@ref clicks), tabs, JS execution, cookie import. No vision mo...
Author: ashish797 · Source: clawhub · Category: uncategorized
Version: v1.0.0 (latest, 1 version) · API key: not required
Stars: 0 · Downloads: 299 · Installs: 0

Overview

Gbrow — The Browser Your AI Agent Actually Needs

A full-featured headless browser powered by Playwright and Bun. It reads pages through the accessibility tree rather than through expensive vision models.

Why Gbrow?

| Traditional (screenshots + vision) | Gbrow (accessibility tree) |
| --- | --- |
| Screenshot → upload to GPT-4o → wait → read | ariaSnapshot() → instant structured text |
| ~$0.01 per page read | Free |
| 3-10 seconds per page | < 100ms |
| Fails on API key issues | Always works |
| Click by fragile CSS selector | Click by @ref (@e1, @e2, etc.) |

Quick Setup

# Clone and install
git clone https://github.com/ashish797/Gbrow.git ~/.openclaw/workspace/skills/Gbrow
cd ~/.openclaw/workspace/skills/Gbrow
bash setup.sh

Or one-liner:

curl -fsSL https://raw.githubusercontent.com/ashish797/Gbrow/main/setup.sh | bash

How It Works

1. Start the server

cd ~/.openclaw/workspace/skills/Gbrow
bun run src/server.ts

2. Read the page (accessibility tree)

The snapshot gives you a structured view with clickable refs:

@e1 [heading] "Welcome" [level=1]
@e2 [link] "Get Started"
@e3 [button] "Sign in"
@e4 [textbox] "Search"

3. Click by ref

click @e2     → clicks "Get Started"
fill @e4 "query"  → types into search box
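
The snapshot lines above are regular enough to parse on the agent side. The following is a minimal sketch, assuming every line has the shape shown in the sample (`@eN [role] "name" [key=value]`); the real output may carry indentation or extra fields, and `parse_snapshot`, `find_ref`, and `Node` are hypothetical helper names, not part of Gbrow.

```python
import re
from typing import NamedTuple, Optional

# Assumed line shape, taken from the sample snapshot:
#   @e1 [heading] "Welcome" [level=1]
LINE_RE = re.compile(
    r'^\s*(?P<ref>@e\d+)\s+\[(?P<role>[^\]]+)\]'
    r'(?:\s+"(?P<name>(?:[^"\\]|\\.)*)")?'
    r'(?P<attrs>(?:\s+\[[^\]]+\])*)\s*$'
)
ATTR_RE = re.compile(r'\[([^\]=]+)(?:=([^\]]*))?\]')


class Node(NamedTuple):
    ref: str
    role: str
    name: Optional[str]
    attrs: dict


def parse_snapshot(text: str) -> list:
    """Turn snapshot text into Node records, skipping lines without a ref."""
    nodes = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue
        # Bare attributes such as [disabled] map to an empty string.
        attrs = {a.group(1): a.group(2) or "" for a in ATTR_RE.finditer(match.group("attrs"))}
        nodes.append(Node(match.group("ref"), match.group("role"), match.group("name"), attrs))
    return nodes


def find_ref(nodes: list, role: str, name: str) -> Optional[str]:
    """Return the ref of the first node with the given role and accessible name."""
    for node in nodes:
        if node.role == role and node.name == name:
            return node.ref
    return None
```

An agent could call `find_ref(nodes, "link", "Get Started")` and pass the returned ref to `click`, instead of hard-coding `@e2`.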

Commands

Navigation

| Command | Description | Example |
| --- | --- | --- |
| goto | Navigate to URL | goto https://example.com |
| back | History back | back |
| forward | History forward | forward |
| reload | Reload page | reload |
| url | Print current URL | url |

Reading

| Command | Description | Example |
| --- | --- | --- |
| snapshot | Accessibility tree with @refs | snapshot -i (interactive only) |
| text | Cleaned page text | text |
| html [selector] | Raw HTML | html .article |
| links | All links as "text → href" | links |
| forms | Form fields as JSON | forms |

Interaction

| Command | Description | Example |
| --- | --- | --- |
| click | Click element | click @e3 |
| fill | Fill input | fill @e4 "hello" |
| select | Select dropdown | select @e5 "option1" |
| type | Type with keyboard | type @e4 "search term" |
| press | Press key | press Enter |
| scroll | Scroll page | scroll down |
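
The examples in these tables are written as command lines, while the HTTP API takes a JSON body with `command` and `args`. The sketch below converts one into the other. It assumes that arguments are sent as a list of strings with surrounding quotes removed, which is inferred from the `goto` example in the HTTP API section and not confirmed by the source. `to_payload` is a hypothetical helper name.

```python
import shlex


def to_payload(line: str) -> dict:
    """Split a command line such as: fill @e4 "hello" into a JSON-ready dict.

    Assumption: the server expects {"command": <name>, "args": [<strings>]}.
    """
    tokens = shlex.split(line)  # honours double quotes, so "search term" stays one token
    if not tokens:
        raise ValueError("empty command line")
    return {"command": tokens[0], "args": tokens[1:]}
```

Shell-style splitting suits short commands like `click` or `fill`. It would mangle a `js` expression that contains quotes, so such commands are better built as a dict directly.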

Inspection

| Command | Description | Example |
| --- | --- | --- |
| js | Run JavaScript | js document.title |
| css | Computed CSS | css .box color |
| attrs | Element attributes | attrs @e1 |
| is | State check | is visible @e3 |

Tabs

| Command | Description |
| --- | --- |
| tabs | List open tabs |
| tab N | Switch to tab N |
| newtab | Open new tab |
| closetab | Close current tab |

Visual

| Command | Description |
| --- | --- |
| screenshot | Take screenshot |
| responsive | Set viewport size |
| pdf | Save page as PDF |

Snapshot Flags

| Flag | Description |
| --- | --- |
| -i | Interactive elements only (buttons, links, inputs) |
| -c | Compact (remove empty structural nodes) |
| -d N | Limit tree depth |
| -s | Scope to CSS selector |
| -D | Diff against previous snapshot |
| -a | Annotated screenshot with ref overlays |
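
The flags can be combined in one `snapshot` call. The sketch below builds the argument list from keyword options. It assumes that each flag is its own entry in `args` and that `-d` and `-s` take their value as the next entry; the source does not spell out either convention. `snapshot_args` and `snapshot_payload` are hypothetical helper names.

```python
def snapshot_args(interactive=False, compact=False, depth=None,
                  scope=None, diff=False, annotate=False) -> list:
    """Build the argument list for the snapshot command from the flag table."""
    args = []
    if interactive:
        args.append("-i")
    if compact:
        args.append("-c")
    if depth is not None:
        if depth < 0:
            raise ValueError("depth must be non-negative")
        args += ["-d", str(depth)]  # assumption: value follows as a separate entry
    if scope:
        args += ["-s", scope]       # assumption: value follows as a separate entry
    if diff:
        args.append("-D")
    if annotate:
        args.append("-a")
    return args


def snapshot_payload(**flags) -> dict:
    """Wrap the arguments in the JSON body shape used by the HTTP API."""
    return {"command": "snapshot", "args": snapshot_args(**flags)}
```

For example, `snapshot_payload(interactive=True, depth=2)` produces a body that asks for interactive elements only, two levels deep.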

HTTP API

All commands go through the HTTP API:

# Get port and token from state file
PORT=$(python3 -c "import json; print(json.load(open('.gstack/browse.json'))['port'])")
TOKEN=$(python3 -c "import json; print(json.load(open('.gstack/browse.json'))['token'])")

# Send command
curl -s -X POST "http://127.0.0.1:${PORT}/command" \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"command":"goto","args":["https://example.com"]}'
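
The same call can be made from Python with the standard library alone. This is a minimal sketch that mirrors the curl example: it reads `port` and `token` from `.gstack/browse.json` and posts to `/command`. The response format is not documented here, so the sketch returns the raw body. `load_state`, `build_request`, and `send` are hypothetical helper names.

```python
import json
import urllib.request
from pathlib import Path


def load_state(path=".gstack/browse.json"):
    """Read the port and token that the server wrote to its state file."""
    state = json.loads(Path(path).read_text())
    return state["port"], state["token"]


def build_request(port, token, command, args=()):
    """Build the POST request without sending it."""
    body = json.dumps({"command": command, "args": list(args)}).encode("utf-8")
    return urllib.request.Request(
        f"http://127.0.0.1:{port}/command",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(command, *args, state_path=".gstack/browse.json", timeout=30):
    """Send one command to a running server and return the raw response text."""
    port, token = load_state(state_path)
    request = build_request(port, token, command, args)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the body is text; its exact structure is not specified above.
        return response.read().decode("utf-8")
```

With the server running, `send("goto", "https://example.com")` followed by `send("snapshot", "-i")` would navigate and then read the page. Keeping `build_request` separate from `send` makes the request shape easy to inspect without a live server.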

Architecture

┌─────────────┐     HTTP      ┌──────────────────────┐
│  OpenClaw   │ ────────────▶ │  Gbrow Server        │
│  Agent      │               │  (Bun + Playwright)  │
└─────────────┘               └──────────┬───────────┘
                                         │
                                         ▼
                              ┌──────────────────────┐
                              │  Chromium            │
                              │  (headless)          │
                              └──────────────────────┘
                                         │
                                         ▼
                              ┌──────────────────────┐
                              │  Accessibility Tree  │
                              │  (ariaSnapshot)      │
                              └──────────────────────┘

No vision models. No external API calls. Just structured text from the browser's accessibility layer.

Credits

Built on top of gstack by Garry Tan (Y Combinator). Adapted for OpenClaw with permission under the MIT license.

License

MIT

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-07 14:54 · Safe

Security Scan

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
