
Browser Use Pro

AI-powered browser automation for complex multi-step web workflows. Uses Browser-Use framework when OpenClaw's built-in browser tool can't handle login flows...
abczsl520
Developer tools · clawhub · v1.2.0 · 1 version · 100000 · Key: required
Stars: 0 · Downloads: 1,206 · Installs: 148
#ai-agent #browser-automation #browser-use #chrome-cdp #latest #openclaw #playwright #python #rpa #web-automation #web-scraping

Overview

Browser-Use — AI Browser Automation

Security & Privacy

  • No credential logging: Passwords are handled via Browser-Use's sensitive_data parameter — the LLM never sees real credentials, only placeholder tokens.
  • User-initiated Chrome connection: CDP mode (connecting to real Chrome) is opt-in and requires the user to manually launch Chrome with debug flag. The skill never silently connects to running browsers.
  • All packages are open-source: Dependencies are browser-use (38k+ ⭐ on GitHub), playwright (by Microsoft), and langchain-openai — all widely audited open-source tools.
  • Local execution only: Scripts run locally on the user's machine. No data is sent to any server except the configured LLM API for step-by-step reasoning.
  • Domain restriction available: Use allowed_domains parameter to restrict which websites the agent can visit.
  • No telemetry: This skill does not collect, store, or transmit any usage data.
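
To make the domain-restriction idea concrete, here is a minimal sketch of an allowlist check. It is an illustration only, not Browser-Use's implementation: the helper name `is_allowed` is hypothetical, and the library's own matching rules for `allowed_domains` may differ.

```python
from urllib.parse import urlparse


def is_allowed(url: str, allowed_domains: list[str]) -> bool:
    """Return True if the URL's host is an allowed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    for domain in allowed_domains:
        domain = domain.lower()
        # Exact match, or a subdomain such as www.reddit.com for reddit.com.
        if host == domain or host.endswith("." + domain):
            return True
    return False
```

Matching on the dot boundary keeps a look-alike host such as `evilreddit.com` from passing a `reddit.com` rule.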

When to Use Browser-Use vs Built-in Tool

| Scenario | Built-in tool | Browser-Use |
|----------|:-:|:-:|
| Screenshot / click one button | ✅ Free & fast | ❌ Overkill |
| 5+ step workflow (login→navigate→fill→submit) | ❌ Breaks easily | ✅ |
| Anti-bot sites (real Chrome needed) | | ✅ |
| Batch repetitive operations | | ✅ |

Cost: Browser-Use calls an external LLM at every step, which costs money and is slower than the built-in tool. Use the built-in tool for simple actions.
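
The choice between the two tools can be written down as a small rule. The sketch below is one possible reading of the scenarios listed in this section; the function name `pick_tool` and its arguments are hypothetical, and the 5-step threshold is taken from the "5+ step workflow" scenario.

```python
def pick_tool(step_count: int, needs_real_chrome: bool = False, batch: bool = False) -> str:
    """Suggest 'browser-use' for long, anti-bot, or batch work; otherwise 'built-in'."""
    if needs_real_chrome or batch or step_count >= 5:
        return "browser-use"
    # Simple actions stay on the built-in tool because it is free and fast.
    return "built-in"
```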

Execution Flow

1. Check Environment

test -d ~/browser-use-env && echo "Installed" || echo "Need install"
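
The same check can be done from Python when the caller is already a script. This is a sketch under two assumptions: the helper name `env_installed` is hypothetical, and the check is slightly stricter than `test -d` because it also looks for the interpreter that `python3 -m venv` creates on macOS and Linux.

```python
from pathlib import Path
from typing import Optional


def env_installed(env_dir: Optional[Path] = None) -> bool:
    """Return True when env_dir looks like a usable virtual environment."""
    env_dir = env_dir or Path.home() / "browser-use-env"
    # A venv created on macOS/Linux contains bin/python.
    return (env_dir / "bin" / "python").exists()


if __name__ == "__main__":
    print("Installed" if env_installed() else "Need install")
```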

2. First-Time Setup (once only)

python3 -m venv ~/browser-use-env
source ~/browser-use-env/bin/activate
pip install browser-use playwright langchain-openai
playwright install chromium

3. Choose Mode

  • Mode A — Built-in Chromium: For simple automation or when detection doesn't matter. Runs immediately.
  • Mode B — Real Chrome CDP: For anti-bot sites or when user's login session is needed. Requires user action.

Mode B setup — prompt user:

> Please quit Chrome completely (Mac: Cmd+Q), then tell me "done"

After the user confirms, relaunch Chrome with the debug port (macOS path shown):

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 &

Verify: curl -s http://127.0.0.1:9222/json/version
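
The `curl` check can also be done from Python before the script tries to attach. The sketch below uses only the standard library; the helper names `parse_version` and `cdp_info` are hypothetical, and it assumes the endpoint returns a JSON object with `Browser` and `webSocketDebuggerUrl` fields, as Chrome's `/json/version` endpoint does.

```python
import json
import urllib.request
from typing import Optional


def parse_version(payload: str) -> dict:
    """Extract the browser name and WebSocket URL from a /json/version body."""
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return {"browser": data.get("Browser"), "ws_url": data.get("webSocketDebuggerUrl")}


def cdp_info(base_url: str = "http://127.0.0.1:9222", timeout: float = 2.0) -> Optional[dict]:
    """Return parsed version info, or None if Chrome is unreachable or replies with non-JSON."""
    try:
        with urllib.request.urlopen(f"{base_url}/json/version", timeout=timeout) as resp:
            return parse_version(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # URLError and timeouts are OSError; bad JSON is ValueError.
        return None
```

A `None` result means Chrome was not started with `--remote-debugging-port=9222`, or another Chrome instance was still running when it was launched.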

4. Write Script and Run

Write the script to the user's workspace, then run it inside the virtual environment:

source ~/browser-use-env/bin/activate
python3 script_path.py
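
When the script is launched from another Python process, activating the environment is not required: calling the environment's own interpreter has the same effect. A minimal sketch, assuming the macOS/Linux venv layout; `build_command` and `run_script` are hypothetical helper names.

```python
import subprocess
from pathlib import Path


def build_command(script: Path, env_dir: Path) -> list[str]:
    """Build the argv that runs the script with the venv's own interpreter."""
    return [str(env_dir / "bin" / "python"), str(script)]


def run_script(script: Path, env_dir: Path, timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run the script and capture its output; a non-zero exit code is returned, not raised."""
    return subprocess.run(
        build_command(script, env_dir),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
```

Capturing stdout and stderr makes it easier to report results or feed the error into the troubleshooting steps.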

5. Report Results

Return the results to the user. On failure, follow the troubleshooting tree below.

Script Template

import asyncio
from browser_use import Agent, ChatOpenAI, Browser

async def main():
    # LLM — any OpenAI-compatible API
    llm = ChatOpenAI(
        model="gpt-4o-mini",
        api_key="<YOUR_API_KEY>",  # From env var or user config
        base_url="https://api.openai.com/v1",
    )

    # Mode A: Built-in Chromium
    browser = Browser(headless=False, user_data_dir="~/.browser-use/task-profile")
    # Mode B: Real Chrome (user must launch with --remote-debugging-port=9222)
    # browser = Browser(cdp_url="http://127.0.0.1:9222")

    agent = Agent(
        task="Detailed step-by-step task description (see guide below)",
        llm=llm, browser=browser,
        use_vision=True, max_steps=25,
    )
    result = await agent.run()
    print(result)

asyncio.run(main())
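
The template's `api_key` comment says the key should come from an environment variable or user config. One way to do that without hardcoding the key is sketched below; the variable name `OPENAI_API_KEY` and the helper `load_api_key` are assumptions, not part of the skill.

```python
import os
from typing import Mapping, Optional


def load_api_key(name: str = "OPENAI_API_KEY", env: Optional[Mapping[str, str]] = None) -> str:
    """Read the API key from the environment and fail early if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before running the script")
    return key
```

In the template, `api_key="<YOUR_API_KEY>"` would then become `api_key=load_api_key()`.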

Task Writing Guide

✅ Good: Specific steps

task = """
1. Open https://www.reddit.com/login
2. Enter username: x_user
3. Enter password: x_pass
4. Click login button
5. If CAPTCHA appears, wait 30s for user to complete
6. Navigate to https://www.reddit.com/r/xxx/submit
7. Enter title: xxx
8. Enter body: xxx
9. Click submit
"""

❌ Bad: Vague

task = "Post something on Reddit"

Tips

  • Keyboard fallback: Add "If button can't be clicked, use Tab+Enter"
  • Error recovery: Add "If page fails to load, refresh and retry"
  • Sensitive data: Use placeholders + sensitive_data parameter
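
Numbered steps and fallback rules like the ones above can be assembled programmatically, which helps with batch jobs that reuse the same structure. A minimal sketch; `build_task` is a hypothetical helper and the "Fallback rules" heading is an arbitrary choice.

```python
from typing import Sequence


def build_task(steps: Sequence[str], fallbacks: Sequence[str] = ()) -> str:
    """Turn a list of steps into a numbered task, followed by optional fallback rules."""
    lines = [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    if fallbacks:
        lines.append("Fallback rules:")
        lines.extend(f"- {rule}" for rule in fallbacks)
    return "\n".join(lines)


task = build_task(
    ["Open https://www.reddit.com/login", "Enter username: x_user", "Enter password: x_pass"],
    fallbacks=["If button can't be clicked, use Tab+Enter"],
)
```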

Credential Security

agent = Agent(
    task="Login with x_user and x_pass",
    sensitive_data={"x_user": "real@email.com", "x_pass": "S3cret!"},
    use_vision=False,  # Disable screenshots when handling passwords
    llm=llm, browser=browser,
)
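
The placeholders only protect credentials if the task text uses them instead of the real values. A small pre-flight check can catch mistakes before the agent runs. This is a sketch, not part of Browser-Use; `check_task` is a hypothetical helper.

```python
def check_task(task: str, sensitive_data: dict[str, str]) -> list[str]:
    """Return a list of problems: leaked real values or placeholders the task never uses."""
    problems = []
    for placeholder, secret in sensitive_data.items():
        if secret and secret in task:
            problems.append(f"task contains the real value for {placeholder!r}")
        if placeholder not in task:
            problems.append(f"placeholder {placeholder!r} is never used in the task")
    return problems
```

An empty list means the task text refers to every credential by placeholder only.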

Key Parameters

| Parameter | Purpose | Recommended |
|-----------|---------|-------------|
| use_vision | AI sees screenshots | True normally, False with passwords |
| max_steps | Max actions | 20-30 |
| max_failures | Max retries | 3 (default) |
| flash_mode | Skip reasoning | True for simple tasks |
| extend_system_message | Custom instructions | Add specific guidance |
| allowed_domains | Restrict URLs | Use for security |
| fallback_llm | Backup LLM | When primary is unstable |
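
The recommendations for these parameters can be collected into one helper that returns keyword arguments for `Agent`. This is a sketch: `agent_options` is a hypothetical name, and mapping simple tasks to 20 steps and other tasks to 30 is an assumption within the recommended 20-30 range.

```python
def agent_options(handles_passwords: bool, simple_task: bool = False) -> dict:
    """Pick Agent keyword arguments following the recommended values."""
    return {
        "use_vision": not handles_passwords,  # no screenshots while passwords are on screen
        "max_steps": 20 if simple_task else 30,
        "max_failures": 3,                     # documented default
        "flash_mode": simple_task,             # skip reasoning only for simple tasks
    }
```

The result would be passed as `Agent(task=..., llm=llm, browser=browser, **agent_options(handles_passwords=True))`.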

Troubleshooting

Detected as automation?
  └→ Switch to Mode B (real Chrome)

CAPTCHA / human verification?
  └→ Prompt user to complete manually, add wait time in task

LLM timeout?
  └→ Set fallback_llm or use faster model

Action succeeded but no effect (e.g. post not published)?
  └→ 1. Check if platform anti-spam blocked it (common with new accounts)
     2. Add explicit confirmation steps to task

Website UI changed, can't find elements?
  └→ Browser-Use auto-adapts, but add fallback paths in task
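
For automated reporting, the tree above can be kept as a lookup table. The symptom keys below are hypothetical labels chosen for this sketch; the remedies are copied from the tree.

```python
REMEDIES = {
    "detected_as_automation": ["Switch to Mode B (real Chrome)"],
    "captcha": ["Prompt user to complete manually", "Add wait time in task"],
    "llm_timeout": ["Set fallback_llm", "Use faster model"],
    "no_effect": [
        "Check if platform anti-spam blocked it (common with new accounts)",
        "Add explicit confirmation steps to task",
    ],
    "ui_changed": ["Add fallback paths in task"],
}


def next_steps(symptom: str) -> list[str]:
    """Return the remedies for a known symptom, or an empty list if it is not in the tree."""
    return REMEDIES.get(symptom, [])
```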

LLM Compatibility

| LLM | Works | Notes |
|-----|:---:|-------|
| GPT-4o / 4o-mini | ✅ | Best choice, recommended |
| Claude | ✅ | Works well |
| Gemini | | Structured output incompatible |

Version History

1 version in total

  • v1.2.0 (current)
    2026-03-30 07:01 · Safe · Safe

Security Scan

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
