
Gemini Computer Use

Build and run Gemini 2.5 Computer Use browser-control agents with Playwright. Use when a user wants to automate web browser tasks via the Gemini Computer Use model, needs an agent loop (screenshot → function_call → action → function_response), or asks to integrate safety confirmation for risky UI actions.
Author: am-will
Category: Developer tools · Source: clawhub · Version: v1.0.0 (latest) · API key: required

Overview

Quick start

  1. Copy the env template, set your API key in the copy, and source it:

```bash
cp env.example env.sh
$EDITOR env.sh
source env.sh
```

  2. Create a virtual environment and install dependencies:

```bash
python -m venv .venv
source .venv/bin/activate
pip install google-genai playwright
playwright install chromium
```

  3. Run the agent script with a prompt:

```bash
python scripts/computer_use_agent.py \
  --prompt "Find the latest blog post title on example.com" \
  --start-url "https://example.com" \
  --turn-limit 6
```

Browser selection

  • Default: Playwright's bundled Chromium (no env vars required).
  • Choose a channel (Chrome/Edge) with COMPUTER_USE_BROWSER_CHANNEL.
  • Use a custom Chromium-based executable (e.g., Brave) with COMPUTER_USE_BROWSER_EXECUTABLE.

If both are set, COMPUTER_USE_BROWSER_EXECUTABLE takes precedence.
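The precedence rule above can be sketched as a small helper that turns the two environment variables into keyword arguments for Playwright's `chromium.launch()`. This is an illustration, not the code in `scripts/computer_use_agent.py`: the names `resolve_launch_options` and `launch_browser` are hypothetical, and only the `channel` and `executable_path` launch parameters come from Playwright itself.

```python
import os
from typing import Mapping


def resolve_launch_options(env: Mapping[str, str]) -> dict:
    """Map the two env vars to chromium.launch() keyword arguments.

    Hypothetical helper: the executable path wins when both are set,
    and an empty dict means "use Playwright's bundled Chromium".
    """
    executable = env.get("COMPUTER_USE_BROWSER_EXECUTABLE", "").strip()
    channel = env.get("COMPUTER_USE_BROWSER_CHANNEL", "").strip()
    if executable:
        return {"executable_path": executable}
    if channel:
        return {"channel": channel}
    return {}


def launch_browser():
    """Start Playwright and launch a browser using the resolved options."""
    # Imported lazily so the helper above can be used without Playwright.
    from playwright.sync_api import sync_playwright

    playwright = sync_playwright().start()
    browser = playwright.chromium.launch(
        headless=False, **resolve_launch_options(os.environ)
    )
    # The caller is responsible for browser.close() and playwright.stop().
    return playwright, browser
```

For example, `COMPUTER_USE_BROWSER_CHANNEL=msedge` would resolve to `{"channel": "msedge"}`, while setting both variables would resolve to the executable path only.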

Core workflow (agent loop)

  1. Capture a screenshot and send the user goal + screenshot to the model.
  2. Parse function_call actions in the response.
  3. Execute each action in Playwright.
  4. If a safety_decision is require_confirmation, prompt the user before executing.
  5. Send function_response objects containing the latest URL + screenshot.
  6. Repeat until the model returns only text (no actions) or you hit the turn limit.
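The loop above can be sketched independently of any SDK by passing the model call, the Playwright executor, the screenshot capture, and the confirmation prompt in as plain callables. Everything named here (`Action`, `ModelReply`, `run_agent_loop`, and the four callables) is hypothetical and only mirrors the six steps; the real script and the google-genai response types will differ. Reporting a declined action back to the model is one possible policy; stopping the loop on a denial is equally reasonable.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Action:
    """One function_call from the model (hypothetical shape)."""
    name: str
    args: dict = field(default_factory=dict)
    safety_decision: Optional[str] = None  # e.g. "require_confirmation"


@dataclass
class ModelReply:
    """Parsed model response: free text plus zero or more actions."""
    text: str = ""
    actions: list = field(default_factory=list)


def run_agent_loop(
    goal: str,
    ask_model: Callable[[str, dict, list], ModelReply],
    execute: Callable[[Action], None],
    observe: Callable[[], dict],
    confirm: Callable[[Action], bool],
    turn_limit: int = 6,
) -> Optional[str]:
    """Return the model's final text, or None if the turn limit is hit."""
    observation = observe()  # step 1: current URL + screenshot
    results: list = []       # function responses to send on the next turn
    for _ in range(turn_limit):
        reply = ask_model(goal, observation, results)
        if not reply.actions:  # step 6: text only means the task is done
            return reply.text
        results = []
        for action in reply.actions:  # steps 2 and 3
            needs_ok = action.safety_decision == "require_confirmation"
            if needs_ok and not confirm(action):  # step 4
                results.append({"name": action.name, "status": "denied_by_user"})
                continue
            execute(action)
            results.append({"name": action.name, "status": "ok"})
        observation = observe()  # step 5: fresh URL + screenshot
    return None
```

In a real integration, `ask_model` would wrap the Gemini request, `execute` would dispatch to Playwright page methods, and `observe` would return something like the page URL plus `page.screenshot()` bytes.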

Operational guidance

  • Run in a sandboxed browser profile or container.
  • Use --exclude to block risky actions you do not want the model to take.
  • Keep the viewport at 1440x900 unless you have a reason to change it.
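One reason the viewport size matters is coordinate mapping. The sketch below assumes the model reports click positions on a normalized 0 to 999 grid, as Google's Computer Use documentation describes, and scales them to the 1440x900 viewport. The function name `to_pixels` and the constants are hypothetical; check `references/google-computer-use.md` for how the script actually handles coordinates.

```python
VIEWPORT_WIDTH = 1440
VIEWPORT_HEIGHT = 900
GRID = 1000  # assumption: model coordinates range from 0 to 999


def to_pixels(x_norm: int, y_norm: int,
              width: int = VIEWPORT_WIDTH,
              height: int = VIEWPORT_HEIGHT) -> tuple:
    """Scale normalized grid coordinates to viewport pixels."""
    if not (0 <= x_norm < GRID and 0 <= y_norm < GRID):
        raise ValueError("normalized coordinates must be in 0..999")
    # Integer arithmetic avoids floating-point rounding surprises.
    return x_norm * width // GRID, y_norm * height // GRID


print(to_pixels(500, 500))  # (720, 450)
```

If the viewport is changed, the same width and height must be passed both to the browser context and to this conversion, otherwise clicks land in the wrong place.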

Resources

  • Script: scripts/computer_use_agent.py
  • Reference notes: references/google-computer-use.md
  • Env template: env.example

Version history

  • v1.0.0 (current), 2026-03-28 10:26

Security scan

  • Tencent Cloud Security (Keen): safe, no risk found
  • Tencent Cloud Security (Sanbu): safe, no risk found
