
chrome_skill

Browser automation via Chrome AI Action (CAA) bridge. Control Chrome programmatically — navigate, click, type, screenshot, extract content, and more. Uses Puppeteer (CDP) mode. First use auto-installs npm package and starts the bridge. Chrome is auto-launched if not running.
褚嬴
Uncategorized · community · v1.0.1 · 2 versions · 100000 · Key: not required

Overview

Chrome AI Action — Browser Automation Skill

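A browser automation skill for AI agents. Through the Chrome AI Action (CAA) bridge service, it controls Chrome programmatically in Puppeteer (CDP) mode and supports 60+ operations, including navigation, clicking, typing, screenshots, content extraction, network interception, cookie management, and PDF export.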


When to Use

| Scenario | Use this skill |
| --- | --- |
| User asks to browse a web page, search, fill forms, or extract data | Yes |
| User needs screenshots of a web page | Yes |
| User wants to automate browser interactions | Yes |
| User asks about writing code or debugging (no browser involved) | No |

⚠️ Chinese URL Encoding

> The bridge automatically percent-encodes non-ASCII characters (Chinese, etc.) in the URL, so the agent can pass Chinese characters directly in the URL.

{"type": "action", "action": "navigate", "params": {"url": "https://www.baidu.com/s?wd=妻子的浪漫旅行"}}

> Encode manually only in a terminal environment (PowerShell/cmd) where Chinese characters get garbled on input:
>
> 1. encodeURIComponent('妻子的浪漫旅行') → %E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C
> 2. Build URL: https://www.baidu.com/s?wd=%E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C
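
The manual encoding steps can be sketched in JavaScript as follows. `buildSearchUrl` is a hypothetical helper introduced for illustration, not part of the bridge. It relies only on the standard `encodeURIComponent` function.

```javascript
// Build a search URL whose query value is percent-encoded.
// `buildSearchUrl` is a hypothetical helper, not part of the bridge.
function buildSearchUrl(base, query) {
  // encodeURIComponent emits the UTF-8 bytes of each character as %XX escapes.
  return `${base}?wd=${encodeURIComponent(query)}`;
}

const url = buildSearchUrl("https://www.baidu.com/s", "妻子的浪漫旅行");
console.log(url);
```

Since the bridge encodes URLs itself, this step should only be needed when the terminal cannot carry the raw characters.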


Prerequisites

| Requirement | Check | Auto-resolve |
| --- | --- | --- |
| Chrome / Chromium installed | Detected automatically (common install paths) | No (user must install) |
| Chrome running with CDP | Detected on startup | Yes (auto-launched) |
| Node.js 18+ | node --version | No |

Startup Protocol

When the skill is loaded for the first time, the agent MUST run the startup script below. The script starts the bridge as a background child process, so the agent does NOT need to manage the process separately.

node <skill_dir>/scripts/startup.js

What it does

  1. Check if bridge is already running: GET /health on port 9876 → skip if OK
  2. Ensure npm package installed: npm list -g chrome-ai-action → installs via npm install -g chrome-ai-action@2.0.2 if missing
  3. Start the bridge: chrome-ai-action --port 9876, waits for health check
  4. Auto-launch Chrome: If Chrome not running with CDP, the bridge starts it automatically (cross-platform)
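
The health-check wait in steps 1 and 3 might look roughly like the sketch below. This is not the code in scripts/startup.js. `waitForBridge` and its options (`intervalMs`, `fetchImpl`) are hypothetical names, and the sketch assumes that a 2xx response from GET /health means the bridge is ready.

```javascript
// Poll GET /health until the bridge answers or the timeout elapses.
// `waitForBridge` and its options are hypothetical; the real logic lives in scripts/startup.js.
async function waitForBridge({
  port = 9876,
  timeoutMs = 30000,
  intervalMs = 500,
  fetchImpl = fetch, // Node.js 18+ global fetch; injectable for testing
} = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      const res = await fetchImpl(`http://127.0.0.1:${port}/health`);
      if (res.ok) return true; // assumption: any 2xx status means "ready"
    } catch {
      // Connection refused: the bridge is not listening yet.
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}

// Usage (not run here): const ready = await waitForBridge({ port: 9876 });
```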

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| CAA_BRIDGE_PORT | 9876 | Bridge HTTP server port |
| CAA_STARTUP_TIMEOUT | 30000 | Max wait for bridge ready (ms) |
| CHROME_PATH | auto-detect | Custom Chrome executable path |
| CHROME_USER_DATA_DIR | platform-dependent | Chrome profile directory |
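
A client that wants to honor the same variables could resolve them as in the sketch below. `loadBridgeConfig` and the shape of its return value are hypothetical. Only the variable names and defaults come from the table above.

```javascript
// Resolve bridge settings from the environment, falling back to the documented defaults.
// `loadBridgeConfig` is a hypothetical helper name.
function loadBridgeConfig(env = process.env) {
  const port = Number.parseInt(env.CAA_BRIDGE_PORT ?? "9876", 10);
  const startupTimeoutMs = Number.parseInt(env.CAA_STARTUP_TIMEOUT ?? "30000", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid CAA_BRIDGE_PORT: ${env.CAA_BRIDGE_PORT}`);
  }
  if (!Number.isInteger(startupTimeoutMs) || startupTimeoutMs <= 0) {
    throw new Error(`Invalid CAA_STARTUP_TIMEOUT: ${env.CAA_STARTUP_TIMEOUT}`);
  }
  return {
    port,
    startupTimeoutMs,
    chromePath: env.CHROME_PATH ?? null, // null: leave detection to the bridge
    userDataDir: env.CHROME_USER_DATA_DIR ?? null, // null: platform default
    baseUrl: `http://127.0.0.1:${port}/`,
  };
}

console.log(loadBridgeConfig({ CAA_BRIDGE_PORT: "9000" }).baseUrl);
```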

API Protocol

Endpoint: http://127.0.0.1:9876/

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Health check; returns bridge and CDP status |
| GET | /schema | Full action schema (64+ actions) |
| POST | / | Execute action(s) |

Request Format

{"type": "action", "action": "<ACTION>", "params": {...}, "requestId": "optional-id"}

Batch Request

{"type": "batch", "actions": [
  {"action": "navigate", "params": {"url": "https://example.com"}},
  {"action": "getTitle"}
]}

Response Format

{"success": true, "data": {...}, "requestId": "req-1", "timestamp": 1712345678901}

Error Response

{"success": false, "error": {"code": "ACTION_ERROR", "message": "..."}, "requestId": "req-1", "timestamp": 1712345678901}
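
A minimal client for the request and response envelopes above might look like the following sketch. The function names are hypothetical. It also assumes that the bridge accepts a `Content-Type: application/json` header and returns a JSON body for both success and error responses, neither of which is stated here.

```javascript
// Build a single-action request in the documented envelope.
// `params` and `requestId` are only included when provided.
function buildActionRequest(action, params, requestId) {
  const body = { type: "action", action };
  if (params !== undefined) body.params = params;
  if (requestId !== undefined) body.requestId = requestId;
  return body;
}

// Return `data` on success; otherwise throw an Error carrying the bridge error code.
function unwrapResponse(response) {
  if (response.success) return response.data;
  const { code = "UNKNOWN", message = "" } = response.error ?? {};
  const err = new Error(`${code}: ${message}`);
  err.code = code;
  throw err;
}

// POST a request to the bridge root endpoint (Node.js 18+ global fetch).
// Assumption: the bridge replies with a JSON body even when the action fails.
async function callBridge(request, baseUrl = "http://127.0.0.1:9876/") {
  const res = await fetch(baseUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(request),
  });
  return unwrapResponse(await res.json());
}

// Usage (not run here):
//   const title = await callBridge(buildActionRequest("getTitle", undefined, "req-1"));
```

Throwing on `success: false` keeps the calling code linear, and the `code` property lets a caller branch on the documented error codes.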

Available Actions (64+)

Navigation

navigate, goBack, goForward, reload, getUrl, getTitle

Page Content

getText, getHtml, getLinks, getImages, getHeadings, getMetaTags, getFormFields, getFocusableElements

Element Interaction

click, type, pressKey, scroll, scrollIntoView, findElement, focus, hover, select

Data Extraction

getValue, getAttribute, getAttributeAll, getBoundingBox, getCookies, getPerformanceMetrics, getSelectedValue, getSelectOptions

JavaScript

evaluate, injectScript, injectCSS

Screenshot & Export

screenshot (PNG/JPEG), getPdf (A4/Letter)

Tab Management

listTabs, newTab, closeTab, switchTab, getCurrentTab

Waiting

waitForElement, waitForTimeout, waitForNavigation

Cookie Management

setCookie, deleteCookie

Network Interception

blockUrls, unblockUrls, mockResponse, getNetworkRequests, clearNetworkRequests

Storage

getLocalStorage, setLocalStorage, removeLocalStorage, clearLocalStorage

File Operations

uploadFile, setInputFiles, downloadFile

Viewport

getViewport, setViewport

Console

getConsoleLogs, clearConsoleLogs

Accessibility

getAccessibilityTree

Utility

ping, connect, disconnect, getBrowserInfo, highlight, dispatchEvent


Typical Workflow

  1. Navigate: navigate → go to target URL (the bridge encodes Chinese and other non-ASCII characters in the URL)
  2. Wait: waitForElement → wait for key content
  3. Read: getText / getHtml / getLinks → understand page
  4. Interact: click / type / pressKey → perform actions
  5. Extract: getText / screenshot / evaluate → get results
  6. Confirm: screenshot → visually verify
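
Steps 1 to 3 of the workflow can be packed into one batch request, as in the sketch below. `buildReadPageBatch` is a hypothetical helper. The parameter names (`url`, `selector`, `timeout`) are taken from the examples in this document, and calling `getLinks` without parameters is an assumption. How the bridge orders or aborts batch items is not described here, so the sketch does not depend on it.

```javascript
// Assemble workflow steps 1-3 (navigate, wait, read) into one batch request.
// `buildReadPageBatch` is a hypothetical helper, not part of the bridge.
function buildReadPageBatch(url, readySelector, timeoutMs = 10000) {
  return {
    type: "batch",
    actions: [
      { action: "navigate", params: { url } },
      { action: "waitForElement", params: { selector: readySelector, timeout: timeoutMs } },
      { action: "getText" },
      { action: "getLinks" }, // assumption: accepts no params, like getText
    ],
  };
}

const batch = buildReadPageBatch("https://example.com", "h1");
console.log(JSON.stringify(batch, null, 2));
```

Waiting on a concrete selector is usually more robust than a fixed `waitForTimeout`, because the batch moves on as soon as the key content exists.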

Example: Search Baidu with Chinese

{"type": "batch", "actions": [
  {"action": "navigate", "params": {"url": "https://www.baidu.com/s?wd=%E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C"}},
  {"action": "waitForTimeout", "params": {"ms": 2000}},
  {"action": "getText"}
]}

Example: Full Login Flow

{"type": "batch", "actions": [
  {"action": "navigate", "params": {"url": "https://example.com/login"}},
  {"action": "waitForElement", "params": {"selector": "input[name=username]", "timeout": 10000}},
  {"action": "type", "params": {"selector": "input[name=username]", "value": "myuser"}},
  {"action": "type", "params": {"selector": "input[name=password]", "value": "mypassword"}},
  {"action": "click", "params": {"selector": "button[type=submit]"}},
  {"action": "waitForTimeout", "params": {"ms": 3000}},
  {"action": "getCurrentTab"}
]}

Error Handling

| Error Code | Meaning | Resolution |
| --- | --- | --- |
| CDP_NOT_CONNECTED | Chrome not running with debug port | Bridge auto-launches Chrome and retries every 3 s |
| ACTION_ERROR | Action execution failed | Check params; use getFocusableElements to find elements first |
| INVALID_REQUEST | Malformed request | Check request format |
| PARSE_ERROR | JSON parse failure | Send valid JSON |
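
One way a caller might act on these codes is sketched below. `planRecovery` and the shape of its result are hypothetical. The 3000 ms delay simply mirrors the bridge's stated 3 s retry interval, and the decision to retry only `CDP_NOT_CONNECTED` is a judgment call of this sketch rather than documented behavior.

```javascript
// Map a documented error code to a suggested next step.
// `planRecovery` and its return shape are hypothetical.
function planRecovery(code) {
  switch (code) {
    case "CDP_NOT_CONNECTED":
      // The bridge relaunches Chrome on its own; wait one retry interval and resend.
      return { retry: true, delayMs: 3000, hint: "Wait for the bridge to reconnect to Chrome." };
    case "ACTION_ERROR":
      return { retry: false, delayMs: 0, hint: "Check params; run getFocusableElements to locate the element." };
    case "INVALID_REQUEST":
      return { retry: false, delayMs: 0, hint: "Check the request format." };
    case "PARSE_ERROR":
      return { retry: false, delayMs: 0, hint: "Send valid JSON." };
    default:
      return { retry: false, delayMs: 0, hint: "Unrecognized error code; inspect the message." };
  }
}

console.log(planRecovery("CDP_NOT_CONNECTED"));
```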

Discovery Tips

When you don't know what elements are on a page:

  1. getFocusableElements → all interactive elements (with positions)
  2. getFormFields → all form inputs with metadata
  3. getLinks → all links on page
  4. getHeadings → understand page structure
  5. getText → all visible text
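
The five discovery actions could be sent together, as in this sketch. `buildDiscoveryBatch` is a hypothetical helper, and it assumes each of these actions can be called without parameters, which this document only shows for `getText` and `getTitle`.

```javascript
// Discovery actions in the order listed above.
const DISCOVERY_ACTIONS = [
  "getFocusableElements",
  "getFormFields",
  "getLinks",
  "getHeadings",
  "getText",
];

// Wrap a list of parameterless actions in a batch request.
// `buildDiscoveryBatch` is a hypothetical helper, not part of the bridge.
function buildDiscoveryBatch(actions = DISCOVERY_ACTIONS) {
  return { type: "batch", actions: actions.map((action) => ({ action })) };
}

console.log(JSON.stringify(buildDiscoveryBatch(["getFormFields", "getLinks"])));
```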

References

  • references/bridge-api.md — Complete API reference with all 64+ actions
  • references/setup-guide.md — Detailed setup and troubleshooting
  • scripts/startup.js — Startup automation script

Version History

2 versions in total

  • v1.0.1 Updated the dependency download to pin a specific version number (current)
    2026-05-11 00:32 Safe, Safe
  • v1.0.0 Initial release
    2026-05-10 23:21 Safe

Security Scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
