
browser use skill

Control Chrome browsers from the terminal via the AIPex extension. Use this skill when the agent needs to manage browser tabs, search page elements, click bu...
buttercannfly
Uncategorized clawhub v1.0.0 1 version 100000 Key: not required

Overview

browser-cli — Terminal Browser Control

browser-cli is a command-line tool that controls Chrome browsers through the AIPex extension's WebSocket daemon. It translates shell commands into browser actions — managing tabs, clicking elements, filling forms, capturing screenshots, and more.

Architecture:

browser-cli ──WebSocket──▶ aipex-daemon ──WebSocket──▶ AIPex Chrome Extension ──▶ Browser APIs

The daemon auto-spawns on first use and self-terminates when idle. No manual setup is needed beyond the initial extension connection.


When to Use This Skill

Use this skill when the user wants to:

  • Control a Chrome browser from the terminal without an MCP client
  • Open, close, switch, or organize browser tabs via CLI
  • Search for page elements and interact with them (click, fill, hover)
  • Capture screenshots of browser tabs
  • Automate browser workflows in shell scripts or CI pipelines
  • Download page content as markdown or images
  • Request human input during automated browser tasks
  • Manage AIPex skills from the command line

Trigger phrases: "browser-cli", "control browser from terminal", "browser automation CLI", "click element from shell", "terminal browser control", "command line browser", "shell browser automation"


Prerequisites

  • Node.js >= 18 installed
  • AIPex Chrome extension installed and connected to the daemon
  • browser-cli installed globally: npm install -g browser-cli
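The prerequisites above can be checked before running any automation. The following is a minimal sketch, not part of browser-cli itself: the function names `node_major`, `node_version_ok`, and `preflight` are hypothetical helpers introduced here for illustration. It only relies on `node --version` printing a string such as `v18.19.0` and on `command -v` to look up executables on the PATH.

```shell
# Extract the major version from a string like "v18.19.0".
node_major() {
  printf '%s\n' "$1" | sed 's/^v//; s/\..*$//'
}

# Succeed when the given Node.js version string is at least 18.
# Runs in a subshell so the helper variable does not leak.
node_version_ok() (
  major=$(node_major "$1")
  [ "$major" -ge 18 ] 2>/dev/null
)

# Print one line per missing prerequisite; print nothing when all are met.
# This does not check the extension connection (see: browser-cli status).
preflight() {
  if ! command -v node >/dev/null 2>&1; then
    echo "node not found on PATH"
  elif ! node_version_ok "$(node --version)"; then
    echo "Node.js >= 18 required, found $(node --version)"
  fi
  if ! command -v browser-cli >/dev/null 2>&1; then
    echo "browser-cli not found; install with: npm install -g browser-cli"
  fi
}
```

Calling `preflight` at the top of a script and stopping when it prints anything gives an early, readable failure instead of a later connection timeout.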

First-time Setup

After installing, connect the AIPex extension to the daemon:

  1. Open Chrome → AIPex extension icon → Options
  2. Set WebSocket URL to ws://localhost:9223/extension
  3. Click Connect
  4. Verify: browser-cli status

Command Groups

tab — Manage browser tabs

browser-cli tab list                         # List all open tabs
browser-cli tab current                      # Get the active tab
browser-cli tab new https://example.com      # Open a new tab
browser-cli tab switch 42                    # Switch to tab by ID
browser-cli tab close 42                     # Close a tab
browser-cli tab info 42                      # Get tab details
browser-cli tab organize                     # AI-powered tab grouping
browser-cli tab ungroup                      # Remove all tab groups

page — Inspect and interact with page content

browser-cli page search "button*" --tab 123              # Search elements by glob pattern
browser-cli page search "{input,textarea}*" --tab 123    # Search multiple element types
browser-cli page screenshot                               # Screenshot active tab
browser-cli page screenshot-tab 123 --send-to-llm true   # Screenshot with LLM analysis
browser-cli page metadata --tab 123                       # Get page metadata
browser-cli page scroll-to "#main-content"                # Scroll to element
browser-cli page highlight "button.submit"                # Highlight element
browser-cli page highlight-text "p" "important"           # Highlight text in content

interact — Click, fill, hover, and type

browser-cli interact click btn-42 --tab 123                         # Click by UID
browser-cli interact fill input-5 "hello world" --tab 123           # Fill input by UID
browser-cli interact hover menu-3 --tab 123                         # Hover by UID
browser-cli interact form --tab 123 --elements '[{"uid":"in-1","value":"foo"}]'  # Batch fill
browser-cli interact editor editor-1 --tab 123                     # Get editor content
browser-cli interact upload --tab 123 --file-path /path/to/file    # Upload file
browser-cli interact computer --action left_click --coordinate "[500,300]"  # Pixel-level click
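The `--elements` argument of `interact form` takes a JSON array, which is easy to break when values contain quotes. Below is a small sketch that builds that array from `uid=value` pairs; `json_escape` and `elements_json` are hypothetical helper names, and the sketch assumes the array only needs the `uid` and `value` keys shown in the batch fill example above. It escapes backslashes and double quotes only, so values containing newlines or other control characters would need a real JSON tool.

```shell
# Escape backslashes and double quotes so a value is safe inside a JSON string.
# Control characters (newlines, tabs) are NOT handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build a JSON array of {"uid": ..., "value": ...} objects from uid=value arguments.
# Runs in a subshell so the loop variables do not leak into the caller.
elements_json() (
  out=""
  for pair in "$@"; do
    uid=${pair%%=*}     # text before the first "="
    value=${pair#*=}    # text after the first "="
    item=$(printf '{"uid":"%s","value":"%s"}' \
      "$(json_escape "$uid")" "$(json_escape "$value")")
    out="${out}${out:+,}${item}"
  done
  printf '[%s]\n' "$out"
)
```

A possible call, using the placeholder UIDs from the examples above: `browser-cli interact form --tab 123 --elements "$(elements_json in-1=foo in-2=bar)"`.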

download — Save content locally

browser-cli download markdown --text "# Notes" --filename notes    # Save as markdown
browser-cli download image --data "data:image/png;base64,..."      # Save image
browser-cli download chat-images --messages '[...]' --folder imgs   # Batch save images

intervention — Request human input

browser-cli intervention list                                       # List intervention types
browser-cli intervention info voice-input                           # Get type details
browser-cli intervention request voice-input --reason "Need input"  # Request intervention
browser-cli intervention cancel                                     # Cancel active request

skill — Manage AIPex skills

browser-cli skill list                                    # List all skills
browser-cli skill load my-skill                           # Load skill content
browser-cli skill info my-skill                           # Skill details
browser-cli skill run my-skill scripts/init.js            # Execute skill script
browser-cli skill ref my-skill references/guide.md        # Read skill reference
browser-cli skill asset my-skill assets/icon.png          # Get skill asset

Standalone commands

browser-cli status    # Check daemon + extension connection
browser-cli update    # Self-update to latest version

Workflow: Search, Interact, Verify

The recommended pattern for browser automation:

# 1. Discover tabs
browser-cli tab list

# 2. Search for elements (fast, no screenshot needed)
browser-cli page search "{button,input,link}*" --tab 123

# 3. Interact using UIDs from search results
browser-cli interact click btn-submit --tab 123
# or
browser-cli interact fill input-email "user@example.com" --tab 123

# 4. Verify visually (only when needed)
browser-cli page screenshot

Login form example

browser-cli page search "{input,textbox}*" --tab 123
browser-cli interact fill input-email "user@example.com" --tab 123
browser-cli interact fill input-pass "secret" --tab 123
browser-cli page search "*[Ll]ogin*" --tab 123
browser-cli interact click btn-login --tab 123

Shell script example

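#!/bin/bash
# Open a page, wait for it to load, then list its links and take a screenshot.
# Requires jq for parsing the JSON output.
browser-cli tab new https://example.com
sleep 2   # crude fixed wait for the page to load
# -r prints the raw value, so the ID carries no JSON quotes if it is a string
TAB_ID=$(browser-cli tab current | jq -r '.data.id')
browser-cli page search "link*" --tab "$TAB_ID"
browser-cli page screenshot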
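The script above waits with a fixed `sleep 2`, which can be too short on slow pages and wasteful on fast ones. One common alternative is to retry a command until it succeeds. The sketch below is a generic helper (the name `retry` is hypothetical), and it is only useful here under the assumption that browser-cli exits with a non-zero status on failure, which this document does not state.

```shell
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For example, `retry 10 1 browser-cli status` would poll once per second for up to ten attempts before giving up, again assuming the exit status reflects the connection state.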

Global Options

| Option | Default | Description |
|--------|---------|-------------|
| --port | 9223 | Daemon WebSocket port |
| --host | 127.0.0.1 | Daemon host address |

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| BROWSER_CLI_WS_URL | ws://127.0.0.1:9223/cli | Override daemon WebSocket URL |
| BROWSER_CLI_CONNECT_TIMEOUT | 60000 | Connection timeout (ms) |
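When the daemon runs on a non-default port, two URLs have to stay in sync: the one the CLI uses and the one entered in the extension options. The sketch below composes both from a port number. The helper names `daemon_ws_url` and `extension_ws_url` are hypothetical, and the URL shapes are an assumption extrapolated from the two defaults given in this document (`ws://127.0.0.1:9223/cli` and `ws://localhost:9223/extension`).

```shell
# CLI-side URL, following the shape of the default ws://127.0.0.1:9223/cli.
# Usage: daemon_ws_url [host] [port]
daemon_ws_url() {
  printf 'ws://%s:%s/cli\n' "${1:-127.0.0.1}" "${2:-9223}"
}

# Extension-side URL, following the shape of ws://localhost:9223/extension.
# Usage: extension_ws_url [port]
extension_ws_url() {
  printf 'ws://localhost:%s/extension\n' "${1:-9223}"
}
```

For instance, `BROWSER_CLI_WS_URL=$(daemon_ws_url 127.0.0.1 9224) browser-cli status` would point a single invocation at port 9224, while `extension_ws_url 9224` prints the value to paste into the AIPex options page.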

Troubleshooting

| Symptom | Fix |
|---------|-----|
| Daemon not running | Run any command to auto-spawn, or check with browser-cli status |
| Extension is not connected | Open AIPex Options → WebSocket URL ws://localhost:9223/extension → Connect |
| Port 9223 in use | Use --port 9224 and update extension URL |
| Timeout after 60s | Verify extension is connected. Increase with BROWSER_CLI_CONNECT_TIMEOUT=120000 |
| 0 search results | Try different patterns. Fall back to page screenshot --send-to-llm true + interact computer |
| Wrong results | Verify tab ID with browser-cli tab list |

Version history

1 version in total

  • v1.0.0 (current)
    2026-05-07 09:58 Safe

Security checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
