
Droidrun Agent

DroidRun Portal HTTP/WebSocket/MCP client. Controls Android devices via HTTP, WebSocket, or an MCP server, supporting tap, swipe, screenshot, text input, UI state, and other operations.
Author: hanxi
Source: clawhub · Category: uncategorized · Version: v0.1.0 (1 version) · Key: required

Overview

Droidrun Agent

Provides two async clients (PortalHTTPClient and PortalWSClient), a configuration helper (PortalConfig), and a built-in MCP server for communicating with Android devices running DroidRun Portal. All client methods are async and support async with context managers.

Installation

cd droidrun-agent && uv sync              # core only
cd droidrun-agent && uv sync --extra mcp  # with MCP server support

PortalHTTPClient

Communicates with Portal's HTTP server (default port 8080) using Bearer token authentication.

from droidrun_agent import PortalHTTPClient

async with PortalHTTPClient(base_url="http://192.168.1.100:8080", token="YOUR_TOKEN") as client:
    await client.ping()
    state = await client.get_state_full()
    await client.tap(200, 400)
    png = await client.take_screenshot()
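
To make the Bearer token scheme concrete, here is a minimal sketch of how a request carrying the token could be built with the standard library. It does not use or reproduce PortalHTTPClient internals: `build_request` is a hypothetical helper, and the `/state` path is an assumption for illustration only, since the Portal's endpoint paths are not documented here.

```python
import urllib.request


def build_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build a GET request that carries a Bearer token (nothing is sent here)."""
    # Join base URL and path without producing a double slash.
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# "/state" is a hypothetical path, not a documented Portal endpoint.
req = build_request("http://192.168.1.100:8080", "/state", "YOUR_TOKEN")
```

In practice PortalHTTPClient adds this header for you; the sketch only shows what "Bearer token authentication" means on the wire.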

Query Methods (GET)

| Signature | Return Type | Description |
|---|---|---|
| `ping()` | `dict` | Health check, no auth required |
| `get_a11y_tree()` | `dict` | Simplified accessibility tree |
| `get_a11y_tree_full(*, filter: bool = True)` | `dict` | Full accessibility tree; `filter=False` keeps small elements |
| `get_state()` | `dict` | Simplified UI state |
| `get_state_full(*, filter: bool = True)` | `dict` | Full UI state (a11y_tree + phone_state); `filter=False` keeps small elements |
| `get_phone_state()` | `dict` | Phone state info (current app, activity, keyboard status, etc.) |
| `get_version()` | `str` | Portal app version string |
| `get_packages()` | `list[dict]` | List of launchable apps, each containing `packageName`, `label`, etc. |
| `take_screenshot(*, hide_overlay: bool = True)` | `bytes` | Device screenshot as PNG bytes; `hide_overlay=False` shows the overlay |
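
Since `get_packages()` returns a list of dicts containing `packageName` and `label`, a common follow-up is resolving a human-readable label to a package name before calling `start_app`. The helper below is a sketch under that assumption; `find_package` is a hypothetical name, not part of the library, and the sample list stands in for a real device response.

```python
from typing import Optional


def find_package(packages: list[dict], label: str) -> Optional[str]:
    """Return the packageName whose label matches, ignoring case, or None."""
    wanted = label.casefold()
    for pkg in packages:
        if str(pkg.get("label", "")).casefold() == wanted:
            return pkg.get("packageName")
    return None


# Stand-in for the result of `await client.get_packages()`.
sample_packages = [
    {"packageName": "com.android.settings", "label": "Settings"},
    {"packageName": "com.android.chrome", "label": "Chrome"},
]
settings_pkg = find_package(sample_packages, "settings")
```

With a live client this would be used as `await client.start_app(find_package(await client.get_packages(), "Settings"))`, after checking for `None`.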

Action Methods (POST)

| Signature | Return Type | Description |
|---|---|---|
| `tap(x: int, y: int)` | `dict` | Tap screen coordinates |
| `swipe(start_x: int, start_y: int, end_x: int, end_y: int, duration: int \| None = None)` | `dict` | Swipe gesture; `duration` is an optional duration in milliseconds |
| `global_action(action: int)` | `dict` | Execute Android accessibility global action (1=Back, 2=Home, 3=Recents) |
| `start_app(package: str, activity: str \| None = None, stop_before_launch: bool = False)` | `dict` | Launch an app |
| `stop_app(package: str)` | `dict` | Best-effort stop of an app |
| `input_text(text: str, clear: bool = True)` | `dict` | Input text (auto base64-encoded); `clear=True` clears the field first |
| `clear_input()` | `dict` | Clear the focused input field |
| `press_key(key_code: int)` | `dict` | Send Android key code (e.g. 66=Enter, 3=Home, 4=Back) |
| `set_overlay_offset(offset: int)` | `dict` | Set overlay vertical offset in pixels |
| `set_socket_port(port: int)` | `dict` | Update the HTTP server port |
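
The integer codes accepted by `global_action` and `press_key` are easier to read as named constants. The sketch below defines two small enums using only the values listed in the table, plus an illustration of what "auto base64-encoded" amounts to for `input_text`. The names `GlobalAction`, `KeyCode`, and `encode_text` are hypothetical and not part of the library, and the use of UTF-8 before base64 encoding is an assumption.

```python
import base64
from enum import IntEnum


class GlobalAction(IntEnum):
    """Accessibility global actions, values as listed for global_action()."""
    BACK = 1
    HOME = 2
    RECENTS = 3


class KeyCode(IntEnum):
    """Android key codes, values as listed for press_key()."""
    HOME = 3
    BACK = 4
    ENTER = 66


def encode_text(text: str) -> str:
    """Base64-encode text (UTF-8 assumed), roughly what input_text does for you."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")
```

Because `IntEnum` members are integers, they can be passed directly, for example `await client.press_key(KeyCode.ENTER)` or `await client.global_action(GlobalAction.BACK)`.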

PortalWSClient

Communicates with Portal's WebSocket server (default port 8081) using JSON-RPC style messages. Automatically reconnects when a method is called on a broken connection.

from droidrun_agent import PortalWSClient

async with PortalWSClient("ws://192.168.1.100:8081", token="YOUR_TOKEN") as ws:
    await ws.tap(200, 400)
    state = await ws.get_state()
    png = await ws.take_screenshot()
    time_ms = await ws.get_time()
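
For readers unfamiliar with JSON-RPC style messaging, the following sketch shows the general pattern of tagging each call with an id and matching replies back to pending calls. The field names (`id`, `method`, `params`, `result`, `error`) follow common JSON-RPC conventions and are assumptions here; the Portal's actual wire format is not documented in this page, and `encode_call` and `match_reply` are hypothetical helpers.

```python
import itertools
import json

_ids = itertools.count(1)  # monotonically increasing call ids


def encode_call(method: str, **params) -> tuple[int, str]:
    """Serialize a call as a JSON-RPC style message and return (id, text)."""
    call_id = next(_ids)
    message = {"id": call_id, "method": method, "params": params}
    return call_id, json.dumps(message)


def match_reply(raw: str, pending: dict[int, str]) -> tuple[str, object]:
    """Pair a reply with its pending call and return (method, result)."""
    reply = json.loads(raw)
    method = pending.pop(reply["id"])  # KeyError if the id was never sent
    if "error" in reply:
        raise RuntimeError(f"{method} failed: {reply['error']}")
    return method, reply.get("result")
```

PortalWSClient hides this bookkeeping; the sketch is only meant to show why several calls can share a single connection.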

Methods

Supports every PortalHTTPClient action method (tap, swipe, global_action, start_app, stop_app, input_text, clear_input, press_key, set_overlay_offset, set_socket_port) as well as take_screenshot, all with identical signatures.

Query methods:

| Signature | Return Type | Description |
|---|---|---|
| `get_packages()` | `Any` | List of launchable packages |
| `get_state(*, filter: bool = True)` | `Any` | Full state; `filter=False` keeps small elements |
| `get_version()` | `Any` | Portal version string |
| `get_time()` | `Any` | Device Unix timestamp in milliseconds |
| `install(urls: list[str], hide_overlay: bool = True)` | `Any` | Install APK(s) from URL(s); supports split APKs (WebSocket only) |

WebSocket screenshots automatically parse binary frames and return PNG bytes directly.
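
Because both clients return screenshots as raw PNG bytes, a quick sanity check before writing them to disk can catch truncated or unexpected responses. The helper below is a sketch that relies only on the PNG file layout (8-byte signature followed by an IHDR chunk holding width and height); `png_size` is a hypothetical name, not a library function.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from a PNG header, or raise ValueError."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", width, height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

A typical use would be `width, height = png_size(await ws.take_screenshot())`, which also gives the screen bounds for choosing tap and swipe coordinates.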

Exceptions

All exceptions inherit from PortalError:

| Exception | Trigger |
|---|---|
| `PortalError` | Base exception |
| `PortalConnectionError` | Cannot connect to Portal server |
| `PortalAuthError` | Invalid or missing token (HTTP 401/403) |
| `PortalTimeoutError` | Request timed out |
| `PortalResponseError` | Server returned unexpected status or error |
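
The hierarchy above suggests a natural handling policy: retry transient failures (connection and timeout errors) and let authentication or response errors surface immediately. The sketch below illustrates that policy with stand-in exception classes that mirror the documented names so the example runs on its own; in real code you would import the library's exceptions instead (their exact import path is an assumption). `with_retry` is a hypothetical helper.

```python
import asyncio


# Stand-ins mirroring the documented names; not the library's own classes.
class PortalError(Exception): ...
class PortalConnectionError(PortalError): ...
class PortalAuthError(PortalError): ...
class PortalTimeoutError(PortalError): ...
class PortalResponseError(PortalError): ...


async def with_retry(make_call, attempts: int = 3, delay: float = 0.0):
    """Await make_call(), retrying only on connection and timeout errors."""
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except (PortalConnectionError, PortalTimeoutError):
            if attempt == attempts:
                raise  # out of attempts: surface the last transient error
            await asyncio.sleep(delay)
        # PortalAuthError and PortalResponseError propagate without retrying.
```

Usage would look like `await with_retry(lambda: client.tap(200, 400), delay=0.5)`; passing a callable rather than a coroutine lets each attempt create a fresh call.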

Full Usage Example

import asyncio
from droidrun_agent import PortalHTTPClient, PortalWSClient

async def demo_http():
    async with PortalHTTPClient("http://localhost:8080", token="YOUR_TOKEN") as client:
        print(await client.ping())
        print("Version:", await client.get_version())
        print("Packages:", len(await client.get_packages()))

        await client.tap(500, 800)
        await client.swipe(500, 1500, 500, 500, duration=300)
        await client.input_text("Hello World")
        await client.press_key(66)  # Enter

        state = await client.get_state_full()
        png = await client.take_screenshot()
        print(f"Screenshot: {len(png)} bytes")

async def demo_ws():
    async with PortalWSClient("ws://localhost:8081", token="YOUR_TOKEN") as ws:
        print("Version:", await ws.get_version())
        print("Time:", await ws.get_time())

        await ws.tap(500, 800)
        await ws.start_app("com.android.settings")

        png = await ws.take_screenshot()
        print(f"Screenshot: {len(png)} bytes")

asyncio.run(demo_http())
asyncio.run(demo_ws())

PortalConfig

Helper dataclass for managing connection settings. Supports direct construction or loading from environment variables.

from droidrun_agent import PortalConfig

# Direct construction
config = PortalConfig(base_url="http://192.168.1.100:8080", token="YOUR_TOKEN")
client = config.create_client()

# Load from environment variables
config = PortalConfig.from_env()
client = config.create_client()

| Field | Type | Default | Description |
|---|---|---|---|
| `base_url` | `str` | (required) | Portal HTTP or WebSocket base URL |
| `token` | `str` | (required) | Bearer authentication token |
| `timeout` | `float` | `10.0` | Request timeout in seconds |
| `transport` | `str` | `"http"` | `"http"` or `"ws"` |

Environment variables for from_env(): PORTAL_BASE_URL, PORTAL_TOKEN, PORTAL_TIMEOUT, PORTAL_TRANSPORT.
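
To illustrate how the four variables might map onto the fields and defaults in the table, here is a self-contained sketch of an environment loader. It is not the library's `from_env()` implementation: `SketchConfig` and `config_from_env` are hypothetical names, and the validation behavior (rejecting missing values and unknown transports) is an assumption.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class SketchConfig:
    """Stand-in mirroring the documented PortalConfig fields and defaults."""
    base_url: str
    token: str
    timeout: float = 10.0
    transport: str = "http"


def config_from_env(env: Mapping[str, str]) -> SketchConfig:
    """Build a SketchConfig from a mapping such as os.environ."""
    missing = [name for name in ("PORTAL_BASE_URL", "PORTAL_TOKEN") if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    transport = env.get("PORTAL_TRANSPORT", "http").lower()
    if transport not in ("http", "ws"):
        raise ValueError(f"unsupported transport: {transport!r}")
    return SketchConfig(
        base_url=env["PORTAL_BASE_URL"],
        token=env["PORTAL_TOKEN"],
        timeout=float(env.get("PORTAL_TIMEOUT", "10.0")),
        transport=transport,
    )
```

Taking a mapping instead of reading `os.environ` directly keeps the function easy to test; in an application you would call `config_from_env(os.environ)`.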

MCP Server

A built-in MCP (Model Context Protocol) server exposes all Portal operations as tools for AI agent integration. Requires the mcp optional dependency (pip install droidrun-agent[mcp]).

Starting the server

# Via CLI entry point
droidrun-agent --mcp

# Or as a Python module
python -m droidrun_agent --mcp

The server reads PORTAL_BASE_URL, PORTAL_TOKEN, PORTAL_TIMEOUT, and PORTAL_TRANSPORT from environment variables and communicates over stdio.

MCP Tools

| Tool | Description |
|---|---|
| `portal_ping` | Health check (HTTP only) |
| `portal_tap` | Tap screen coordinates |
| `portal_swipe` | Swipe gesture |
| `portal_screenshot` | Take screenshot, returns PNG image |
| `portal_get_state` | Get simplified UI state |
| `portal_get_state_full` | Get full UI state (a11y tree + phone state) |
| `portal_get_a11y_tree` | Get simplified accessibility tree (HTTP only) |
| `portal_get_a11y_tree_full` | Get full accessibility tree (HTTP only) |
| `portal_get_phone_state` | Get phone state info (HTTP only) |
| `portal_get_version` | Get Portal app version |
| `portal_get_packages` | List launchable packages |
| `portal_global_action` | Execute accessibility global action (1=Back, 2=Home, 3=Recents) |
| `portal_start_app` | Launch an app by package name |
| `portal_stop_app` | Stop an app |
| `portal_input_text` | Input text into focused field |
| `portal_clear_input` | Clear focused input field |
| `portal_press_key` | Send Android key code (66=Enter, 3=Home, 4=Back) |
| `portal_set_overlay_offset` | Set overlay vertical offset |
| `portal_get_time` | Get device timestamp (WebSocket only) |
| `portal_install` | Install APK(s) from URL(s) (WebSocket only) |
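
Several tools are tied to one transport, so an agent may want to check availability before calling a tool. The sketch below encodes only the "(HTTP only)" and "(WebSocket only)" annotations from the table; `is_available` is a hypothetical helper, and treating every unannotated tool as available on both transports is an assumption.

```python
# Transport restrictions copied from the annotations in the tool table.
HTTP_ONLY = {
    "portal_ping",
    "portal_get_a11y_tree",
    "portal_get_a11y_tree_full",
    "portal_get_phone_state",
}
WS_ONLY = {"portal_get_time", "portal_install"}


def is_available(tool: str, transport: str) -> bool:
    """Report whether a tool can be used with the configured PORTAL_TRANSPORT."""
    if transport not in ("http", "ws"):
        raise ValueError(f"unsupported transport: {transport!r}")
    if tool in HTTP_ONLY:
        return transport == "http"
    if tool in WS_ONLY:
        return transport == "ws"
    return True  # assumed: unannotated tools work on both transports
```

For example, an agent running with `PORTAL_TRANSPORT=ws` could skip `portal_ping` and use `portal_get_version` as its liveness check instead.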

openclaw integration

Register as an openclaw MCP skill:

{
  "mcpServers": {
    "droidrun-agent": {
      "command": "uvx",
      "args": ["--with", "mcp", "droidrun-agent", "--mcp"],
      "env": {
        "PORTAL_BASE_URL": "http://192.168.1.100:8080",
        "PORTAL_TOKEN": "YOUR_TOKEN"
      }
    }
  }
}
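
If you manage several devices, generating this entry programmatically avoids copy-paste mistakes. The sketch below reproduces the JSON structure shown above and optionally adds `PORTAL_TRANSPORT`, one of the environment variables the server reads; `mcp_entry` is a hypothetical helper, and the WebSocket URL in the example is illustrative.

```python
import json
from typing import Optional


def mcp_entry(base_url: str, token: str, transport: Optional[str] = None) -> dict:
    """Build the mcpServers configuration for one droidrun-agent server."""
    env = {"PORTAL_BASE_URL": base_url, "PORTAL_TOKEN": token}
    if transport is not None:
        env["PORTAL_TRANSPORT"] = transport  # "http" or "ws"
    return {
        "mcpServers": {
            "droidrun-agent": {
                "command": "uvx",
                "args": ["--with", "mcp", "droidrun-agent", "--mcp"],
                "env": env,
            }
        }
    }


# Example: a WebSocket-transport entry rendered as JSON text.
config_text = json.dumps(mcp_entry("ws://192.168.1.100:8081", "YOUR_TOKEN", "ws"), indent=2)
```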

Version history

1 version in total

  • v0.1.0 (current)
    2026-03-30 18:49, security status: safe

