This skill enables browser automation through Playwright and an E2B sandbox, allowing programmatic control of web browsers for web scraping, testing, and interaction tasks.
It provides automated browser interaction in a cloud-based sandbox environment, including navigation, element manipulation, form filling, scrolling, and screenshot capture.
The skill includes:
SandboxBrowserSkills with all interaction methods

First, import and start the browser sandbox:
from skills.browser_skills import SandboxBrowserSkills
# Create instance and start the browser
browser = SandboxBrowserSkills()
await browser.start(timeout=600) # timeout in seconds
# The VNC URL will be printed for visual monitoring
Navigate to a URL:
result = await browser.navigate("https://example.com")
print(result) # "已导航到: https://example.com" (i.e. "Navigated to: https://example.com")
Highlight and identify interactive elements:
elements = await browser.highlight_elements()
# Returns a list of highlighted elements with IDs for interaction
# Elements are color-coded by type (links, buttons, inputs)
Click an element by ID (from highlight):
result = await browser.click_element(5)
print(result) # "已点击元素 [5]" (i.e. "Clicked element [5]")
Click an element by text content:
result = await browser.click_text("Submit")
print(result) # "已点击: Submit" (i.e. "Clicked: Submit")
Fill an input field:
# Fill specific input by selector
result = await browser.fill_input("hello world", selector="input[type='text']")
# Or fill the currently focused input
result = await browser.fill_input("search term")
Scroll the page:
await browser.scroll_down(500) # Scroll down 500 pixels
await browser.scroll_up(300) # Scroll up 300 pixels
Take a screenshot:
result = await browser.screenshot("homepage.png")
print(result) # "已截图: homepage.png" (i.e. "Screenshot taken: homepage.png")
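Scrolling and screenshots can be combined to capture a long page in pieces. The following is a minimal sketch rather than part of the skill: `capture_page_in_steps` and its parameters are hypothetical, and it assumes only that the object passed in exposes the `scroll_down` and `screenshot` coroutines shown above.

```python
async def capture_page_in_steps(browser, steps=3, pixels=500, prefix="page"):
    """Take a screenshot, scroll down, and repeat.

    Hypothetical helper: `browser` is assumed to provide the async
    screenshot(filename) and scroll_down(pixels) methods from this document.
    Returns the list of filenames passed to screenshot().
    """
    filenames = []
    for index in range(steps + 1):
        filename = f"{prefix}_{index}.png"
        await browser.screenshot(filename)
        filenames.append(filename)
        # Scroll between captures, but not after the final one.
        if index < steps:
            await browser.scroll_down(pixels)
    return filenames
```

With `steps=3` this produces four files, `page_0.png` through `page_3.png`. Whether consecutive captures overlap depends on the viewport height of the sandbox browser, which this document does not specify.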
Get page content:
# Get current URL
url = await browser.get_current_url()
# Get page text content (limited length)
text = await browser.get_page_text(max_length=2000)
# Get specific element text
element_text = await browser.get_element_text("h1")
Execute JavaScript:
result = await browser.execute_script("document.title")
print(result) # Returns the page title
Press keyboard keys:
result = await browser.press_key("Enter")
result = await browser.press_key("Escape")
Wait for navigation:
result = await browser.wait_for_navigation(timeout=30000)
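The individual calls above can be chained into a simple search flow. This is an illustrative sketch under stated assumptions: `search_and_read` is a hypothetical helper, the default selector is the one from the fill example, and whether pressing Enter actually submits the form depends on the target page.

```python
async def search_and_read(browser, url, query,
                          selector="input[type='text']", max_length=2000):
    """Open a page, submit a query, and return the resulting URL and text.

    Hypothetical helper: `browser` is assumed to provide the async methods
    documented in this skill (navigate, fill_input, press_key,
    wait_for_navigation, get_current_url, get_page_text).
    """
    await browser.navigate(url)
    await browser.fill_input(query, selector=selector)
    # Assumption: the page submits the form when Enter is pressed.
    await browser.press_key("Enter")
    await browser.wait_for_navigation(timeout=30000)
    return {
        "url": await browser.get_current_url(),
        "text": await browser.get_page_text(max_length=max_length),
    }
```

Passing the browser in as an argument keeps the helper independent of how the sandbox was started, so the same function can be exercised against a stand-in object during testing.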
Always stop the browser when done to free resources:
await browser.stop()
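One way to make sure `stop()` runs even when an operation raises is to wrap the session in an async context manager. The sketch below is not part of the skill: `browser_session` is a hypothetical name, and it assumes only the `start(timeout=...)` and `stop()` coroutines described in this document.

```python
from contextlib import asynccontextmanager


@asynccontextmanager
async def browser_session(browser, timeout=600):
    """Start the browser, yield it, and always stop it afterwards.

    Hypothetical helper: `browser` is assumed to provide async
    start(timeout=...) and stop() methods as documented above.
    """
    await browser.start(timeout=timeout)
    try:
        yield browser
    finally:
        # Runs on normal exit and on exceptions, so the sandbox is released.
        await browser.stop()
```

With the real class this would be used as `async with browser_session(SandboxBrowserSkills()) as browser:` followed by the usual calls. If `start()` itself fails, `stop()` is not called in this sketch.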
The browser behavior is controlled by environment variables in config.py:
E2B_DOMAIN: E2B service domain (default: ap-guangzhou.tencentags.com)
E2B_API_KEY: E2B API key for authentication
E2B_AGS_TEMPLATE: Sandbox template name (default: williamji-longtime)

Set these as environment variables before starting the browser, or modify directly in config.py.
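As an illustration of how these variables and their defaults might be resolved, here is a minimal sketch. It is not the skill's actual config.py, whose contents are not shown here: `load_e2b_settings` and `DEFAULTS` are hypothetical names, and treating a missing `E2B_API_KEY` as an error is an assumption.

```python
import os

# Defaults taken from the variable list above.
DEFAULTS = {
    "E2B_DOMAIN": "ap-guangzhou.tencentags.com",
    "E2B_AGS_TEMPLATE": "williamji-longtime",
}


def load_e2b_settings(env=None):
    """Resolve E2B settings from a mapping (os.environ by default).

    Hypothetical helper; the real config.py may behave differently.
    """
    env = os.environ if env is None else env
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    api_key = env.get("E2B_API_KEY")
    if not api_key:
        # Assumption: the API key has no usable default and must be provided.
        raise RuntimeError("E2B_API_KEY is not set")
    settings["E2B_API_KEY"] = api_key
    return settings
```

Accepting the environment as a parameter makes the lookup easy to check with a plain dictionary instead of mutating the process environment.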
start(timeout: int = 600) - Initialize sandbox and connect browser
stop() - Close browser and kill sandbox
navigate(url: str) - Navigate to specified URL
get_current_url() - Get current page URL
wait_for_navigation(timeout: int = 30000) - Wait for page load
highlight_elements() - Highlight all interactive elements with IDs
click_element(element_id: int) - Click element by highlight ID
click_text(text: str) - Click element containing text
fill_input(text: str, selector: str = None) - Fill text in input field
press_key(key: str) - Press keyboard key
get_page_text(max_length: int = 2000) - Extract page text
get_element_text(selector: str) - Get text of specific element
execute_script(script: str) - Execute JavaScript and return result
scroll_down(pixels: int = 500) - Scroll page down
scroll_up(pixels: int = 500) - Scroll page up
screenshot(filename: str) - Capture screenshot and save to file

Call start() before any browser operations.
Call stop() when finished to release sandbox resources.
Call highlight_elements() first to understand page structure before interaction.

Common issues and solutions:
Use highlight_elements() to verify that the element exists.