PinchTab is an HTTP server that provides programmatic control over a browser. It supports launching browser instances, navigating to pages, extracting page structure, and interacting with elements such as buttons and forms.
Use this skill for tasks like:
Below is a guide to using the PinchTab skill:
You can launch a new browser instance via the API:
bash scripts/launch_browser.sh
Navigate to a URL with the following command:
bash scripts/navigate_to_url.sh https://example.com
Get the page structure and save it locally:
bash scripts/get_page_snapshot.sh
Simulate a button click on a webpage:
bash scripts/click_element.sh "<css_selector>"
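The four scripts above can also be chained into a single run. The following is a minimal sketch, assuming the scripts are invoked from the skill's root directory and that each one signals failure through a non-zero exit code. `workflow_commands` and `run_workflow` are hypothetical helper names used for illustration, not part of PinchTab.

```python
import subprocess


def workflow_commands(url, selector):
    """Return the argv lists for launch -> navigate -> snapshot -> click."""
    return [
        ["bash", "scripts/launch_browser.sh"],
        ["bash", "scripts/navigate_to_url.sh", url],
        ["bash", "scripts/get_page_snapshot.sh"],
        # The selector is passed as its own argv item, so no shell quoting is needed.
        ["bash", "scripts/click_element.sh", selector],
    ]


def run_workflow(url, selector):
    """Run each step in order and stop at the first failing script."""
    for argv in workflow_commands(url, selector):
        # check=True raises CalledProcessError on a non-zero exit status
        # (assumption: the scripts report failure through their exit code).
        subprocess.run(argv, check=True)
```

For example, `run_workflow("https://example.com", "button[type=submit]")` would launch a browser, open the page, save a snapshot, and then click the matching element, provided the scripts behave as assumed.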
Capture a screenshot, decode the base64 data, and send the image to Telegram:
Bash:
export PINCHTAB_TOKEN="your_token"
export TELEGRAM_BOT_TOKEN="your_bot_token"
bash scripts/screenshot_and_send.sh <tab_id> <telegram_chat_id>
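Both environment variables must be set before the script runs. A small pre-flight check can catch a missing token early. This is only a sketch: `missing_env` and `REQUIRED_VARS` are hypothetical names, and whether the script itself validates these variables is not stated here.

```python
import os

# Variable names taken from the export lines above.
REQUIRED_VARS = ("PINCHTAB_TOKEN", "TELEGRAM_BOT_TOKEN")


def missing_env(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```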
Python (more features):
export PINCHTAB_TOKEN="your_token"
python3 scripts/decode_screenshot.py <tab_id> \
  --output /path/to/screenshot.jpg \
  --send-telegram <chat_id> \
  --caption "My screenshot"
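The decoding step itself is small. The sketch below shows one way to turn a base64 screenshot payload into image bytes and write them to disk. It assumes the screenshot response is a JSON object whose image data sits under a `base64` key; that field name, and the helper names `decode_screenshot` and `save_screenshot`, are assumptions for illustration and may not match what `scripts/decode_screenshot.py` actually does.

```python
import base64
import binascii
from pathlib import Path


def decode_screenshot(payload, field="base64"):
    """Decode the base64 image data in a screenshot response to raw bytes."""
    data = payload[field]  # assumed field name; adjust to the real response
    # Tolerate a data URL by dropping any "data:image/...;base64," prefix.
    if data.startswith("data:") and "," in data:
        data = data.split(",", 1)[1]
    try:
        return base64.b64decode(data, validate=True)
    except binascii.Error as exc:
        raise ValueError("screenshot payload is not valid base64") from exc


def save_screenshot(payload, output_path, field="base64"):
    """Write the decoded image to output_path and return the path."""
    path = Path(output_path)
    path.write_bytes(decode_screenshot(payload, field))
    return path
```

Using `validate=True` makes malformed input fail loudly instead of silently producing a corrupt image file.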
PinchTab successfully navigated to Google and extracted the page structure:
This demonstrates:
Check the references/ folder for detailed API documentation, common workflows, and troubleshooting tips.