
ChatGPT Image Gen

Generate images using ChatGPT/DALL-E through OpenClaw browser automation. Use when the user wants to create images via ChatGPT's web interface with their own login.
sonim1
Content creation · clawhub · v1.0.0 · 1 version · Key: not required

Overview

ChatGPT Image Generation

Generate images using ChatGPT's DALL-E integration through OpenClaw browser automation.

Prerequisites

  1. Chrome Extension Installation:
    • Install OpenClaw Browser Relay from the Chrome Web Store
    • Or use the extension that ships with OpenClaw
  2. Initial Setup (one-time):
    • Open ChatGPT (chatgpt.com) in Chrome/Brave
    • Log in to your ChatGPT account (Pro subscription recommended for best quality)
    • Click the OpenClaw extension icon on the ChatGPT tab to attach it
    • The badge should show "ON" when attached

How It Works

This skill uses OpenClaw's built-in browser tool with Chrome extension relay (profile="chrome") to control an already-logged-in ChatGPT tab. This bypasses ChatGPT's bot detection because it uses your real browser session.

CLI Command Reference

IMPORTANT: There is NO browser act subcommand. Each action is a direct subcommand.

Action       CLI Syntax
-----------  ------------------------------------------------------
List tabs    openclaw browser tabs
Snapshot     openclaw browser snapshot --target-id <ID>
Click        openclaw browser click <ref> --target-id <ID>
Type         openclaw browser type <ref> "<text>" --target-id <ID>
Press key    openclaw browser press <key> --target-id <ID>
Navigate     openclaw browser navigate <url> --target-id <ID>
Screenshot   openclaw browser screenshot --target-id <ID>

  • <ref> and <text> are positional arguments (no --ref flag)
  • --target-id accepts a full ID or a unique prefix (e.g. 77CB instead of 77CB8A574E8A44861C5FE49388EF6ABC)
  • --profile is a parent option on openclaw browser, not on subcommands
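
To make the "unique prefix" rule concrete, here is a minimal sketch of a check you could run yourself before passing a short prefix to --target-id. The function name resolve_target_prefix is hypothetical and is not part of OpenClaw; it only compares strings and says nothing about how OpenClaw resolves prefixes internally.

```shell
# resolve_target_prefix PREFIX ID...
# Hypothetical helper (not an OpenClaw command).
# Prints the single ID that starts with PREFIX. Returns 1 when no ID
# matches or when more than one matches (the prefix is not unique).
resolve_target_prefix() {
  local prefix="$1"
  shift
  local match="" count=0 id
  for id in "$@"; do
    case "$id" in
      "$prefix"*)
        match="$id"
        count=$((count + 1))
        ;;
    esac
  done
  [ "$count" -eq 1 ] || return 1
  printf '%s\n' "$match"
}
```

Called with a prefix and the IDs you copied from the tab list, it prints the full ID only when the prefix is unambiguous, so a non-zero exit status is a hint to use a longer prefix.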

Workflow

1. List Attached Tabs

openclaw browser tabs

Look for a tab whose URL contains chatgpt.com and note its targetId.
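
When many tabs are attached, the listing can be narrowed with a plain text filter. The sketch below assumes that each tab's ID and URL appear on the same output line, which this document does not confirm; filter_chatgpt_tabs is a hypothetical helper name.

```shell
# filter_chatgpt_tabs: hypothetical helper (not an OpenClaw command).
# Reads a tab listing on stdin and prints only the lines that mention
# chatgpt.com (case-insensitive). Exits non-zero when nothing matches.
# Assumption: the tab ID and the URL share a line in the listing.
filter_chatgpt_tabs() {
  grep -i 'chatgpt\.com'
}

# Usage (requires an attached tab):
#   openclaw browser tabs | filter_chatgpt_tabs
```

If the real listing spreads a tab over several lines, grep's -B and -A context options may be needed to keep the ID visible.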

2. Get Snapshot (find element refs)

openclaw browser snapshot --target-id <ID> --format ai --efficient

This outputs a tree with refs like e23, e589, etc. Always run snapshot before interacting.
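
Finding the right ref in a large snapshot can be tedious, so a small filter may help. This sketch assumes that an element's label and its ref (a token such as e589) appear on the same line of the snapshot text; the exact snapshot layout is not described here, so treat refs_for_label as a hypothetical starting point and confirm the ref by eye.

```shell
# refs_for_label LABEL: hypothetical helper (not an OpenClaw command).
# Reads snapshot text on stdin and prints every ref-like token
# ("e" followed by digits, as a whole word) found on lines that
# contain LABEL (fixed string, case-insensitive).
# Assumption: label and ref sit on the same snapshot line.
refs_for_label() {
  grep -iF -- "$1" | grep -owE 'e[0-9]+'
}

# Usage (requires an attached tab):
#   openclaw browser snapshot --target-id <ID> --format ai --efficient \
#     | refs_for_label "Ask anything"
```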

3. Click an Element

openclaw browser click e23 --target-id <ID>

4. Type Text

openclaw browser type e589 "Generate an image: a futuristic city at sunset" --target-id <ID>

Add --submit to press Enter after typing:

openclaw browser type e589 "Generate an image: a cat riding a skateboard" --target-id <ID> --submit

5. Press a Key

openclaw browser press Enter --target-id <ID>

6. Wait for Generation

DALL-E generation takes 30-60 seconds, so pause with sleep before checking:

sleep 45

Then take a new snapshot to check the result:

openclaw browser snapshot --target-id <ID> --format ai --efficient
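
As an alternative to a single fixed sleep, the wait-then-snapshot step can be wrapped in a polling loop. The sketch below re-runs the snapshot command shown above until its output contains a marker string you choose, for example the label of the download button; which text actually appears when generation finishes is an assumption you should verify in a real snapshot. wait_for_snapshot_text is a hypothetical helper name.

```shell
# wait_for_snapshot_text ID PATTERN [TIMEOUT_S] [INTERVAL_S]
# Hypothetical helper (not an OpenClaw command).
# Takes a snapshot, and returns 0 as soon as the output contains
# PATTERN (fixed string, case-insensitive). Otherwise sleeps
# INTERVAL_S and retries, returning 1 once TIMEOUT_S seconds of
# sleeping have accumulated (snapshot time itself is not counted).
wait_for_snapshot_text() {
  local id="$1" pattern="$2" timeout="${3:-90}" interval="${4:-5}"
  local waited=0
  while :; do
    if openclaw browser snapshot --target-id "$id" --format ai --efficient |
        grep -qiF -- "$pattern"; then
      return 0
    fi
    [ "$waited" -ge "$timeout" ] && return 1
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Usage (requires an attached tab; "Download" is an assumed marker):
#   wait_for_snapshot_text <ID> "Download" 90 5 && echo "image ready"
```

Polling returns sooner when generation is fast and keeps waiting when it is slow, at the cost of taking more snapshots.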

Complete Example Session

# 1. List tabs, find the ChatGPT tab targetId
openclaw browser tabs

# 2. Take snapshot to find element refs
openclaw browser snapshot --target-id 4535E --format ai --efficient

# 3. Click input field (check ref from snapshot, usually labeled "Ask anything")
openclaw browser click e589 --target-id 4535E

# 4. Type prompt and submit
openclaw browser type e589 "Generate an image: a futuristic city at sunset" --target-id 4535E --submit

# 5. Wait for DALL-E generation
sleep 45

# 6. Take new snapshot to see result and find download button
openclaw browser snapshot --target-id 4535E --format ai --efficient

# 7. Click download button (ref from new snapshot)
openclaw browser click e745 --target-id 4535E
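
Steps 3 to 6 of the session above can be collected into one function once you know the tab ID and the input field's ref. This is a sketch under the assumption that the ref from your latest snapshot is still valid; submit_image_prompt is a hypothetical name, and it only chains the commands already shown in this document.

```shell
# submit_image_prompt ID REF PROMPT [WAIT_S]
# Hypothetical wrapper (not an OpenClaw command).
# Clicks the input field REF, types "Generate an image: PROMPT" with
# --submit, sleeps WAIT_S seconds (default 45), then prints a fresh
# snapshot so the result and the download button ref can be located.
submit_image_prompt() {
  local id="$1" ref="$2" prompt="$3" wait_s="${4:-45}"
  if [ -z "$id" ] || [ -z "$ref" ] || [ -z "$prompt" ]; then
    echo "usage: submit_image_prompt ID REF PROMPT [WAIT_S]" >&2
    return 2
  fi
  openclaw browser click "$ref" --target-id "$id" || return 1
  openclaw browser type "$ref" "Generate an image: $prompt" \
    --target-id "$id" --submit || return 1
  sleep "$wait_s"
  openclaw browser snapshot --target-id "$id" --format ai --efficient
}

# Usage (requires an attached tab; take the ref from your own snapshot):
#   submit_image_prompt 4535E e589 "a futuristic city at sunset"
```

The download click is left out on purpose, because its ref only exists in the snapshot taken after generation.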

Troubleshooting

"Can't reach the OpenClaw browser control service":

  • Gateway restart needed: openclaw gateway restart
  • Or restart via OpenClaw menu bar app

"Chrome extension relay is running, but no tab is connected":

  • ChatGPT tab is not attached
  • Go to the ChatGPT tab and click the OpenClaw extension icon

"ref is required" error:

  • You need to specify which element to interact with
  • Run snapshot first to get the refs

Command not found / Unknown command:

  • Do NOT use browser act — use direct subcommands: browser click, browser type, browser press
  • ref is a positional argument: browser click e23, NOT browser click --ref e23

Image generation timeout:

  • DALL-E generation takes 30-60 seconds
  • Use sleep 45 then re-snapshot to check

Bot detection / Login issues:

  • The tab must be already logged in via your real browser
  • Use the Chrome extension relay (attached tab), not the isolated browser

Tips

  1. Keep ChatGPT tab open: Once attached, keep the tab open for future use
  2. Check targetId: The targetId changes if you close/reopen the tab — always run tabs first
  3. Use --submit: The type command supports --submit to press Enter automatically
  4. Unique prefix: --target-id accepts a unique prefix, no need for the full 32-char ID
  5. Pro subscription: ChatGPT Pro gives better image quality and faster generation

Security Note

This approach uses your actual Chrome browser session, so it inherits all your ChatGPT permissions and settings. No credentials are stored or transmitted - everything happens in your existing browser session.

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-30 08:54 · Security: safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
