Scrask is a screenshot-to-intent parser. The user sends a screenshot via whatever chat surface
they have wired into OpenClaw (Telegram, iMessage, Slack, etc.). Scrask:
- destination: "calendar" → calctl / accli / apple-calendar / brainz-calendar / gcal-pro / etc.
- destination: "task" → apple-reminders / things-mac / notion / etc.

Scrask never writes to a store directly. No service account JSON, no OAuth, no API keys for the
calendar/task layer: that is the destination skill's job.
Scrask is invoked in two ways. The platform tries explicit invocation first; if no alias matches, it falls back to the implicit trigger conditions.
If the user message begins with any of these aliases (case-insensitive, with or without an @ or / prefix), the platform dispatches to Scrask regardless of the implicit conditions below:
- scrask
- scrask this
- screenshot
- screenshot to calendar

Examples that force-route to Scrask:

- scrask this (with an attached image)
- @scrask (with an attached image)
- /scrask (with an attached image)
- screenshot to calendar (with an attached image)

When invoked explicitly with no image attached, Scrask responds with a brief prompt asking the user to attach a screenshot, then stops. Do not run the parser without an image.
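The explicit-invocation rule above can be sketched in a few lines of Python. This is an illustration only, not the platform's actual dispatcher: the function names `match_explicit_alias` and `route_explicit` and the three return labels are hypothetical, and the sketch assumes an alias must be followed by whitespace or the end of the message.

```python
import re
from typing import Optional

# Aliases listed in this section, longest first so "scrask this" wins over "scrask".
ALIASES = ("screenshot to calendar", "scrask this", "screenshot", "scrask")

# Optional "@" or "/" prefix, then an alias, then whitespace or end of message.
_ALIAS_RE = re.compile(
    r"^\s*[@/]?(" + "|".join(re.escape(a) for a in ALIASES) + r")(?=\s|$)",
    re.IGNORECASE,
)


def match_explicit_alias(message: str) -> Optional[str]:
    """Return the matched alias in lowercase, or None if the message does not start with one."""
    m = _ALIAS_RE.match(message)
    return m.group(1).lower() if m else None


def route_explicit(message: str, has_image: bool) -> str:
    """Hypothetical dispatcher result: 'run', 'ask_for_image', or 'implicit'."""
    if match_explicit_alias(message) is None:
        return "implicit"  # fall back to the implicit trigger conditions
    # An explicit alias without an image only prompts the user; the parser never runs.
    return "run" if has_image else "ask_for_image"
```

Ordering the aliases longest-first keeps the reported match specific; the lookahead stops words such as "screenshots" from triggering the skill by accident.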
The OpenClaw agent reads the incoming message and activates Scrask when:
Do not activate (implicitly) for:
The implicit path is the one users will hit by default. The explicit aliases exist for two cases:
scrask this instead of resending.

Reply on the user's current chat surface so they know the skill is working:
> "📸 Got it — analyzing your screenshot..."
```shell
python3 {baseDir}/scripts/scrask_bot.py \
  --image-path "<path-to-temp-image>" \
  --provider "$CONFIG_VISION_PROVIDER" \
  --timezone "$CONFIG_TIMEZONE" \
  --confidence-threshold "$CONFIG_CONFIDENCE_THRESHOLD" \
  --actionable-threshold "$CONFIG_ACTIONABLE_THRESHOLD" \
  --type-threshold "$CONFIG_TYPE_THRESHOLD" \
  --field-threshold "$CONFIG_FIELD_THRESHOLD"
```
The script reads credentials from the environment — never pass them on the command line.
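For callers that launch the parser from Python rather than a shell, the same invocation might look like the sketch below. The helper names `build_scrask_command` and `run_scrask` are hypothetical, the config keys mirror the skill's config block, and the sketch assumes the script prints its JSON result to stdout.

```python
import json
import os
import subprocess
from typing import Any, Dict, List


def build_scrask_command(base_dir: str, image_path: str, config: Dict[str, Any]) -> List[str]:
    """Build the argv list for scrask_bot.py. No credentials appear in it."""
    return [
        "python3", os.path.join(base_dir, "scripts", "scrask_bot.py"),
        "--image-path", image_path,
        "--provider", str(config["vision_provider"]),
        "--timezone", str(config["timezone"]),
        "--confidence-threshold", str(config["confidence_threshold"]),
        "--actionable-threshold", str(config["actionable_threshold"]),
        "--type-threshold", str(config["type_threshold"]),
        "--field-threshold", str(config["field_threshold"]),
    ]


def run_scrask(base_dir: str, image_path: str, config: Dict[str, Any]) -> Dict[str, Any]:
    """Run the parser and decode its output (assumed to be JSON on stdout)."""
    proc = subprocess.run(
        build_scrask_command(base_dir, image_path, config),
        capture_output=True,
        text=True,
        check=True,
        env=os.environ.copy(),  # API keys travel via the environment, never via argv
    )
    return json.loads(proc.stdout)
```

Passing an argv list instead of a shell string also avoids quoting problems with image paths that contain spaces.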
In default auto mode it routes by what is available:
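- GEMINI_API_KEY set → Gemini-first with Claude fallback (cheap + fast path). The fallback needs ANTHROPIC_API_KEY as well; without it, only Gemini runs.
- ANTHROPIC_API_KEY set (no Gemini key) → Claude only.
- Neither key set → OpenClaw's configured vision LLM, via the platform-injected OPENCLAW_VISION_PROVIDER, OPENCLAW_VISION_KEY, and optional OPENCLAW_VISION_MODEL.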
So the skill works out of the box for any OpenClaw user with a vision-capable LLM
configured at the platform level. Bringing your own Gemini key only adds the cost-and-speed
optimisation on top.
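As a rough illustration of the auto-mode routing described above, the following sketch maps the available environment variables to an ordered list of providers to try. It is not the script's actual code: the function name `pick_vision_route` and the provider labels are assumptions made for this example.

```python
from typing import List, Mapping


def pick_vision_route(env: Mapping[str, str]) -> List[str]:
    """Return providers in the order auto mode would try them (hypothetical labels)."""
    has_gemini = bool(env.get("GEMINI_API_KEY"))
    has_claude = bool(env.get("ANTHROPIC_API_KEY"))

    if has_gemini:
        # Gemini first; Claude is only a fallback when its key is also present.
        return ["gemini", "claude"] if has_claude else ["gemini"]
    if has_claude:
        return ["claude"]
    # Neither key: use the vision LLM configured at the platform level.
    if env.get("OPENCLAW_VISION_PROVIDER") and env.get("OPENCLAW_VISION_KEY"):
        return ["openclaw:" + env["OPENCLAW_VISION_PROVIDER"]]
    return []  # nothing usable is configured
```

In practice the function would be called with `os.environ`; taking a plain mapping keeps the routing easy to test without touching real credentials.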
The script returns JSON with:
- success: whether parsing worked
- no_actionable_content: true if nothing actionable was found
- actionable_confidence: 0.0–1.0, how sure the parser is that the screenshot is actionable
- needs_actionable_confirmation: true if actionable_confidence is in the maybe band; the bot should confirm "is this actually an event or task?" before dispatching
- items[]: one entry per detected item, with:
  - type, destination, confidence (legacy aggregate), type_confidence
  - confidences{}: per-field 0.0–1.0 scores (title, date, time, location, participants, description, priority, …)
  - needs_confirmation: true when there is at least one outstanding clarification
  - clarifications[]: targeted questions to ask the user, e.g. { "field": "time", "question": "What time is dinner with Priya?", "reason": "low_confidence" }
  - the extracted fields themselves (title, date, time, location, participants, etc.)
- summary_text: chat-ready preview of what was found; send this verbatim, do not rephrase
- screenshot_summary, parse_notes: context

If no_actionable_content is true:
Silently ignore the screenshot or, if the user clearly meant for Scrask to act on it,
reply with the summary_text field (which is a polite "couldn't find anything" message).
If success is true:
Send the summary_text value back to the user on the same chat surface. Then process each item.
For every item in items[]:
If needs_actionable_confirmation: true (top level):
Send summary_text (which already opens with "Is this actually an event or task?") and wait for
the user. On "yes", proceed item-by-item below. On "no", reply "Got it, skipped ✓" and stop.
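The top-level branching described so far can be summarised as a small decision function. This is a sketch under assumptions: the name `next_step` and its return labels are hypothetical, and the `report_failure` branch covers a case (success is false) that this section does not spell out.

```python
from typing import Any, Dict


def next_step(result: Dict[str, Any]) -> str:
    """Map the parser's top-level flags to the agent's next action (hypothetical labels)."""
    if result.get("no_actionable_content"):
        # Stay silent, or send summary_text if the user clearly asked for action.
        return "ignore_or_send_summary"
    if not result.get("success"):
        # Assumption: parsing failures are reported rather than dispatched.
        return "report_failure"
    if result.get("needs_actionable_confirmation"):
        # Send summary_text and wait for a yes/no before touching any item.
        return "confirm_actionable"
    # Send summary_text, then handle each entry in items[].
    return "process_items"
```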
For each item — if needs_confirmation: false (no outstanding clarifications):
Invoke the appropriate destination skill without asking the user first.
destination: "calendar" → invoke the user's installed calendar skill. Preference order: calctl → accli → apple-calendar → brainz-calendar → gcal-pro → first available.
destination: "task" → invoke the user's installed task skill. Preference order: apple-reminders → things-mac → notion → first available.
Pass the item fields (title, date, time, end_time, end_date, location, participants,
description, recurrence, online_link, etc.) to whatever creation command that skill exposes.
If end_date is present and different from date, treat the item as a multi-day event.
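A minimal sketch of the preference-order lookup and the multi-day check follows. The helper names are hypothetical, and `installed` is assumed to be the list of skills the platform reports for the given destination kind, in discovery order.

```python
from typing import Any, Dict, List, Optional

# Preference orders as stated in this section.
PREFERENCE = {
    "calendar": ["calctl", "accli", "apple-calendar", "brainz-calendar", "gcal-pro"],
    "task": ["apple-reminders", "things-mac", "notion"],
}


def pick_destination_skill(destination: str, installed: List[str]) -> Optional[str]:
    """Return the preferred installed skill, else the first available, else None."""
    for name in PREFERENCE[destination]:
        if name in installed:
            return name
    return installed[0] if installed else None


def is_multi_day(item: Dict[str, Any]) -> bool:
    """True when end_date is present and differs from date."""
    end_date = item.get("end_date")
    return bool(end_date) and end_date != item.get("date")
```

A `None` result corresponds to the case where no calendar or task skill is installed, which the agent should report instead of dispatching.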
For each item — if needs_confirmation: true:
The clarifications[] array lists the specific things to ask. Each entry has:
- field: which field needs clarification (e.g. "time", "date", "type")
- question: the user-facing question (already pre-formatted with the item title)
- reason: "missing" (value is null), "low_confidence" (extracted but uncertain), or "low_type_confidence" (unsure whether this is a calendar event or a task)
The summary_text already renders these as a bullet list. Ask the user the questions in order
and patch the corresponding fields with their replies. Once every clarification is resolved,
route the item to the destination skill as above. If the user says skip at any point, drop
the item and confirm "Got it, skipped ✓".
For the special case of field: "type", the user's reply determines whether the item routes to
calendar or task — update destination accordingly before dispatch.
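The clarification loop can be sketched as below. Everything here is illustrative: `resolve_item` and the `ask` callback are hypothetical names, replies are stored as raw strings without validation, and the words accepted as a "calendar" answer for the type question are an assumption.

```python
from typing import Any, Callable, Dict, Optional


def resolve_item(item: Dict[str, Any], ask: Callable[[str], str]) -> Optional[Dict[str, Any]]:
    """Ask each clarification in order and return a patched copy, or None if skipped."""
    patched = dict(item)  # leave the parser's original item untouched
    for clarification in item.get("clarifications", []):
        answer = ask(clarification["question"]).strip()
        if answer.lower() == "skip":
            return None  # caller confirms "Got it, skipped ✓"
        if clarification["field"] == "type":
            # The reply decides the route; the accepted words are an assumption.
            is_event = answer.lower() in ("event", "calendar")
            patched["destination"] = "calendar" if is_event else "task"
        else:
            patched[clarification["field"]] = answer
    # Every question has been answered, so the item is ready for dispatch.
    patched["clarifications"] = []
    patched["needs_confirmation"] = False
    return patched
```

Returning a patched copy rather than mutating the input keeps the original parser output available if the user later changes an answer.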
After each destination skill returns, relay a one-line confirmation to the user. Examples:
- 📅 Added to Calendar via calctl: Team Standup — 2026-03-01 at 09:00
- 🔔 Added to Reminders: Pay electricity bill (due 2026-02-28)
- ✅ Added to Things: Send Sandip my resume

If the destination skill errors, surface the error and ask whether to retry with a different destination.
| Scenario | Behavior |
|---|---|
| Single screenshot has both an event and a task | Process each independently; route to its own destination. |
| Event implies a prep step (e.g. dinner at a restaurant → book table) | The parser emits BOTH an event and a prep reminder. Inferred fields on the prep reminder land in the 0.65–0.80 band, so most prep reminders hit needs_confirmation: true with targeted clarifications (typically time and date). |
| Multi-day event (trip, conference) | end_date is set and differs from date. Pass both to the calendar skill (e.g. calctl add --date --end-date --all-day). |
| Rescheduled / cancelled event | Parser extracts the NEW date; parse_notes flags it as a reschedule. Confirm with user before overwriting any existing entry. |
| Screenshot is in Hindi, Tamil, or another language | Title and description are already in English; language holds the ISO code. Save as-is. |
| Recurring event ("every Monday") | Pass recurrence and recurrence_day to the calendar skill. |
| Date has already passed | Flag in the reply: "⚠️ This date has already passed. Save anyway?" |
| Screenshot of someone's calendar | already_in_calendar_hint: true — reply: "Looks like this is already in your calendar 🗓️" and skip. |
| No calendar / task skill installed | Reply with the missing-skill hint and stop. |
| Zoom/Meet link found | Pass online_link to the calendar skill; it should set both location and description. |
| Meme / non-actionable screenshot | no_actionable_content: true — ignore silently unless user clearly asked for action. |
```json5
{
  "skills": {
    "entries": {
      "scrask-bot": {
        "enabled": true,
        "env": {
          // Both keys are OPTIONAL in v4.2+. Without either, Scrask uses
          // OpenClaw's configured vision LLM via the platform-injected
          // OPENCLAW_VISION_* env vars. Setting GEMINI_API_KEY opts into
          // the cheap+fast Gemini routing. Setting ANTHROPIC_API_KEY adds
          // Claude as a fallback (or as the primary if no Gemini key).
          "GEMINI_API_KEY": "AIza-your-gemini-key",
          "ANTHROPIC_API_KEY": "sk-ant-your-key-here"
        },
        "config": {
          "vision_provider": "auto",
          "fallback_threshold": 0.60,
          "timezone": "Asia/Kolkata",
          "confidence_threshold": 0.75,
          "actionable_threshold": 0.70,
          "type_threshold": 0.70,
          "field_threshold": 0.70
        }
      }
    }
  }
}
```
ANTHROPIC_API_KEY is optional. Without it, auto mode with a Gemini key runs Gemini only, with no Claude fallback.
- image:read, to access the screenshot from the chat surface.
- network:outbound, to call the vision model API (Gemini and optionally Claude).
- chat:reply, to send confirmation messages back via the user's chat surface.