Parse one or more screenshots into a machine-readable JSON schema with:
- type (normalized UI element type)
- bbox_px and bbox_norm
- text (OCR/caption content when available)
- clickable flag

- scripts/operate_ui.py (click/type/key/hotkey/screenshot)
- scripts/operate_ui.py (find, wait)
- calibrate

skills/ui-element-ops/scripts/bootstrap_omniparser_env.sh "$PWD"
skills/ui-element-ops/scripts/run_parse_ui.sh /abs/path/to/1.jpeg
.elements.json
.overlay.png

skills/ui-element-ops/scripts/capture_and_parse.sh
- Run scripts/bootstrap_omniparser_env.sh when .venv or OmniParser weights are missing.
- Use scripts/run_parse_ui.sh for standard parsing.
- total, clickable, by_type.

python3 skills/ui-element-ops/scripts/operate_ui.py list --elements
python3 skills/ui-element-ops/scripts/operate_ui.py find --elements --type button --text-contains login
python3 skills/ui-element-ops/scripts/operate_ui.py wait --elements --state appear --text-contains continue
python3 skills/ui-element-ops/scripts/operate_ui.py click --elements --id e_0001
python3 skills/ui-element-ops/scripts/operate_ui.py screenshot (defaults to user tmp dir)
python3 skills/ui-element-ops/scripts/operate_ui.py calibrate --parsed-size --actual-size

- references/type_rules.example.json.
- scripts/parse_ui.py --help.
- Use --use-paddleocr only when paddleocr/paddlepaddle are installed.

- Top-level fields: schema_version, pipeline, image, counts, elements
- Element fields: id, type, bbox_px, bbox_norm, text, clickable

- $HOME: keep temporary caches under /tmp (handled by run script).
- elements.json when possible.
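The element fields listed above (id, type, bbox_px, bbox_norm, text, clickable) can also be consumed directly from Python. The following is a minimal sketch, not the behavior of operate_ui.py itself. It assumes that bbox_px is stored as [x1, y1, x2, y2] in pixels, that image carries a width and height, and that calibration amounts to a linear rescale from the parsed image size to the actual screen size. None of this is confirmed by the text. The helper names (load_elements, find_elements, center_px, scale_point) and all sample values are hypothetical.

```python
import json
from pathlib import Path

# Hypothetical sample using the documented field names; every value is invented.
SAMPLE = {
    "schema_version": "1.0",
    "pipeline": "example",
    "image": {"width": 1000, "height": 500},  # assumed shape of "image"
    "counts": {"total": 2, "clickable": 1, "by_type": {"button": 1, "text": 1}},
    "elements": [
        {"id": "e_0001", "type": "button", "bbox_px": [100, 200, 300, 260],
         "bbox_norm": [0.1, 0.4, 0.3, 0.52], "text": "Login", "clickable": True},
        {"id": "e_0002", "type": "text", "bbox_px": [100, 50, 500, 90],
         "bbox_norm": [0.1, 0.1, 0.5, 0.18], "text": "Welcome back", "clickable": False},
    ],
}


def load_elements(path):
    """Read an elements JSON file from disk and return it as a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def find_elements(doc, type_=None, text_contains=None, clickable=None):
    """Return the elements that satisfy every filter that was supplied."""
    matches = []
    for el in doc.get("elements", []):
        if type_ is not None and el.get("type") != type_:
            continue
        # Case-insensitive substring match; a missing text counts as empty.
        if text_contains is not None and text_contains.lower() not in (el.get("text") or "").lower():
            continue
        if clickable is not None and bool(el.get("clickable")) != clickable:
            continue
        matches.append(el)
    return matches


def center_px(el):
    """Center of the bounding box, assuming bbox_px is [x1, y1, x2, y2]."""
    x1, y1, x2, y2 = el["bbox_px"]
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def scale_point(point, parsed_size, actual_size):
    """Linearly map a point from parsed-image pixels to actual-screen pixels."""
    sx = actual_size[0] / parsed_size[0]
    sy = actual_size[1] / parsed_size[1]
    return (point[0] * sx, point[1] * sy)


if __name__ == "__main__":
    hits = find_elements(SAMPLE, type_="button", text_contains="login")
    for el in hits:
        point = scale_point(center_px(el), (1000, 500), (2000, 1000))
        print(el["id"], point)
```

With a real parse result, replace SAMPLE with load_elements applied to the generated elements file, and check the actual bbox_px layout before relying on center_px.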