InfoSeek performs comprehensive web research on any subject (person, organization, product) across multiple search engines, deduplicates results, extracts clean content, and archives everything with full metadata in organized folders.
Before executing a search task, verify these skills are installed:
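import os
from pathlib import Path

# OPENCLAW_WORKSPACE must be set; Path(None) would raise a TypeError
workspace = os.environ.get('OPENCLAW_WORKSPACE')
if not workspace:
    raise SystemExit('OPENCLAW_WORKSPACE is not set')
skills_dir = Path(workspace) / 'skills'
# Skill folders expected under the workspace's skills directory
required = ['baidu-search', 'tavily', 'Multi-Search-Engine', 'agent-browser-clawdbot-0.1.0']
missing = [s for s in required if not (skills_dir / s).exists()]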
If any are missing, instruct the user to install them:
openclaw skills install baidu-search
openclaw skills install tavily-search
openclaw skills install multi-search-engine
```bash
python scripts/infoseek_helper.py create-folder "
```
Execute searches across all available engines. Each engine runs independently.
Use the baidu-search skill:
Use the tavily_search tool:
query: "<subject> <background_context>"
search_depth: advanced
max_results: 50
Use the multi-search-engine skill across multiple engines simultaneously.
For discovered URLs, use the browser tool to:
Run URL deduplication on all collected results:
python scripts/infoseek_helper.py deduplicate "<temp_results_file>"
The script normalizes URLs (removes the www prefix, strips tracking parameters, unifies http/https, removes trailing slashes) and checks each one against the SQLite database to skip duplicates.
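The normalization and lookup described above can be sketched as follows. This is a minimal illustration, not the code in `infoseek_helper.py`: the function names, the `urls` table schema, and the list of tracking parameters are all assumptions made for the example.

```python
import sqlite3
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters; the helper script's real list is not shown.
TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"fbclid", "gclid"}


def normalize_url(url: str) -> str:
    """Return a canonical form of `url` for duplicate detection."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if parts.port and parts.port not in (80, 443):
        host = f"{host}:{parts.port}"
    # Drop tracking parameters, keep the others in their original order.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS and not key.startswith(TRACKING_PREFIXES)
    ]
    path = parts.path.rstrip("/")
    # Unify http/https by always emitting https; the fragment is discarded.
    return urlunsplit(("https", host, path, urlencode(query), ""))


def filter_new_urls(conn: sqlite3.Connection, urls: list[str]) -> list[str]:
    """Record unseen URLs in the database and return only those."""
    conn.execute("CREATE TABLE IF NOT EXISTS urls (normalized TEXT PRIMARY KEY)")
    fresh = []
    for url in urls:
        key = normalize_url(url)
        seen = conn.execute(
            "SELECT 1 FROM urls WHERE normalized = ?", (key,)
        ).fetchone()
        if seen is None:
            conn.execute("INSERT INTO urls (normalized) VALUES (?)", (key,))
            fresh.append(url)
    conn.commit()
    return fresh
```

Storing only the normalized form as the primary key means that `http://www.example.com/a/` and `https://example.com/a` map to the same record, so the second one is skipped rather than archived twice.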
For each unique URL:
```bash
python scripts/infoseek_helper.py generate-filename \
--date "
```
Format: YYYYMMDD-title-website.ext
```bash
python scripts/infoseek_helper.py save-content \
--folder "
--website "
--title "
--content "
" --task "
```
```bash
python scripts/infoseek_helper.py add-url \
--url "
```
Output a summary when complete:
InfoSeek Task Report
====================
Subject: {query}
Engines used: {engines}
Total found: {total} | Duplicates skipped: {dupes} | New archived: {new}
Files saved: {count}
Location: {path}
Database records: {db_total}
Format: YYYYMMDD-title-website.ext
Illegal filename characters: <>:"/\|?*
If the filename already exists, append an 8-character hash to prevent overwrites.
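As a rough sketch of the naming rules above, the following builds a `YYYYMMDD-title-website.ext` name, drops the illegal characters, and appends an 8-character hash when the name is already taken. The function names, the choice of SHA-256 over the URL as the hash source, and the replacement of whitespace with hyphens are assumptions for illustration; the `generate-filename` command may behave differently. The `00000000` placeholder for unknown dates follows the troubleshooting guidance in this document.

```python
import hashlib
import re
from pathlib import Path

# Characters listed as illegal in filenames.
ILLEGAL_CHARS = re.compile(r'[<>:"/\\|?*]')


def sanitize(part: str) -> str:
    """Drop illegal characters and turn runs of whitespace into hyphens."""
    cleaned = ILLEGAL_CHARS.sub("", part)
    return re.sub(r"\s+", "-", cleaned.strip())


def build_filename(folder: Path, date: str, title: str, website: str,
                   url: str, ext: str = "md") -> str:
    """Build YYYYMMDD-title-website.ext, adding an 8-char hash on collision."""
    if not re.fullmatch(r"\d{8}", date or ""):
        date = "00000000"  # placeholder for an unknown date
    stem = f"{date}-{sanitize(title)}-{sanitize(website)}"
    name = f"{stem}.{ext}"
    if (folder / name).exists():
        # Assumption: hash the source URL so the suffix is reproducible.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:8]
        name = f"{stem}-{digest}.{ext}"
    return name
```

Deriving the suffix from the URL rather than from a random value keeps the name stable across reruns, which makes it easier to tell whether a file was already archived.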
All formats include full metadata (URL, website, source, date, title, author, editor) plus body content.
{workspace}/
├── infoseek-archives/
│   ├── <subject_1>/
│   │   ├── 20260404-title-website.md
│   │   └── ...
│   └── <subject_2>/
└── infoseek/
    ├── infoseek.db       # SQLite dedup database
    ├── infoseek.log      # Operation log
    └── backups/
Strict data retention — no permanent deletes without confirmation.
| Operation | Confirmation | Method |
|---|---|---|
| Bulk folder delete | Required | Move to recycle bin |
| Single file delete | Required | Move to recycle bin |
| Dedup skip | Automatic | Skip only, no delete |
| Database cleanup | Required | Mark as deleted |
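The "mark as deleted" method for database cleanup in the table above amounts to a soft delete. A minimal sketch, assuming a hypothetical `urls` table with a `deleted_at` column (the real schema of `infoseek.db` is not shown here) and a `confirmed` flag standing in for the user's confirmation:

```python
import sqlite3
from datetime import datetime, timezone


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the assumed table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS urls (url TEXT PRIMARY KEY, deleted_at TEXT)"
    )
    return conn


def mark_deleted(conn: sqlite3.Connection, url: str, confirmed: bool = False) -> bool:
    """Flag a record as deleted instead of removing the row.

    Returns True if a live record was flagged, False if there was none.
    """
    if not confirmed:
        raise PermissionError("Deletion requires explicit user confirmation")
    cur = conn.execute(
        "UPDATE urls SET deleted_at = ? WHERE url = ? AND deleted_at IS NULL",
        (datetime.now(timezone.utc).isoformat(), url),
    )
    conn.commit()
    return cur.rowcount == 1
```

Because the row is kept, the record can be restored by clearing `deleted_at`, which is consistent with the rule that nothing is permanently deleted without confirmation.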
Process:
Never:
Override defaults in task instructions:
{workspace}/infoseek-archives/, specify custom path

| Problem | Solution |
|---|---|
| Missing search skill | openclaw skills install |
| Date extraction fails | Check page metadata; use 00000000 for unknown |
| Encoding errors | Ensure UTF-8; on Windows enable Unicode UTF-8 in region settings |
| Database corruption | python scripts/infoseek_helper.py restore-backup |
| Version | Date | Notes |
|---|---|---|
| 2.0.0 | 2026-04-07 | Full rewrite: SQLite dedup, URL normalization, HTML parsing, multi-engine integration |
| 1.0.0 | 2026-04-06 | Initial version (deprecated) |