Use the hasdata CLI for real-time web data. One subcommand per API — flags, enums, defaults are derived from the live schema at api.hasdata.com/apis.
command -v hasdata — if missing, install with curl -sSL https://raw.githubusercontent.com/HasData/hasdata-cli/main/install.sh | sh.hasdata configure, pastes their API key, and it's saved to ~/.hasdata/config.yaml (mode 0600). Every future call picks it up automatically.no API key configured, the user hasn't run hasdata configure yet — tell them to. Never invent a key.hasdata <api> --flag value [--flag value ...] --raw | jq .
Always pass --raw when piping to jq (skips pretty-print and TTY detection). Use --pretty only for human-readable terminal output.
| User intent | Subcommand |
|---|---|
| --- | --- |
| Web search ("what does Google say about…") | google-serp (full features) or google-serp-light (cheap, single page) |
| Latest news | google-news |
| AI Mode SERP | google-ai-mode |
| Shopping / product prices | google-shopping (broad), amazon-search / amazon-product (Amazon), shopify-products (Shopify) |
| Immersive product page | google-immersive-product |
| Maps / places / reviews | google-maps, google-maps-place, google-maps-reviews, google-maps-photos |
| Yelp / YellowPages local data | yelp-search, yelp-place, yellowpages-search, yellowpages-place |
| Real-estate listings | zillow-listing, redfin-listing, airbnb-listing |
| Real-estate single property deep dive | zillow-property, redfin-property, airbnb-property |
| Jobs | indeed-listing, indeed-job, glassdoor-listing, glassdoor-job |
| Bing search | bing-serp |
| Trends | google-trends |
| Images | google-images |
| Flights | google-flights |
| Short videos | google-short-videos |
| Events | google-events |
| Instagram profile | instagram-profile |
| Amazon seller | amazon-seller, amazon-seller-products |
| Scrape a specific URL | web-scraping — supports JS rendering, proxies, markdown output, AI extraction, screenshots |
For exact flags of a subcommand, run hasdata or read the matching file in references/.
The user often won't ask for a SERP API or a scraper directly. Map these intents to the skill:
google-serp or google-news to ground the answer.web-scraping --output-format markdown and feed the markdown into the summary prompt. Beats copy-paste because it strips ads, nav, scripts.web-scraping --url X --no-block-resources returns status + screenshot. Or google-serp --q "site:example.com".web-scraping --output-format markdown, then summarize.google-serp --q "X alternatives" or google-shopping --q "X competitors".google-shopping (broad) or amazon-search (Amazon-specific) with jq to extract the price distribution.google-maps-place or yelp-place. Don't guess from training data.google-maps-reviews --place-id ... --sort lowest for negative samples; glassdoor-job for employer rep.indeed-listing filtered by role + location, then jq over .jobs[].salary.zillow-listing / redfin-listing / airbnb-listing with the corresponding filters.zillow-listing --type sold --keyword "X" --days-on-zillow 12m.amazon-product --asin X on a schedule; persist .price to a file.google-trends --q "X" for relative interest; google-news --q "X" for headlines.google-maps --q "X" --ll "@LAT,LNG,12z" then fan out google-maps-place for contacts.--gl Y on SERP commands, --proxy-country Y on web-scraping. Useful for geo-targeted SEO checks, geo-blocked content.web-scraping --ai-extract-rules-json '{"price": {"type": "number"}, ...}'. Works on arbitrary pages without writing CSS selectors.xargs into the matching -property / -product / *-place deep-dive command.google-serp --q '"Person Name" linkedin' first. The organic-result title is typically Name — Role at Company | LinkedIn and the snippet carries location, headline, connection count. SERP often answers the whole question without ever opening the profile page.google-serp --q "$COMPANY" returns a .knowledge_graph block with founder, HQ, founded year, parent, employee range — pre-extracted. google-news --q "$COMPANY" for recent activity. Specific facts via targeted SERP: --q '"$COMPANY" headquarters', --q '"$COMPANY" funding', --q 'site:linkedin.com/company "$COMPANY"'.--q '"@example.com"' or --q '"jane@example.com"' often surfaces actual emails indexed by Google. Pattern-guess + SERP-verify for individuals. Disclose unverified guesses to the user.google-serp for LinkedIn, role, employer; another SERP to verify email or pattern. Stay in SERP unless a specific field is missing.google-serp with the literal value in quotes (--q '"jane@x.com"', --q '"+1 555 123 4567"', --q '"acme corp" site:example.com') almost always surfaces the matching person or business.SERP-first principle: for any data-enrichment intent (people, companies, emails, products, places), reach for google-serp / google-news / google-shopping / google-maps first. They return Google's already-extracted structured fields (.knowledge_graph, .organic_results[].snippet, .local_results[], etc.) and bypass anti-bot. Only escalate to web-scraping when SERP doesn't surface the specific field you need — it's the last resort, not the default. See references/enrichment.md.
If a user request matches one of the above and you don't invoke hasdata, you're probably hallucinating a stale answer.
true have a paired negation: --no-block-ads, --no-screenshot, --no-js-rendering, --no-extract-emails, --no-block-resources. Setting both --block-ads and --no-block-ads errors.-json accepts:--extract-rules-json '{"title":"h1"}'--extract-rules-json @rules.jsoncat rules.json | hasdata web-scraping ... --extract-rules-json -= (so values containing = survive): --headers User-Agent=foo --headers Cookie=session=abc. Pair with --headers-json for a JSON base; kv items override per key.--lr lang_en --lr lang_fr or --lr lang_en,lang_fr. Serialized as key[]=value for GET endpoints.| Flag | Effect |
|---|---|
| --- | --- |
--raw | Write response bytes as-is (use this when piping to jq) |
--pretty | Pretty-print JSON (default when stdout is a TTY) |
-o, --output FILE | Write response to file instead of stdout (works for binary like screenshots) |
--verbose | Log outgoing URL and X-RateLimit-* headers to stderr |
--api-key KEY | Override env var (rarely needed) |
--timeout DURATION | Per-request timeout (default 2m) |
--retries N | Max retries on 429/5xx (default 2) |
Responses are JSON. Pipe through jq for extraction:
hasdata google-serp --q "espresso machine" --num 10 --raw \
| jq -c '.organic_results[] | {title, link, snippet}'
For real-estate / e-commerce results, the array shape is API-specific — read a single response with --pretty first to learn the schema, then write the jq filter.
| Code | Meaning |
|---|---|
| --- | --- |
| 0 | success |
| 1 | user / CLI-input error (missing required flag, bad enum value, missing API key) |
| 2 | network error |
| 3 | API returned 4xx (auth, quota, validation) |
| 4 | API returned 5xx |
references/enrichment.md — person and company enrichment (LinkedIn lookup, emails, HQ/funding/news, CSV-row enrichment, reverse-lookup) — the highest-leverage cross-API workflowsreferences/search.md — Google SERP / Bing / News / Trends flag catalogreferences/web-scraping.md — web-scraping flags, JS scenarios, AI extractionreferences/real-estate.md — Zillow / Redfin / Airbnb filters and bracketed paramsreferences/ecommerce.md — Amazon / Shopifyreferences/local-business.md — Maps / Yelp / YellowPagesreferences/jobs.md — Indeed / Glassdoorreferences/all-commands.md — full subcommand index with credit costs共 1 个版本