PulpMiner Web Scraper - Convert Any Webpage to Realtime JSON API

Convert any webpage into structured JSON data using AI. Scrape websites, extract data into custom JSON schemas, and call saved APIs programmatically. Useful for web scraping, data extraction, content monitoring, lead generation, price tracking, and building data pipelines.
Author: melvin2016
Content creation · clawhub · v1.0.1 · 1 version · 99938.6 · API key: required
Stars: 6 · Downloads: 1,508 · Installs: 99 · Versions: 1 (#latest)

Overview

PulpMiner — AI Web Scraping & JSON API

PulpMiner converts any webpage into structured JSON using AI. You provide a URL and optionally a JSON template, and PulpMiner scrapes the page, runs it through an LLM, and returns clean structured data.

Authentication

All API calls require the apikey header:

apikey: <PULPMINER_API_KEY>

Get your API key from https://pulpminer.com/api. If you don't have a key yet, click "Regenerate Key" to create one.
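To keep the key out of scripts and shell history, it can be read from the environment when the header is built. The following is a minimal sketch; the environment variable name PULPMINER_API_KEY and the helper auth_headers are assumptions for illustration, not part of PulpMiner itself.

```python
import os


def auth_headers(env=os.environ):
    """Build the `apikey` header from an environment mapping.

    The variable name PULPMINER_API_KEY is an assumption; only the
    header name `apikey` comes from the documentation above.
    """
    key = env.get("PULPMINER_API_KEY", "").strip()
    if not key:
        raise RuntimeError("PULPMINER_API_KEY is not set")
    return {"apikey": key}
```

Passing a plain dict as env makes the helper easy to exercise without touching the real environment.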

Core Workflow

PulpMiner works in two phases:

  1. Create a saved API — Configure a URL, scraper, LLM, and optional JSON template via the PulpMiner dashboard at https://pulpminer.com/api
  2. Call the saved API — Use the external endpoint with your API key to fetch structured JSON

Calling a Saved API

Static API (fixed URL)

curl -X GET "https://api.pulpminer.com/external/<apiId>" \
  -H "apikey: <PULPMINER_API_KEY>"

Returns JSON extracted from the configured webpage.
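The same call can be made from Python with the standard library alone. This is a sketch under stated assumptions: the function names and the 60-second timeout are illustrative, and only the URL pattern and the apikey header come from the curl example above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.pulpminer.com/external"


def build_static_request(api_id, api_key):
    """Prepare (but do not send) a GET request for a saved static API."""
    safe_id = urllib.parse.quote(api_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/{safe_id}",
        # urllib normalizes the header name to "Apikey"; HTTP header
        # names are case-insensitive, so the server sees the same header.
        headers={"apikey": api_key},
        method="GET",
    )


def fetch_json(request, timeout=60):
    """Send a prepared request and decode the JSON body (timeout is an assumption)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like fetch_json(build_static_request("<apiId>", key)), with the placeholder replaced by the ID shown in the dashboard.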

Dynamic API (URL with variables)

For APIs saved with template URLs like https://example.com/search?q={{query}}&page={{page}}:

curl -X POST "https://api.pulpminer.com/external/<apiId>" \
  -H "apikey: <PULPMINER_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"query": "javascript frameworks", "page": "1"}'

The {{variable}} placeholders in the saved URL get replaced with the values you provide.
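Because the substitution happens on PulpMiner's side, a missing variable only shows up after the call has been made. A client can check the payload against the template URL first. The sketch below assumes placeholder names consist of word characters; the helper names are hypothetical.

```python
import json
import re
import urllib.parse
import urllib.request

BASE_URL = "https://api.pulpminer.com/external"
# Assumption: placeholder names are made of letters, digits, and underscores.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def placeholders(template_url):
    """Return the sorted, de-duplicated variable names used in a template URL."""
    return sorted(set(PLACEHOLDER.findall(template_url)))


def build_dynamic_request(api_id, api_key, template_url, values):
    """Prepare a POST request, refusing to build it if a variable is missing."""
    missing = [name for name in placeholders(template_url) if name not in values]
    if missing:
        raise ValueError(f"missing template variables: {', '.join(missing)}")
    safe_id = urllib.parse.quote(api_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/{safe_id}",
        data=json.dumps(values).encode("utf-8"),
        headers={"apikey": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

The template URL is passed in by the caller here because the external endpoint shown above does not return it; it has to be copied from the dashboard configuration.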

Response Format

Successful responses return:

{
  "data": { ... },
  "errors": null
}

Error responses return:

{
  "data": null,
  "errors": "Error message describing what went wrong"
}
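Since both outcomes share the same envelope, a caller can unwrap it in one place and turn a non-null errors field into an exception. A minimal sketch follows; PulpMinerError and unwrap are illustrative names, not part of any official client.

```python
import json


class PulpMinerError(Exception):
    """Raised when the response envelope carries a non-null `errors` field."""


def unwrap(payload):
    """Return the `data` field of a response, or raise if `errors` is set.

    Accepts either an already-decoded dict or the raw JSON text.
    """
    if isinstance(payload, (str, bytes)):
        payload = json.loads(payload)
    errors = payload.get("errors")
    if errors:
        raise PulpMinerError(str(errors))
    return payload.get("data")
```

This keeps the rest of the calling code working with the extracted data only, rather than checking errors at every call site.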

Caching

  • API responses are cached for 24 hours by default
  • If a cached response is older than 15 minutes, PulpMiner serves it immediately and refreshes it in the background
  • Caching can be disabled per API in the dashboard settings
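The two thresholds above describe a stale-while-revalidate pattern. The sketch below models it as a small decision function so the behavior is easy to reason about. It is an illustration of the described rules, not PulpMiner's implementation, and treating a response older than 24 hours as expired is an assumption drawn from the 24-hour default.

```python
from datetime import timedelta

REFRESH_AFTER = timedelta(minutes=15)  # from the text above
CACHE_LIFETIME = timedelta(hours=24)   # default lifetime from the text above


def classify(cache_age):
    """Classify a cached response by its age.

    "fresh":   served from cache, no refresh triggered
    "stale":   served from cache while a background refresh runs
    "expired": nothing usable in cache (assumed behavior past 24 hours)
    """
    if cache_age is None or cache_age >= CACHE_LIFETIME:
        return "expired"
    if cache_age > REFRESH_AFTER:
        return "stale"
    return "fresh"
```

In practice this means a caller polling more often than every 15 minutes should not expect new data on every call.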

Configuration Options (Set in Dashboard)

When creating a saved API at https://pulpminer.com/api, you can configure:

Option | Description
-------|------------
URL | The webpage to scrape
JSON Template | Optional JSON structure for the LLM to follow (e.g., {"name": "", "price": ""})
Render JS | Enable for SPAs and JS-heavy pages (uses headless browser)
CSS Selector | Extract only a specific part of the page (e.g., .product-list, #main-content)
Extra Instructions | Additional guidance for the AI (e.g., "Only extract items with prices above $50")
Dynamic URL | Enable template variables in the URL with {{variable}} syntax
Cache | Toggle response caching on/off
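A JSON template guides the LLM but the output is still model-generated, so it can be worth checking on the client that every templated key actually came back. The following sketch compares keys only (not value types); missing_keys is a hypothetical helper, and the dotted-path format is an illustrative choice.

```python
def missing_keys(template, data, prefix=""):
    """List the template keys (as dotted paths) that are absent from `data`.

    Only dict keys are compared; values and list contents are not inspected.
    """
    missing = []
    for key, expected in template.items():
        path = f"{prefix}{key}"
        if not isinstance(data, dict) or key not in data:
            missing.append(path)
        elif isinstance(expected, dict):
            # Recurse into nested objects declared in the template.
            missing.extend(missing_keys(expected, data[key], prefix=f"{path}."))
    return missing
```

With the template {"name": "", "price": ""} from the table, an empty result means both fields were returned.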

Integration with Zapier

For async scraping in Zapier workflows:

# Static API
curl -X POST "https://api.pulpminer.com/external/zapier/get/<apiId>" \
  -H "apikey: <PULPMINER_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"callbackURL": "https://hooks.zapier.com/..."}'

# Dynamic API
curl -X POST "https://api.pulpminer.com/external/zapier/post/<apiId>" \
  -H "apikey: <PULPMINER_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"callbackURL": "https://hooks.zapier.com/...", "query": "value"}'

Returns 201 immediately. Sends scraped data to the callback URL when complete.
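The two asynchronous endpoints differ only in the path segment and in whether template variables accompany the callback URL, so one builder can cover both. This is a sketch: build_zapier_request is a hypothetical name, and sending the body as JSON with a Content-Type header is an assumption based on the dynamic API example.

```python
import json
import urllib.parse
import urllib.request

ZAPIER_BASE = "https://api.pulpminer.com/external/zapier"


def build_zapier_request(api_id, api_key, callback_url, variables=None):
    """Prepare an async scrape request; results are delivered to `callback_url`.

    Uses the `/post/` path when template variables are given (dynamic API)
    and the `/get/` path otherwise (static API). Both are sent with POST.
    """
    kind = "post" if variables else "get"
    body = {"callbackURL": callback_url}
    body.update(variables or {})
    safe_id = urllib.parse.quote(api_id, safe="")
    return urllib.request.Request(
        f"{ZAPIER_BASE}/{kind}/{safe_id}",
        data=json.dumps(body).encode("utf-8"),
        # Assumption: the JSON body is declared explicitly via Content-Type.
        headers={"apikey": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Since the endpoint answers with 201 before scraping finishes, the caller should treat that status as "accepted" and read the actual data from the callback.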

Integration with n8n

Verify authentication:

curl -X GET "https://api.pulpminer.com/external/n8n/auth" \
  -H "apikey: <PULPMINER_API_KEY>"

Then use the standard /external/ endpoints for data fetching.

Credits

  • Each API call costs 0.25–0.4 credits depending on the endpoint
  • JavaScript rendering adds 0.1 credits extra
  • New users get 5 free credits
  • Purchase more at https://pulpminer.com/credits

Tips

  • Use CSS selectors to narrow down the scraped content and improve accuracy
  • Provide a JSON template for consistent, predictable output structures
  • Enable JS rendering only when needed — static pages scrape faster and cost fewer credits
  • Use extra instructions to guide the AI (e.g., "Return dates in ISO 8601 format")
  • For monitoring use cases, keep caching enabled to reduce credit usage
  • Use the playground first to verify a URL is scrapable before saving an API config
  • Dynamic APIs are ideal for search pages, paginated content, and parameterized URLs

Links

  • Website: https://pulpminer.com
  • API Dashboard: https://pulpminer.com/api

Version history

1 version in total

  • v1.0.1 (current)
    2026-03-28 22:55 · Safe · Safe

Security checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
