Parallel Enrichment

Bulk data enrichment via Parallel API. Adds web-sourced fields (CEO names, funding, contact info) to lists of companies, people, or products. Use for enrichi...
Author: normallygaussian
Source: clawhub · Version: v1.0.3 (latest) · API key: required

Overview

Bulk data enrichment that adds web-sourced fields to lists of companies, people, or products. Describe what you want in natural language.

When to Use

Trigger this skill when the user asks for:

  • "enrich this list with...", "add CEO names to...", "find funding for these companies..."
  • "look up contact info for...", "get LinkedIn profiles for..."
  • Bulk data operations on CSV files or lists
  • Adding web-sourced columns to existing datasets
  • Lead enrichment, company research, product comparison

Quick Start

# Inline data
parallel-cli enrich run \
  --data '[{"company": "Google"}, {"company": "Microsoft"}]' \
  --intent "CEO name and founding year" \
  --target output.csv

# CSV file
parallel-cli enrich run \
  --source-type csv --source input.csv \
  --target output.csv \
  --intent "CEO name and founding year"

CLI Reference

Basic Usage

parallel-cli enrich run [options]

Note: There is no --json flag for enrich. Results are written to the target file.

Common Flags

| Flag | Description |
|------|-------------|
| --data | Inline JSON array of records |
| --source-type csv | Source file type |
| --source | Input CSV file path |
| --target | Output CSV file path |
| --source-columns | Describe input columns (JSON array) |
| --enriched-columns | Specify output columns (JSON array) |
| --intent | Natural language description of what to find |
| --processor | Processing tier (see table below) |

Processor Tiers

| Processor | Use Case |
|-----------|----------|
| lite-fast | Simple lookups |
| base-fast | Basic enrichment |
| core-fast | Standard enrichment |
| pro-fast | Deep enrichment (default) |
| ultra-fast | Complex multi-source enrichment |

Examples

Inline data enrichment:

parallel-cli enrich run \
  --data '[{"company": "Stripe"}, {"company": "Square"}, {"company": "Adyen"}]' \
  --intent "CEO name, headquarters city, and latest funding round" \
  --target ./companies-enriched.csv

CSV file enrichment:

parallel-cli enrich run \
  --source-type csv \
  --source ./leads.csv \
  --target ./leads-enriched.csv \
  --source-columns '[{"name": "company_name", "description": "Company name"}]' \
  --intent "Find CEO name, company size, and LinkedIn company page URL"

With explicit output columns:

parallel-cli enrich run \
  --data '[{"name": "Sam Altman"}, {"name": "Satya Nadella"}]' \
  --source-columns '[{"name": "name", "description": "Person full name"}]' \
  --enriched-columns '[
    {"name": "current_company", "description": "Current company/employer"},
    {"name": "title", "description": "Current job title"},
    {"name": "twitter", "description": "Twitter/X handle"}
  ]' \
  --target ./people-enriched.csv

Using AI to suggest columns:

# First, get AI suggestions
parallel-cli enrich suggest \
  --source-type csv \
  --source ./companies.csv \
  --intent "competitor analysis data"

# Then run, naming the suggested fields in --intent (or pass them explicitly with --enriched-columns)
parallel-cli enrich run \
  --source-type csv \
  --source ./companies.csv \
  --target ./companies-analysis.csv \
  --intent "competitor analysis: market position, key products, recent news"

Best-Practice Prompting

Intent Description

Write 1-2 sentences describing:

  • What specific fields you want to add
  • Context about the data (B2B companies, tech startups, etc.)
  • Any constraints (recent data, specific sources)

Good:

--intent "Find CEO name, total funding raised, and number of employees for B2B SaaS companies"

Poor:

--intent "Find stuff about these companies"

Source Column Descriptions

When using --source-columns, provide context:

[
  {"name": "company", "description": "Company name, may include Inc/LLC suffix"},
  {"name": "website", "description": "Company website URL, may be partial"}
]

Response Format

The CLI outputs:

  • A monitoring URL to track progress
  • Status updates as rows are processed
  • Final output written to target CSV

The target CSV contains:

  • All original columns from the source
  • New enriched columns as specified
  • A _parallel_status column indicating success/failure per row
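A quick way to see how many rows succeeded or failed is to tally the values in the _parallel_status column. The sketch below is a minimal example, not part of parallel-cli: status_counts is a hypothetical helper, the exact status values the CLI writes are not specified here, and the awk parsing assumes no field contains an embedded comma or newline.

```shell
# Tally rows by the value in the _parallel_status column of an enriched CSV.
# Assumption: no field contains an embedded comma or newline; use a real CSV
# parser for anything messier.
status_counts() {
  awk -F',' '
    NR == 1 {
      # Locate the status column by its header name.
      for (i = 1; i <= NF; i++) {
        gsub(/\r|"/, "", $i)
        if ($i == "_parallel_status") col = i
      }
      if (!col) { print "no _parallel_status column" > "/dev/stderr"; exit 1 }
      next
    }
    {
      # Strip quotes and carriage returns before counting.
      value = $col
      gsub(/\r|"/, "", value)
      counts[value]++
    }
    END {
      # One "value,count" line per distinct status.
      for (v in counts) print v "," counts[v]
    }
  ' "$1" | sort
}
```

Calling status_counts output.csv prints one "value,count" line per distinct status, sorted by value.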

Output Handling

After enrichment completes:

  1. Report the number of rows enriched
  2. Preview the first few rows: head -6 output.csv
  3. Share the full path to the output file
  4. Note any rows that failed enrichment
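The first three steps above can be sketched as a small helper. report_output is a hypothetical name, not a parallel-cli command, and the row count assumes one record per line (no embedded newlines inside fields).

```shell
# Print a short post-run report for an enriched CSV.
# Assumption: one record per line (no embedded newlines in fields).
report_output() {
  file=$1
  # Data rows = total lines minus the header line.
  rows=$(awk 'END { print NR - 1 }' "$file")
  echo "Rows in output: $rows"
  echo "Preview:"
  # Header plus the first five data rows.
  head -6 "$file"
  # Resolve the directory so the reported path is absolute.
  dir=$(cd "$(dirname "$file")" && pwd)
  echo "Full path: $dir/$(basename "$file")"
}
```

For example, report_output ./companies-enriched.csv prints the row count, a six-line preview, and the absolute path to share with the user.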

Configuration File

For complex enrichments, use a YAML config:

# enrich-config.yaml
source:
  type: csv
  path: ./input.csv
  columns:
    - name: company_name
      description: "Company legal name"
    - name: website
      description: "Company website URL"

target:
  type: csv
  path: ./output.csv

enriched_columns:
  - name: ceo_name
    description: "Current CEO full name"
  - name: employee_count
    description: "Approximate number of employees"
  - name: funding_total
    description: "Total funding raised in USD"

processor: pro-fast

Then run:

parallel-cli enrich run enrich-config.yaml

Running Out of Context?

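For large enrichments, write the results to a file and hand the summary to a sub-agent via sessions_spawn, so the full output never has to be loaded into the current context: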

parallel-cli enrich run --source-type csv --source input.csv --target /tmp/enriched-<topic>.csv --intent "..."

Then spawn a sub-agent:

{
  "tool": "sessions_spawn",
  "task": "Read /tmp/enriched-<topic>.csv and summarize the results. Report row count, success rate, and preview first 5 rows.",
  "label": "enrich-summary"
}

Error Handling

| Exit Code | Meaning |
|-----------|---------|
| 0 | Success |
| 1 | Unexpected error (network, parse) |
| 2 | Invalid arguments |
| 3 | API error (non-2xx) |
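When scripting around the CLI, the exit codes listed above can be turned into readable messages. This is a minimal sketch; explain_exit_code is a hypothetical helper, and the messages simply restate the meanings given above.

```shell
# Map a parallel-cli exit code to the meaning documented above.
explain_exit_code() {
  case "$1" in
    0) echo "Success" ;;
    1) echo "Unexpected error (network, parse)" ;;
    2) echo "Invalid arguments" ;;
    3) echo "API error (non-2xx)" ;;
    *) echo "Unknown exit code: $1" ;;
  esac
}

# Hypothetical usage, right after an enrichment run:
#   parallel-cli enrich run --source-type csv --source input.csv \
#     --target output.csv --intent "CEO name and founding year"
#   explain_exit_code $?
```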

Common issues:

  • Row failures: Check the _parallel_status column in the output
  • Timeout: Use smaller batches or a lower processor tier
  • Rate limits: Add delays between large enrichments
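One way to apply the batching and delay advice is to split the input CSV into smaller files and enrich them one at a time. The sketch below makes several assumptions: split_csv and enrich_batches are hypothetical helpers, the input has one record per line, and the 30-second pause is an arbitrary placeholder rather than a documented rate limit. The parallel-cli call uses only the flags described in this document.

```shell
# Split a CSV into batch files of at most $2 data rows, repeating the header
# in each batch. Writes <prefix>-001.csv, <prefix>-002.csv, ... and prints
# each file name. Assumption: one record per line.
split_csv() {
  awk -v size="$2" -v prefix="$3" '
    NR == 1 { header = $0; next }
    (NR - 2) % size == 0 {
      # Start a new batch file and write the header first.
      if (out) close(out)
      out = sprintf("%s-%03d.csv", prefix, ++n)
      print header > out
      print out
    }
    { print > out }
  ' "$1"
}

# Enrich each batch in turn, pausing between runs.
# Usage: enrich_batches "<intent>" batch-001.csv batch-002.csv ...
enrich_batches() {
  intent=$1; shift
  for batch in "$@"; do
    parallel-cli enrich run \
      --source-type csv --source "$batch" \
      --target "${batch%.csv}-enriched.csv" \
      --intent "$intent" || return $?
    sleep 30  # placeholder delay; tune to your own limits
  done
}
```

Each batch produces its own enriched file, so a failed batch can be retried without re-running the others.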

Prerequisites

Requires parallel-cli (installed and authenticated). If parallel-cli --version fails, or if a later command fails with an authentication error, tell the user to see https://docs.parallel.ai/integrations/cli and stop.
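The version check described above can be wrapped in a small guard. This is a sketch only; check_cli is a hypothetical helper, and it covers the "--version fails" case, not authentication errors raised by later commands.

```shell
# Return 0 if the given CLI is on PATH and answers --version; otherwise print
# a pointer to the install docs on stderr and return 1.
check_cli() {
  cli=${1:-parallel-cli}
  if command -v "$cli" >/dev/null 2>&1 && "$cli" --version >/dev/null 2>&1; then
    return 0
  fi
  echo "$cli is not installed or not working; see https://docs.parallel.ai/integrations/cli" >&2
  return 1
}
```

Running check_cli || exit 1 at the top of a script stops early instead of failing partway through an enrichment.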

Version History

2 versions in total

  • v1.0.3 (current)
    2026-05-08 12:11, marked safe
  • v1.0.0
    2026-03-28 18:21, marked safe