
api-ingestion-connectors

Connect to external APIs and ingest data into graph-ready structures for ETL pipelines and knowledge graph construction.
yjkj999999
Uncategorized community v1.0.0 1 version 100000 Key: required

Overview

API Ingestion Connectors

Ingest data from external APIs into graph-ready formats for knowledge graph construction.

This skill retrieves data from diverse API sources and prepares it for transformation into graph-ready structures such as nodes, relationships, and triples.

Quick Start

Use When

  • Ingesting data from REST APIs
  • Querying GraphQL endpoints
  • Integrating external services into data pipelines
  • Pulling data from SaaS platforms
  • Transforming API responses into graph datasets
  • Building real-time knowledge graph updates

Inputs

  • API endpoint URLs
  • Authentication credentials
  • Request parameters and headers
  • Pagination configuration
  • Response format specifications
  • Transformation mappings

Outputs

  • JSON/CSV datasets
  • Graph-ready node/edge structures
  • RDF triples
  • Connector configurations
  • ETL pipeline definitions

Example

Input API Configuration:

Endpoint: https://api.example.com/users
Method: GET
Auth: Bearer Token
Pagination: page-based, 30 items per page

Generated Output:

{
  "nodes": [
    {"id": "user_1", "type": "Person", "name": "Alice", "email": "alice@example.com"},
    {"id": "org_1", "type": "Organization", "name": "Acme Corp"}
  ],
  "edges": [
    {"source": "user_1", "target": "org_1", "relation": "WORKS_AT"}
  ]
}
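A minimal sketch of how a response could be mapped to the structure above. The raw record shape (`id`, `name`, `email`, and a nested `organization` with `id` and `name`) and the function name `users_to_graph` are assumptions for illustration; the actual response of the example endpoint is not specified here.

```python
def users_to_graph(records):
    """Map raw user records to a node/edge structure.

    Assumed (hypothetical) record shape:
    {"id": 1, "name": ..., "email": ..., "organization": {"id": 1, "name": ...}}
    """
    nodes, edges, seen = [], [], set()
    for rec in records:
        user_id = f"user_{rec['id']}"
        if user_id not in seen:
            seen.add(user_id)
            nodes.append({"id": user_id, "type": "Person",
                          "name": rec["name"], "email": rec["email"]})
        org = rec.get("organization")
        if org:
            org_id = f"org_{org['id']}"
            if org_id not in seen:  # emit each organization node once
                seen.add(org_id)
                nodes.append({"id": org_id, "type": "Organization",
                              "name": org["name"]})
            edges.append({"source": user_id, "target": org_id,
                          "relation": "WORKS_AT"})
    return {"nodes": nodes, "edges": edges}


sample = [{"id": 1, "name": "Alice", "email": "alice@example.com",
           "organization": {"id": 1, "name": "Acme Corp"}}]
graph = users_to_graph(sample)
```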

Supported API Types

1. REST APIs

Connect to standard HTTP REST endpoints with configurable authentication and pagination.

type: rest
endpoint: https://api.example.com/resource
method: GET|POST|PUT|DELETE
response_format: json|xml|csv

2. GraphQL APIs

Query GraphQL endpoints with structured query definitions.

query {
  users {
    id
    name
    email
    organization {
      name
    }
  }
}
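As a rough illustration, a query like this is usually sent as a JSON body in an HTTP POST with `query` and `variables` keys, and the reply carries `data` and, on failure, `errors`. The helper names below are hypothetical, and the sketch stops short of making the network call.

```python
import json

QUERY = """
query {
  users {
    id
    name
    email
    organization {
      name
    }
  }
}
"""


def build_graphql_body(query, variables=None):
    """Serialize a GraphQL request body for an HTTP POST."""
    payload = {"query": query, "variables": variables or {}}
    return json.dumps(payload).encode("utf-8")


def extract_users(response):
    """Return the users list, raising if the server reported errors."""
    if response.get("errors"):
        raise RuntimeError(f"GraphQL errors: {response['errors']}")
    return response["data"]["users"]


body = build_graphql_body(QUERY)
# Hypothetical response, shaped after the query above.
users = extract_users({"data": {"users": [
    {"id": "1", "name": "Alice", "email": "alice@example.com",
     "organization": {"name": "Acme Corp"}}]}})
```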

3. OAuth-Protected APIs

Authenticate using OAuth 2.0 flows (authorization code, client credentials).

auth_type: oauth2
client_id: ${CLIENT_ID}
client_secret: ${CLIENT_SECRET}
token_endpoint: https://api.example.com/oauth/token
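For the client credentials flow, the token request might be assembled as below. This is a sketch under assumptions: the client authenticates with HTTP Basic, the function name `build_token_request` is hypothetical, and the request is only constructed, not sent. Some providers expect the credentials in the form body instead.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(token_endpoint, client_id, client_secret, scope=None):
    """Build (but do not send) a client-credentials token request."""
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope
    # Assumption: the provider accepts client authentication via HTTP Basic.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return Request(
        token_endpoint,
        data=urlencode(form).encode("ascii"),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


req = build_token_request("https://api.example.com/oauth/token", "id", "secret")
```

Passing `req` to `urllib.request.urlopen` would perform the call; the JSON reply is expected to contain an `access_token` field.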

4. API Key Authentication

Authenticate with a static API key sent under the name given in key_param (here X-API-Key, conventionally a request header).

auth_type: api_key
key_param: X-API-Key
key_value: ${API_KEY}

5. Bearer Token Authentication

Send a previously issued OAuth 2.0 bearer token in the Authorization header.

auth_type: bearer
token: ${ACCESS_TOKEN}
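The API key and bearer configurations above could be turned into request headers roughly as follows. The helpers `resolve` and `auth_headers` are hypothetical names, and treating `key_param` as a header name (rather than a query parameter) is an assumption.

```python
import os
from string import Template


def resolve(value, env=None):
    """Expand ${NAME} placeholders; raises KeyError if a variable is unset."""
    return Template(value).substitute(os.environ if env is None else env)


def auth_headers(config, env=None):
    """Build request headers for the api_key and bearer auth types."""
    auth_type = config.get("auth_type")
    if auth_type == "api_key":
        # Assumption: key_param names a request header.
        return {config["key_param"]: resolve(config["key_value"], env)}
    if auth_type == "bearer":
        return {"Authorization": "Bearer " + resolve(config["token"], env)}
    raise ValueError(f"unsupported auth_type: {auth_type!r}")


api_key_headers = auth_headers(
    {"auth_type": "api_key", "key_param": "X-API-Key", "key_value": "${API_KEY}"},
    env={"API_KEY": "k123"},
)
bearer_headers = auth_headers(
    {"auth_type": "bearer", "token": "${ACCESS_TOKEN}"},
    env={"ACCESS_TOKEN": "t456"},
)
```

Resolving placeholders at request time keeps the secrets out of the stored connector configuration.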

Pagination Strategies

Offset/Limit Pagination

type: offset
param_offset: offset
param_limit: limit
start_at: 0
page_size: 20
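A sketch of an offset/limit loop driven by this configuration. `fetch` is a hypothetical callable that takes the query parameters and returns a list of items; stopping on a short page is an assumption that holds only when the API never returns partial pages mid-stream.

```python
def paginate_offset(fetch, page_size=20, start_at=0,
                    param_offset="offset", param_limit="limit"):
    """Yield items page by page until a short (or empty) page is seen."""
    offset = start_at
    while True:
        items = fetch({param_offset: offset, param_limit: page_size})
        yield from items
        if len(items) < page_size:  # assumption: a short page is the last one
            break
        offset += page_size


# Stand-in for a real API: slices a local list.
DATA = list(range(45))
offsets_seen = []


def fake_fetch(params):
    offsets_seen.append(params["offset"])
    return DATA[params["offset"]:params["offset"] + params["limit"]]


offset_items = list(paginate_offset(fake_fetch))
```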

Page-Based Pagination

type: page
param_page: page
page_size: 30
start_at: 1
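Page-based pagination follows the same pattern with a page counter. In this sketch `fetch` is again a hypothetical callable, the `page_size` query parameter name is an assumption (the configuration above names only the page parameter), and `max_pages` is an added safety cap.

```python
def paginate_pages(fetch, page_size=30, start_at=1,
                   param_page="page", max_pages=1000):
    """Yield items from numbered pages, stopping at the first short page."""
    for page in range(start_at, start_at + max_pages):
        # Assumption: the API accepts the page size as "page_size".
        items = fetch({param_page: page, "page_size": page_size})
        yield from items
        if len(items) < page_size:
            return


# Stand-in for a real API: slices a local list by page number.
PAGE_DATA = list(range(65))
pages_seen = []


def fake_page_fetch(params):
    pages_seen.append(params["page"])
    start = (params["page"] - 1) * params["page_size"]
    return PAGE_DATA[start:start + params["page_size"]]


page_items = list(paginate_pages(fake_page_fetch))
```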

Cursor-Based Pagination

type: cursor
cursor_param: after
next_cursor_field: pageInfo.endCursor
has_next_field: pageInfo.hasNextPage
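The dotted field names in this configuration suggest a lookup into the response body. A sketch, assuming `fetch` is a hypothetical callable returning one parsed page and that items live under an `items` key (not specified above):

```python
from functools import reduce


def get_path(obj, dotted):
    """Follow a dotted path such as 'pageInfo.endCursor' through nested dicts."""
    return reduce(lambda cur, key: cur[key], dotted.split("."), obj)


def paginate_cursor(fetch, cursor_param="after",
                    next_cursor_field="pageInfo.endCursor",
                    has_next_field="pageInfo.hasNextPage",
                    items_field="items"):  # items_field is an assumption
    """Yield items, passing each page's end cursor to the next request."""
    cursor = None
    while True:
        params = {} if cursor is None else {cursor_param: cursor}
        page = fetch(params)
        yield from page[items_field]
        if not get_path(page, has_next_field):
            break
        cursor = get_path(page, next_cursor_field)


# Stand-in for a real API: pages keyed by the cursor that requests them.
CURSOR_PAGES = {
    None: {"items": [1, 2], "pageInfo": {"endCursor": "c1", "hasNextPage": True}},
    "c1": {"items": [3], "pageInfo": {"endCursor": "c2", "hasNextPage": False}},
}
cursor_items = list(paginate_cursor(lambda p: CURSOR_PAGES[p.get("after")]))
```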

Execution Steps

  1. Validate Configuration – Check endpoint, auth, and parameters
  2. Authenticate – Obtain credentials and tokens
  3. Make Request – Execute HTTP/GraphQL request
  4. Handle Pagination – Fetch all pages/results
  5. Parse Response – Extract and validate response data
  6. Transform Data – Convert to graph-ready format
  7. Generate Output – Create nodes, edges, or triples
  8. Feed to Pipeline – Pass to downstream transformation skills
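The steps above can be sketched as a small driver that takes each stage as a callable. All names here (`validate_config`, `run_connector`, and the stage parameters) are hypothetical; this is one possible arrangement, not the skill's actual implementation.

```python
def validate_config(config):
    """Step 1: fail early on missing or malformed settings."""
    missing = [k for k in ("endpoint", "method") if not config.get(k)]
    if missing:
        raise ValueError(f"missing config keys: {missing}")
    if not config["endpoint"].startswith(("http://", "https://")):
        raise ValueError("endpoint must be an http(s) URL")


def run_connector(config, authenticate, fetch_all, transform, sink):
    """Run the stages in order; each stage is injected as a callable."""
    validate_config(config)                     # 1. validate configuration
    headers = authenticate(config)              # 2. authenticate
    records = list(fetch_all(config, headers))  # 3-5. request, paginate, parse
    graph = transform(records)                  # 6-7. transform, generate output
    sink(graph)                                 # 8. feed to pipeline
    return graph


delivered = []
result = run_connector(
    {"endpoint": "https://api.example.com/users", "method": "GET"},
    authenticate=lambda cfg: {"Authorization": "Bearer t"},
    fetch_all=lambda cfg, headers: iter([{"id": 1}]),
    transform=lambda recs: {"nodes": [{"id": f"user_{r['id']}"} for r in recs],
                            "edges": []},
    sink=delivered.append,
)
```

Injecting the stages keeps the driver testable without network access, since each one can be replaced by a stub.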

Output Formats

Node-Edge Structure

{
  "nodes": [{"id": "...", "type": "...", "properties": {...}}],
  "edges": [{"source": "...", "target": "...", "type": "...", "properties": {...}}]
}
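Before handing this structure downstream, it may be worth checking that every edge points at a node that exists. A minimal sketch; the function name `dangling_edges` is hypothetical.

```python
def dangling_edges(graph):
    """Return edges whose source or target is not among the node ids."""
    node_ids = {node["id"] for node in graph["nodes"]}
    return [
        edge for edge in graph["edges"]
        if edge["source"] not in node_ids or edge["target"] not in node_ids
    ]


check_graph = {
    "nodes": [{"id": "user_1", "type": "Person", "properties": {}}],
    "edges": [
        {"source": "user_1", "target": "org_1", "type": "WORKS_AT", "properties": {}},
    ],
}
broken = dangling_edges(check_graph)  # org_1 has no node, so the edge is reported
```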

Graph Triples (RDF)

:entity1 :relationType :entity2 .
:entity1 :property "value" .
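One way to produce lines in this shape from the node-edge structure is sketched below. It assumes node ids, types, and property names are already valid local names under the default `:` prefix (which must be declared elsewhere), and it uses Turtle's `a` shorthand for the type statement. The function name `to_triples` is hypothetical.

```python
def to_triples(graph):
    """Render a node/edge structure as Turtle-style triple lines."""
    def literal(value):
        # Escape backslashes, quotes, and newlines for a quoted literal.
        text = str(value).replace("\\", "\\\\").replace('"', '\\"')
        return '"' + text.replace("\n", "\\n") + '"'

    lines = []
    for node in graph["nodes"]:
        subject = f":{node['id']}"
        lines.append(f"{subject} a :{node['type']} .")
        for key, value in node.get("properties", {}).items():
            lines.append(f"{subject} :{key} {literal(value)} .")
    for edge in graph["edges"]:
        lines.append(f":{edge['source']} :{edge['type']} :{edge['target']} .")
    return lines


triples = to_triples({
    "nodes": [
        {"id": "user_1", "type": "Person", "properties": {"name": "Alice"}},
        {"id": "org_1", "type": "Organization", "properties": {"name": "Acme Corp"}},
    ],
    "edges": [{"source": "user_1", "target": "org_1", "type": "WORKS_AT"}],
})
```

For production use, a dedicated RDF library would handle IRIs, datatypes, and escaping more completely than this string-based sketch.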

CSV Export

node_id,node_type,node_name,property1,property2
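A sketch of writing node rows under a header like this with the standard `csv` module. It assumes each node keeps its name and other attributes under `properties`, and that the caller supplies the extra column names; `nodes_to_csv` is a hypothetical helper.

```python
import csv
import io


def nodes_to_csv(nodes, property_names):
    """Serialize nodes to CSV text with fixed id/type/name columns first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["node_id", "node_type", "node_name", *property_names])
    for node in nodes:
        props = node.get("properties", {})
        writer.writerow([
            node["id"],
            node["type"],
            props.get("name", ""),
            *(props.get(name, "") for name in property_names),  # blank if absent
        ])
    return buffer.getvalue()


csv_text = nodes_to_csv(
    [{"id": "user_1", "type": "Person",
      "properties": {"name": "Alice", "email": "alice@example.com"}}],
    ["email"],
)
```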

Error Handling

The connector should handle:

  • Network Errors – Retry logic with exponential backoff
  • Authentication Errors – Token refresh, credential validation
  • Rate Limiting – Backoff and request throttling
  • Malformed Responses – Schema validation and error reporting
  • Timeouts – Connection and read timeout handling

Example retry strategy:

retry:
  max_attempts: 3
  backoff_factor: 2
  initial_delay: 1s
  max_delay: 60s
  retryable_status_codes: [429, 500, 502, 503, 504]
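How these settings combine is not spelled out above. The sketch below assumes the wait before retry n is `initial_delay * backoff_factor ** n`, capped at `max_delay`, and that `max_attempts` counts the first try. `send` is a hypothetical callable returning a `(status, body)` pair.

```python
import time

RETRYABLE = {429, 500, 502, 503, 504}


def backoff_delays(max_attempts=3, backoff_factor=2,
                   initial_delay=1.0, max_delay=60.0):
    """Waits before each retry; one fewer than the number of attempts."""
    return [min(initial_delay * backoff_factor ** i, max_delay)
            for i in range(max_attempts - 1)]


def call_with_retry(send, sleep=time.sleep, **policy):
    """Call send() until it returns a non-retryable status or attempts run out."""
    delays = backoff_delays(**policy)
    for attempt in range(len(delays) + 1):
        status, body = send()
        if status not in RETRYABLE or attempt == len(delays):
            return status, body
        sleep(delays[attempt])


# Stand-in for a flaky endpoint: two 503s, then success.
responses = iter([(503, ""), (503, ""), (200, "ok")])
slept = []
outcome = call_with_retry(lambda: next(responses), sleep=slept.append)
```

Adding random jitter to each delay and honoring a `Retry-After` header on 429 responses are common refinements not shown here.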

Recommended Libraries

  • HTTP Clients: requests, httpx, aiohttp
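  • GraphQL: gql (client for querying endpoints); graphene and strawberry (server-side schema libraries, useful for mocking or exposing an endpoint)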
  • OAuth: authlib, oauthlib
  • Data Validation: pydantic, jsonschema
  • Data Transformation: pandas, polars

Best Practices

✓ Respect API rate limits and terms of service

✓ Implement exponential backoff for retries

✓ Validate response schemas before processing

✓ Handle and log errors appropriately

✓ Cache results when possible

✓ Normalize and deduplicate entities

✓ Secure credentials (use environment variables)

✓ Monitor API changes and versioning

✓ Implement request timeout handling

✓ Document API assumptions and requirements
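As one illustration of the "normalize and deduplicate entities" item, nodes could be merged on a normalized key. The choice of key (type plus lowercased, trimmed name) is an assumption, and `dedupe_nodes` is a hypothetical helper.

```python
def dedupe_nodes(nodes, key=lambda n: (n["type"], n["name"].strip().lower())):
    """Merge nodes sharing a normalized key, keeping the first one's values."""
    merged = {}
    for node in nodes:
        k = key(node)
        if k in merged:
            # Fill in only the fields the first occurrence lacks.
            for field, value in node.items():
                merged[k].setdefault(field, value)
        else:
            merged[k] = dict(node)
    return list(merged.values())


unique = dedupe_nodes([
    {"id": "user_1", "type": "Person", "name": "Alice"},
    {"id": "user_7", "type": "Person", "name": " alice ",
     "email": "alice@example.com"},
    {"id": "org_1", "type": "Organization", "name": "Acme Corp"},
])
```

Edges that referenced a dropped id (here `user_7`) would need to be remapped to the surviving node, which this sketch leaves out.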

Integration with Downstream Skills

The ingested data feeds into:

  • JSON → Triples Converter – Transform to RDF
  • CSV → Graph Loader – Load into graph database
  • Text → Entity/Relation Extractor – Extract structured knowledge
  • ETL Pipeline Generator – Orchestrate full workflows
  • Schema Validation – Validate against graph schema

References

See connector-patterns.md for detailed API connector patterns and example-connectors.md for complete connector examples.


Version: 1.0.0

Version History

1 version in total

  • v1.0.0 Migrated from ClawHub (current)
    2026-06-07 11:11 Safe Safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
