
Data Toolkit

Complete data conversion, validation, and cleaning toolkit. Converts between JSON, CSV, YAML, and XML, validates schemas, and removes duplicates and nulls.
atlasnexusops
Uncategorized · clawhub · v1.0.1 · 1 version · 100000 · Key: not required
★ 0 stars · 📥 342 downloads · 💾 0 installs · 1 version · #latest

Overview

Data Toolkit

Complete data processing utilities for OpenClaw agents.

Features

Converters

  • JSON ↔ CSV - Bidirectional conversion with schema inference
  • JSON ↔ YAML - Clean formatting, comment preservation
  • JSON ↔ XML - Configurable root elements and attributes
  • CSV ↔ YAML - Direct conversion without intermediate steps
  • Multi-format batch conversion - Process entire directories
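
As a rough illustration of what "schema inference" can mean for JSON to CSV conversion, the sketch below derives the CSV header from the union of keys across all records. It uses only the standard library; the function names (`infer_columns`, `json_to_csv_text`, `csv_text_to_records`) are hypothetical and are not taken from the toolkit's own code.

```python
import csv
import io
import json


def infer_columns(records):
    """Collect column names across all records, in first-seen order."""
    columns = []
    seen = set()
    for record in records:
        for key in record:
            if key not in seen:
                seen.add(key)
                columns.append(key)
    return columns


def json_to_csv_text(json_text):
    """Convert a JSON array of flat objects to CSV text.

    Assumption: every record is a flat dict; nested values are not handled.
    """
    records = json.loads(json_text)
    columns = infer_columns(records)
    buffer = io.StringIO()
    # restval fills in cells for keys that a record does not have.
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def csv_text_to_records(csv_text):
    """Convert CSV text back to a list of dicts (all values become strings)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


sample = '[{"id": 1, "name": "Ada"}, {"id": 2, "email": "b@example.com"}]'
csv_text = json_to_csv_text(sample)
print(csv_text)
```

Note that the round trip is lossy: CSV carries no type information, so `csv_text_to_records` returns every value as a string.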

Validators

  • JSON Schema validation - Validate against JSON Schema specs
  • CSV structure validation - Check headers, columns, data types
  • Data type inference - Automatic type detection and validation
  • Custom rules - Define business logic validations
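
One plausible way to implement data type inference for CSV columns is to classify each cell and then pick the narrowest type that fits the whole column. The sketch below is an assumption about how such a check might work, not the toolkit's implementation; `infer_value_type` and `infer_column_types` are hypothetical names.

```python
import csv
import io


def infer_value_type(text):
    """Classify one CSV cell as 'empty', 'bool', 'int', 'float', or 'str'."""
    stripped = text.strip()
    if stripped == "":
        return "empty"
    if stripped.lower() in ("true", "false"):
        return "bool"
    try:
        int(stripped)
        return "int"
    except ValueError:
        pass
    try:
        float(stripped)
        return "float"
    except ValueError:
        return "str"


def infer_column_types(csv_text):
    """Return {column name: inferred type} for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    observed = {name: set() for name in fieldnames}
    for row in reader:
        for name in fieldnames:
            # Short rows yield None for missing cells; treat them as empty.
            observed[name].add(infer_value_type(row[name] or ""))

    result = {}
    for name, kinds in observed.items():
        kinds.discard("empty")  # empty cells do not decide the type
        if not kinds:
            result[name] = "empty"
        elif kinds == {"int"}:
            result[name] = "int"
        elif kinds <= {"int", "float"}:
            result[name] = "float"  # mixed ints and floats widen to float
        elif kinds == {"bool"}:
            result[name] = "bool"
        else:
            result[name] = "str"
    return result


sample = "id,price,active,note\n1,9.5,true,ok\n2,10,false,\n"
print(infer_column_types(sample))
```

Columns with mixed, incompatible values fall back to `str`, which is the conservative choice when the data cannot be trusted.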

Cleaners

  • Duplicate removal - Smart deduplication with configurable keys
  • Null/empty handling - Remove or replace null values
  • Data normalization - Standardize formats (dates, numbers, strings)
  • Whitespace cleanup - Trim, collapse multiple spaces
  • Column operations - Remove, rename, reorder columns
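
Deduplication with configurable keys and whitespace cleanup can be sketched in a few lines of standard-library Python. This is a minimal illustration under the assumption that records are flat dicts with hashable key values; `clean_whitespace` and `remove_duplicates` here are hypothetical helpers, not the toolkit's functions.

```python
import re


def clean_whitespace(value):
    """Trim a string and collapse internal runs of whitespace to one space."""
    if isinstance(value, str):
        return re.sub(r"\s+", " ", value).strip()
    return value  # non-string values pass through unchanged


def remove_duplicates(records, keys):
    """Keep the first record for each distinct combination of `keys`."""
    seen = set()
    unique = []
    for record in records:
        # Assumes the key values are hashable (strings, numbers, None).
        signature = tuple(record.get(key) for key in keys)
        if signature in seen:
            continue
        seen.add(signature)
        unique.append(record)
    return unique


records = [
    {"id": 1, "name": "  Ada   Lovelace "},
    {"id": 2, "name": "Bob"},
    {"id": 1, "name": "Ada L."},
]
cleaned = [
    {key: clean_whitespace(value) for key, value in record.items()}
    for record in records
]
deduped = remove_duplicates(cleaned, keys=["id"])
print(deduped)
```

Cleaning whitespace before deduplicating matters when the dedupe key is itself a string, since `"Ada "` and `"Ada"` would otherwise count as different values.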

Get Data Toolkit

🛒 Gumroad (€10): https://nexusatlas.gumroad.com/l/bsyacx

📦 ClawHub: https://clawhub.ai/skills/data-toolkit

MIT License. Requires Python 3.8+; see Installation for the PyYAML and optional pandas dependencies.

Usage

Convert Data

# JSON to CSV
./src/convert.py --input data.json --output data.csv --format csv

# CSV to JSON
./src/convert.py --input data.csv --output data.json --format json

# JSON to YAML
./src/convert.py --input data.json --output data.yaml --format yaml

# XML to JSON
./src/convert.py --input data.xml --output data.json --format json

# Batch conversion
./src/convert.py --input-dir ./raw --output-dir ./processed --format json
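
The commands above do not show how JSON objects map to XML elements. One simple mapping, sketched here with `xml.etree.ElementTree`, turns each key into a child element and each list entry into an `<item>` element under a configurable root. The mapping, the `item` tag, and the function names are assumptions made for illustration; the toolkit may use a different convention.

```python
import json
import xml.etree.ElementTree as ET


def value_to_element(tag, value):
    """Build an element named `tag` from a JSON-compatible value.

    Assumption: dict keys are valid XML element names.
    """
    element = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            element.append(value_to_element(key, child))
    elif isinstance(value, list):
        for entry in value:
            element.append(value_to_element("item", entry))
    elif value is not None:
        element.text = str(value)
    return element


def json_to_xml(json_text, root_tag="root"):
    """Convert JSON text to an XML string with a configurable root element."""
    data = json.loads(json_text)
    return ET.tostring(value_to_element(root_tag, data), encoding="unicode")


def element_to_value(element):
    """Reverse the mapping; leaf values come back as strings."""
    children = list(element)
    if not children:
        return element.text
    if all(child.tag == "item" for child in children):
        return [element_to_value(child) for child in children]
    return {child.tag: element_to_value(child) for child in children}


xml_text = json_to_xml('{"user": {"id": 1, "tags": ["a", "b"]}}', root_tag="data")
print(xml_text)
print(element_to_value(ET.fromstring(xml_text)))
```

As with CSV, XML text nodes carry no type information, so numbers return as strings unless a separate type inference step is applied.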

Validate Data

# Validate against JSON schema
./src/validate.py --input data.json --schema schema.json

# Validate CSV structure
./src/validate.py --input data.csv --check-headers --check-types

# Custom validation rules
./src/validate.py --input data.json --rules validation-rules.yaml
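
The format of `validation-rules.yaml` is not documented here. Purely as an illustration of rule-driven validation, the sketch below expresses rules as Python dicts with hypothetical keys (`field`, `type`, `required`, `min`, `allowed`) and returns a list of error messages per record. None of these names come from the toolkit.

```python
def validate_record(record, rules):
    """Check one record against a list of rule dicts; return error strings."""
    errors = []
    for rule in rules:
        field = rule["field"]
        if record.get(field) is None:
            if rule.get("required", False):
                errors.append(f"{field}: missing required value")
            continue  # optional and absent: nothing more to check

        value = record[field]
        expected = rule.get("type")
        if expected is not None and not isinstance(value, expected):
            errors.append(
                f"{field}: expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )
            continue  # skip range checks on a value of the wrong type
        if "min" in rule and value < rule["min"]:
            errors.append(f"{field}: {value} is below minimum {rule['min']}")
        if "allowed" in rule and value not in rule["allowed"]:
            errors.append(f"{field}: {value!r} is not an allowed value")
    return errors


# Hypothetical rule set, written inline instead of loaded from YAML.
rules = [
    {"field": "id", "type": int, "required": True},
    {"field": "age", "type": int, "min": 0},
    {"field": "status", "allowed": ["active", "inactive"]},
]

good = validate_record({"id": 1, "age": 30, "status": "active"}, rules)
bad = validate_record({"age": -5, "status": "unknown"}, rules)
print(good)
print(bad)
```

Returning a list of messages instead of raising on the first failure lets a caller report every problem in a record at once.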

Clean Data

# Remove duplicates
./src/clean.py --input data.json --dedupe --key id

# Handle nulls
./src/clean.py --input data.csv --remove-nulls
./src/clean.py --input data.csv --replace-nulls "N/A"

# Normalize data
./src/clean.py --input data.json --normalize dates,numbers,strings

# Full cleanup pipeline
./src/clean.py --input messy.csv --dedupe --remove-nulls --normalize all --output clean.csv
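
The null handling and date normalization steps of such a pipeline might look like the following sketch. The list of null markers and the accepted date formats are assumptions chosen for the example, and the helper names are hypothetical; the toolkit's actual behavior may differ.

```python
from datetime import datetime

# Assumed spellings that count as "null" in text data.
NULL_MARKERS = {"", "null", "none", "n/a", "nan"}

# Assumed input formats, tried in order. "%d/%m/%Y" reads 07/05/2026 as
# 7 May, so US-style month-first dates would need a different format list.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y.%m.%d")


def is_null(value):
    """True for None and for strings that match a null marker."""
    if value is None:
        return True
    return isinstance(value, str) and value.strip().lower() in NULL_MARKERS


def replace_nulls(record, replacement):
    """Return a copy of the record with null values replaced."""
    return {
        key: (replacement if is_null(value) else value)
        for key, value in record.items()
    }


def drop_null_rows(records):
    """Keep only records that contain no null values."""
    return [
        record
        for record in records
        if not any(is_null(value) for value in record.values())
    ]


def normalize_date(text):
    """Return the date as ISO 8601 (YYYY-MM-DD), or the input if unparseable."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return text


rows = [
    {"id": "1", "joined": "07/05/2026"},
    {"id": "2", "joined": "N/A"},
    {"id": "3", "joined": "2026.05.09"},
]
kept = drop_null_rows(rows)
normalized = [dict(row, joined=normalize_date(row["joined"])) for row in kept]
print(normalized)
```

Leaving unparseable dates unchanged, instead of raising, keeps the pipeline running on messy input; a stricter variant could collect those values for review.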

API Usage (Python)

from data_toolkit import convert, validate, clean

# Convert
convert.json_to_csv('input.json', 'output.csv')
convert.csv_to_yaml('input.csv', 'output.yaml')

# Validate
is_valid = validate.json_schema('data.json', 'schema.json')
errors = validate.csv_structure('data.csv')

# Clean
clean.remove_duplicates('data.json', key='id')
clean.normalize_dates('data.csv', format='ISO8601')

Examples

See examples/ directory for complete workflows:

  • examples/etl-pipeline.sh - Full ETL workflow
  • examples/api-data-processing.py - API response processing
  • examples/batch-conversion.sh - Bulk file conversion

Installation

Dependencies are minimal and common:

  • Python 3.8+
  • PyYAML
  • pandas (optional, for advanced CSV operations)

pip install pyyaml pandas

Requirements

  • Node.js (for JSON/YAML parsing)
  • Python 3.8+
  • 10MB disk space

License

MIT

Support

Issues: https://github.com/forge-agent/data-toolkit

Docs: See docs/ directory

Version history

1 version in total

  • v1.0.1 (current)
    2026-05-07 19:44 · Safe · Safe

Security scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
