Data Validator Pro

Data quality validation and profiling toolkit for tabular data. Use when checking data completeness, detecting anomalies, validating schemas, profiling datas...
Author: kaiyuelv
Source: clawhub
Category: Uncategorized
Version: v1.0.0 (latest; 1 version)
Key: not required
Stars: 0
Downloads: 342
Installs: 0

Overview

Data Quality Validator

Toolkit for validating and profiling tabular data quality.

Features

  • Schema validation - Check column types, constraints, and rules
  • Completeness analysis - Missing value detection and reporting
  • Anomaly detection - Statistical outlier detection
  • Profiling - Summary statistics and distribution analysis
  • Constraint checking - Range checks, uniqueness, regex patterns
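
To make the completeness analysis feature concrete, here is a minimal sketch of per-column missing-value reporting in plain pandas. It does not show the toolkit's own implementation: the helper name `missing_report` and the sample data are assumptions made for illustration only.

```python
import pandas as pd


def missing_report(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: count and percentage of missing values per column."""
    counts = df.isna().sum()
    # Guard against division by zero on an empty frame.
    percent = (counts / len(df) * 100).round(1) if len(df) else counts * 0.0
    return pd.DataFrame({"missing": counts, "percent": percent})


# Illustrative data, not taken from the toolkit.
df = pd.DataFrame({
    "age": [34, None, 51, 29],
    "email": ["a@x.io", "b@x.io", None, None],
})
report = missing_report(df)
print(report)
```

A report of this shape makes it easy to flag columns whose missing percentage exceeds a threshold you choose.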

Quick Start

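import pandas as pd

from scripts.data_profiler import DataProfiler
from scripts.schema_validator import SchemaValidator

# Sample data for illustration; replace with your own DataFrame
df = pd.DataFrame({
    "age": [34, 29, 51],
    "email": ["a@example.com", "b@example.com", "c@example.com"],
    "id": [1, 2, 3],
})

# Profile a dataset
profiler = DataProfiler()
report = profiler.profile(df)  # accepts a pandas DataFrame
print(report["missing"])   # missing-value summary
print(report["outliers"])  # detected outliers

# Validate against a schema: one rule set per column
schema = {
    "age": {"type": "int", "min": 0, "max": 150},
    "email": {"type": "str", "regex": r"^\S+@\S+\.\S+$"},
    "id": {"type": "int", "unique": True}
}
validator = SchemaValidator(schema)
errors = validator.validate(df)
for err in errors:
    print(err)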
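
The schema dictionary in the Quick Start combines type, range, regex, and uniqueness rules. As a rough sketch of how rules of that shape could be evaluated with plain pandas, consider the function below. It is not the `SchemaValidator` source: the function name `validate_frame`, the error message wording, and the sample data are all assumptions.

```python
import pandas as pd


def validate_frame(df: pd.DataFrame, schema: dict) -> list:
    """Hypothetical helper: return readable error strings for each broken rule."""
    errors = []
    for col, rules in schema.items():
        if col not in df.columns:
            errors.append(f"{col}: column is missing")
            continue
        values = df[col].dropna()
        # Type checks: skip the remaining rules if the type is wrong.
        if rules.get("type") == "int" and not pd.api.types.is_integer_dtype(values):
            errors.append(f"{col}: expected integer values")
            continue
        if rules.get("type") == "str" and not values.map(lambda v: isinstance(v, str)).all():
            errors.append(f"{col}: expected string values")
            continue
        # Range checks.
        if "min" in rules and (values < rules["min"]).any():
            errors.append(f"{col}: {(values < rules['min']).sum()} value(s) below min {rules['min']}")
        if "max" in rules and (values > rules["max"]).any():
            errors.append(f"{col}: {(values > rules['max']).sum()} value(s) above max {rules['max']}")
        # Regex check (assumes the column holds strings).
        if "regex" in rules:
            bad = ~values.str.match(rules["regex"])
            if bad.any():
                errors.append(f"{col}: {bad.sum()} value(s) do not match the pattern")
        # Uniqueness check.
        if rules.get("unique") and values.duplicated().any():
            errors.append(f"{col}: {values.duplicated().sum()} duplicated value(s)")
    return errors


# Illustrative data with one violation per column.
df = pd.DataFrame({
    "age": [34, 200, 51],
    "email": ["a@x.io", "not-an-email", "c@x.io"],
    "id": [1, 2, 2],
})
schema = {
    "age": {"type": "int", "min": 0, "max": 150},
    "email": {"type": "str", "regex": r"^\S+@\S+\.\S+$"},
    "id": {"type": "int", "unique": True},
}
errors = validate_frame(df, schema)
for err in errors:
    print(err)
```

Returning a list of messages rather than raising on the first failure lets a single pass report every problem in the dataset.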

Scripts

  • scripts/data_profiler.py - Dataset profiling and summary stats
  • scripts/schema_validator.py - Schema-based validation engine
  • scripts/anomaly_detector.py - Statistical anomaly detection
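
The listing does not say which statistical method `scripts/anomaly_detector.py` uses. As one common possibility, the sketch below applies the interquartile-range (IQR) rule with pandas. The function name `iqr_outliers`, the multiplier `k`, and the sample values are assumptions, not part of the toolkit.

```python
import pandas as pd


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Hypothetical helper: values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # NaN values compare as False on both sides, so they are never flagged.
    return series[(series < lower) | (series > upper)]


# Illustrative data with one obvious outlier.
values = pd.Series([10, 12, 11, 13, 12, 11, 95])
outliers = iqr_outliers(values)
print(outliers)
```

The IQR rule does not assume normally distributed data, which is why it is often preferred over z-scores for skewed columns.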

References

  • references/validation_rules.md - Common validation patterns

Version history

1 version in total

  • v1.0.0 (current)
    2026-05-08 13:41, security status: safe

Security scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
