
Data quality & reconciliation with exception reporting and no silent failure

Reconciles data sources using stable identifiers (Pay Number, driving licence, driver card, and driver qualification card numbers), producing exception reports and “no silent failure” checks. Use when you need weekly matching with explicit reasons for non-joins and mismatches.
kowl64
Developer tools · clawhub · v1.0.0 (latest) · 1 version · Key: not required · ★ 3 stars · 4,724 downloads · 453 installs

Overview

Data quality & reconciliation with exception reporting and no silent failure

PURPOSE

Reconciles data sources using stable identifiers (Pay Number, driving licence, driver card, and driver qualification card numbers), producing exception reports and “no silent failure” checks.

WHEN TO USE

  • TRIGGERS:
    • Reconcile these two data sources and produce an exceptions report with reasons.
    • Match names and payroll numbers across files and flag anything that does not join.
    • Build a ‘no silent failure’ check that stops the pipeline if counts do not match.
    • Create a weekly variance report for missing records, duplicates, and date gaps.
    • Design a data quality scorecard with thresholds and red flags.
  • DO NOT USE WHEN…
    • You need open-ended fuzzy matching without acceptance criteria.
    • There are no stable identifiers in any source.

INPUTS

  • REQUIRED:
    • At least two datasets (CSV/XLSX) with Pay Number and/or driver document numbers.
    • Which fields must match (e.g., Name, expiry date).
  • OPTIONAL:
    • Normalization rules (case, spaces, punctuation).
    • Thresholds for gates/scorecard (max % missing, etc.).
  • EXAMPLES:
    • Payroll export + compliance register
    • Two weekly exports from different systems

OUTPUTS

  • Reconciliation plan (matching rules, normalization, join strategy).
  • Exceptions report spec (CSV columns + reason codes) and variance checks.
  • Optional artifacts: assets/exceptions-report-template.csv + references/matching-rules.md.

Success = every record is categorized (matched/missing/duplicate/mismatch/invalid) with an explicit reason; pipelines stop on anomalies.
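The success criterion above can be expressed as a gate that raises instead of letting the pipeline continue. This is a minimal sketch: the names `check_gates` and `ReconciliationGateError`, the lower-case category strings, and the `max_unmatched_rate` parameter are illustrative assumptions, not part of the skill.

```python
from collections import Counter

# Category names taken from the success criterion; lower-case spelling is assumed.
CATEGORIES = {"matched", "missing", "duplicate", "mismatch", "invalid"}


class ReconciliationGateError(Exception):
    """Raised to stop the pipeline instead of continuing silently."""


def check_gates(total_records, categorized, max_unmatched_rate):
    """Fail loudly unless every record has exactly one known category.

    `categorized` maps a record identifier to its category string.
    """
    unknown = {c for c in categorized.values() if c not in CATEGORIES}
    if unknown:
        raise ReconciliationGateError(f"unknown categories: {sorted(unknown)}")
    if len(categorized) != total_records:
        raise ReconciliationGateError(
            f"{total_records} records in, {len(categorized)} categorized"
        )
    counts = Counter(categorized.values())
    unmatched_rate = counts["missing"] / total_records if total_records else 0.0
    if unmatched_rate > max_unmatched_rate:
        raise ReconciliationGateError(
            f"unmatched rate {unmatched_rate:.1%} exceeds {max_unmatched_rate:.1%}"
        )
    return counts
```

Raising an exception, rather than logging a warning, is what makes the failure non-silent: a scheduled weekly run halts until someone reviews the counts.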

WORKFLOW

  1. Confirm sources and key priority (Pay Number → Driver Card → Driving Licence → DQC, the driver qualification card).
  2. Normalize columns:
    • trim spaces; standardize case; strip common punctuation for document numbers.
  3. Validate keys:
    • flag blanks/invalid formats; identify duplicates per source.
  4. Join:
    • exact join on Pay Number; then attempt secondary joins only for remaining unmatched items.
  5. Produce exception categories with reasons:
    • Missing in A/B, Duplicate key, Field mismatch, Invalid key.
  6. “No silent failure” gates:
    • counts within tolerance; unmatched rate below threshold; duplicate spikes flagged.
  7. STOP AND ASK THE USER if:
    • columns are not mapped,
    • multiple competing IDs exist with no priority,
    • expected tolerances are unspecified.
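Steps 2 to 5 can be sketched in plain Python as a cascade of exact joins. The column names in `KEY_PRIORITY` and the tuple layout of the results are assumptions made for illustration. The skill does not specify key format patterns, so this sketch treats only rows with every key blank as invalid.

```python
import re
from collections import defaultdict

# Key priority from step 1; these column names are hypothetical.
KEY_PRIORITY = ["pay_number", "driver_card", "driving_licence", "dqc"]


def normalize_key(value):
    """Upper-case and strip whitespace and common punctuation."""
    return re.sub(r"[\s\-./]", "", value or "").upper()


def index_by_key(rows, remaining, key):
    """Map each non-blank normalized key value to the row positions holding it."""
    index = defaultdict(list)
    for pos in sorted(remaining):
        value = normalize_key(rows[pos].get(key))
        if value:
            index[value].append(pos)
    return index


def reconcile(rows_a, rows_b):
    """Return (matches, exceptions); every row lands in exactly one of them."""
    matches, exceptions = [], []
    rows = {"A": rows_a, "B": rows_b}
    remaining = {"A": set(range(len(rows_a))), "B": set(range(len(rows_b)))}

    for key in KEY_PRIORITY:
        index = {s: index_by_key(rows[s], remaining[s], key) for s in "AB"}
        # Duplicates are reported per source and withheld from joining.
        for s in "AB":
            for value, positions in list(index[s].items()):
                if len(positions) > 1:
                    for pos in positions:
                        exceptions.append(("DUPLICATE_KEY", s, pos, key))
                    remaining[s] -= set(positions)
                    del index[s][value]
        # Exact join on values that are unique in both sources.
        for value in sorted(index["A"].keys() & index["B"].keys()):
            pos_a, pos_b = index["A"][value][0], index["B"][value][0]
            matches.append((pos_a, pos_b, key))
            remaining["A"].discard(pos_a)
            remaining["B"].discard(pos_b)

    # Leftovers are never dropped: all-blank rows are invalid, others missing.
    for s, other in (("A", "B"), ("B", "A")):
        for pos in sorted(remaining[s]):
            has_key = any(normalize_key(rows[s][pos].get(k)) for k in KEY_PRIORITY)
            reason = f"MISSING_IN_{other}" if has_key else "INVALID_KEY"
            exceptions.append((reason, s, pos, None))
    return matches, exceptions
```

Because each row is removed from `remaining` as soon as it is matched or flagged as a duplicate, secondary keys are only tried for rows the higher-priority keys left unmatched, and the row counts in `matches` and `exceptions` always add up to the input size.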

OUTPUT FORMAT

exception_type,reason,source_a_id,source_b_id,pay_number,name,field,source_a_value,source_b_value

Reason codes: MISSING_IN_A, MISSING_IN_B, MISMATCH, DUPLICATE_KEY, INVALID_KEY.
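A report writer can enforce this column order and the reason codes at write time. The sketch below uses Python's standard `csv` module; the function name `write_exceptions` is hypothetical, and it assumes that the `reason` column carries one of the codes listed above while `exception_type` holds the free-text category.

```python
import csv
import io

COLUMNS = [
    "exception_type", "reason", "source_a_id", "source_b_id", "pay_number",
    "name", "field", "source_a_value", "source_b_value",
]
REASON_CODES = {
    "MISSING_IN_A", "MISSING_IN_B", "MISMATCH", "DUPLICATE_KEY", "INVALID_KEY",
}


def write_exceptions(exceptions, stream):
    """Write exception dicts as CSV, rejecting unknown reason codes."""
    writer = csv.DictWriter(
        stream, fieldnames=COLUMNS, restval="", lineterminator="\n"
    )
    writer.writeheader()
    for row in exceptions:
        if row.get("reason") not in REASON_CODES:
            raise ValueError(f"unknown reason code: {row.get('reason')!r}")
        # DictWriter raises ValueError on keys outside COLUMNS by default.
        writer.writerow(row)
    return stream


if __name__ == "__main__":
    buffer = io.StringIO()
    write_exceptions(
        [{"exception_type": "Missing in B", "reason": "MISSING_IN_B",
          "source_a_id": "17", "pay_number": "P001"}],
        buffer,
    )
    print(buffer.getvalue())
```

Missing columns are written as empty strings, so a MISSING_IN_B row can leave `source_b_id` and the value columns blank without breaking the schema.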

SAFETY & EDGE CASES

  • Read-only by default; don’t auto-edit source data. Route exceptions to review.
  • Deterministic matching rules first; avoid fuzzy matching unless explicitly requested.
  • Always produce an exceptions report; never drop unmatched rows.

EXAMPLES

  • Input: “Payroll vs compliance; match by Pay Number; flag name mismatch.”
    Output: join plan + mismatch reasons + exceptions report schema.
  • Input: “Some rows have blank Pay Number.”
    Output: secondary key matching + invalid-key exceptions for truly unmatchable rows.
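For the first example, once two rows have been joined on Pay Number, the field comparison might look like the sketch below. The helper names and the normalization rule (collapse whitespace, ignore case) are assumptions; the actual rules should come from the user's normalization inputs.

```python
def normalize_text(value):
    """Collapse whitespace and ignore case so cosmetic differences do not flag."""
    return " ".join((value or "").split()).casefold()


def field_mismatches(row_a, row_b, fields):
    """Return one MISMATCH record per compared field whose values differ."""
    found = []
    for field in fields:
        a, b = row_a.get(field), row_b.get(field)
        if normalize_text(a) != normalize_text(b):
            found.append({
                "reason": "MISMATCH",
                "pay_number": row_a.get("pay_number", ""),
                "field": field,
                # Original values are reported unchanged for the reviewer.
                "source_a_value": a,
                "source_b_value": b,
            })
    return found


if __name__ == "__main__":
    payroll = {"pay_number": "P001", "name": "Ann  Lee", "expiry_date": "2026-05-01"}
    register = {"pay_number": "P001", "name": "ann lee", "expiry_date": "2026-06-01"}
    for record in field_mismatches(payroll, register, ["name", "expiry_date"]):
        print(record)
```

Comparison happens on normalized values, but the report keeps the raw values from each source, which matches the read-only rule: reviewers see exactly what each system holds.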

Version history

1 version in total

  • v1.0.0 (current)
    2026-03-28 10:09 · Safe · Safe

Security checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
