Data Analysis

Analyze CSV/Excel files to extract insights, generate statistics, create charts, and produce summaries. Use when user wants to (1) upload or analyze spreadsh...
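Analyzes CSV/Excel files to extract insights, generate statistics, create charts, and output summaries. Suited to cases where the user needs to upload or analyze spreadsheets.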
di5cip1e
Uncategorized clawhub v1.0.0 1 version 100000 Key: not required
★ 0 stars · 📥 784 downloads · 💾 2 installs · 1 version · #latest

Overview

Data Analysis Skill

Analyze data files (CSV, Excel) and produce actionable insights.

Quick Start

  1. Read the file - Use the appropriate library:
    • CSV: csv module or pandas.read_csv()
    • Excel: pandas.read_excel() with the openpyxl engine
  2. Explore the data - Get shape, columns, dtypes, missing values
  3. Generate insights - Calculate:
    • Descriptive stats (mean, median, mode, std, min, max)
    • Correlations between numeric columns
    • Value counts for categorical columns
    • Trends over time if a date column exists
  4. Create visualizations - Use matplotlib:
    • Bar charts for categorical data
    • Line charts for time series
    • Histograms for distributions
    • Scatter plots for correlations
  5. Summarize - Write findings in plain English
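
The explore and insight steps above can be sketched as follows. This is a minimal illustration rather than a fixed recipe: the inline CSV and its column names (`date`, `region`, `units`, `price`) are hypothetical stand-ins for an uploaded file.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for an uploaded CSV file.
raw = io.StringIO(
    "date,region,units,price\n"
    "2024-01-05,North,10,2.5\n"
    "2024-01-20,South,4,3.0\n"
    "2024-02-11,North,,2.5\n"
    "2024-02-28,East,7,4.0\n"
)
df = pd.read_csv(raw, parse_dates=["date"])

# Explore: shape, dtypes, and missing values per column.
overview = {
    "shape": df.shape,
    "dtypes": df.dtypes.astype(str).to_dict(),
    "missing": df.isna().sum().to_dict(),
}

# Insights: descriptive stats and correlations for numeric columns,
# value counts for a categorical column.
numeric = df.select_dtypes("number")
stats = numeric.describe()          # count, mean, std, min, quartiles, max
correlations = numeric.corr()       # pairwise Pearson correlation
region_counts = df["region"].value_counts()
```

`describe()` does not report the mode; `numeric.mode()` can be called separately if it is needed.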

Common Patterns

Sales Data

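import pandas as pd

# Assumes sales.csv has 'amount', 'product', and 'date' columns.
df = pd.read_csv('sales.csv')
summary = {
    'total_revenue': df['amount'].sum(),
    'avg_order': df['amount'].mean(),
    # Five most frequently sold products (by row count, not revenue)
    'top_products': df['product'].value_counts().head(5),
    # Group by year-month so the same month in different years stays separate
    'monthly_trend': df.groupby(pd.to_datetime(df['date']).dt.to_period('M'))['amount'].sum()
}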

Customer Data

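# Assumes df holds one row per customer with 'segment', 'age', 'income', and 'id' columns.
demographics = df.groupby('segment').agg({
    'age': ['mean', 'median'],
    'income': ['mean', 'std'],
    'id': 'count'  # number of customers per segment (non-null ids)
})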

Time Series

df['date'] = pd.to_datetime(df['date'])
# 'ME' (month end) replaces the deprecated 'M' alias in pandas 2.2+; use 'M' on older versions.
monthly = df.resample('ME', on='date')['value'].sum()
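
As a small worked example of the same idea, the sketch below computes monthly totals and the month-over-month change on hypothetical data. It groups by `dt.to_period("M")` instead of calling `resample`, which is a substitution made here to sidestep the frequency-alias difference between pandas versions; the column names `date` and `value` follow the snippet above.

```python
import pandas as pd

# Hypothetical time series with irregular dates.
ts = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-02-11", "2024-03-02"]),
    "value": [10, 20, 15, 45],
})

# Sum values per calendar month (year-month periods).
monthly = ts.groupby(ts["date"].dt.to_period("M"))["value"].sum()

# Fractional change versus the previous month; the first month has no predecessor.
change = monthly.pct_change()
```

Unlike `resample`, this grouping omits months that have no rows at all, so gaps in the data do not show up as zero totals.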

Output Format

Always include:

  1. Overview - What the data contains (rows, columns, date range)
  2. Key Metrics - Top 5-10 actionable numbers
  3. Insights - 3-5 bullet points of what the data reveals
  4. Visualizations - At least 2 charts for any dataset with 100+ rows
  5. Recommendations - Suggested next steps based on findings
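
One way the Overview and Visualizations items might be produced is sketched below. The helper names `build_overview` and `make_charts` are hypothetical, as is the demo data, and the code reads the 100-row threshold as the point at which two charts become mandatory (smaller datasets could still be charted).

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; no display required
import matplotlib.pyplot as plt
import pandas as pd


def build_overview(df, date_col):
    """Return the overview line: row count, column count, date range."""
    dates = pd.to_datetime(df[date_col], errors="coerce").dropna()
    if dates.empty:
        date_range = "no valid dates"
    else:
        date_range = f"{dates.min():%Y-%m-%d} to {dates.max():%Y-%m-%d}"
    return f"{len(df)} rows, {df.shape[1]} columns, dates {date_range}"


def make_charts(df, category_col, value_col):
    """Return a bar chart and a histogram for datasets with 100+ rows."""
    if len(df) < 100:
        return []  # assumption: charts are only required from 100 rows up
    figures = []
    fig, ax = plt.subplots()
    df[category_col].value_counts().plot.bar(ax=ax, title=f"Rows per {category_col}")
    figures.append(fig)
    fig, ax = plt.subplots()
    df[value_col].plot.hist(ax=ax, bins=20, title=f"Distribution of {value_col}")
    figures.append(fig)
    return figures


# Hypothetical data: 120 daily rows across three segments.
demo = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=120, freq="D"),
    "segment": ["A", "B", "C"] * 40,
    "value": range(120),
})
overview = build_overview(demo, "date")
charts = make_charts(demo, "segment", "value")
```

Each returned figure can be written out with `fig.savefig(...)` and released afterwards with `plt.close(fig)`.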

Error Handling

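  • Handle missing values: df.dropna() to drop incomplete rows, or df.fillna(0) only where zero is a meaningful default (filling with 0 otherwise skews means and other statistics)
  • Handle date parsing: Use pd.to_datetime(..., errors='coerce') so unparseable values become NaT instead of raising
  • Handle large files: For files >100MB, process in chunks, e.g. with the chunksize parameter of pandas.read_csv()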
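
The three points above can be combined in a single pass, roughly as follows. The inline CSV, its column names, and the tiny `chunksize=2` are assumptions chosen to keep the example short; a real large file would use a path and a chunk size in the tens of thousands of rows or more.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV; a real file path is passed to read_csv the same way.
raw = io.StringIO(
    "date,amount\n"
    "2024-01-05,10.0\n"
    "not-a-date,5.0\n"
    "2024-02-11,\n"
    "2024-02-28,7.5\n"
)

total = 0.0
bad_dates = 0
missing_amounts = 0

# With chunksize set, read_csv yields DataFrames of at most that many rows.
for chunk in pd.read_csv(raw, chunksize=2):
    parsed = pd.to_datetime(chunk["date"], errors="coerce")  # invalid values become NaT
    bad_dates += int(parsed.isna().sum())
    missing_amounts += int(chunk["amount"].isna().sum())
    total += chunk["amount"].sum()  # sum() skips NaN by default
```

Counting the bad dates and missing amounts, rather than silently filling them, lets the final summary report how much of the data was unusable.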

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-03 04:19 Safe Safe

Security Check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
