> AI Agent Skill Quality Assurance — This skill adapts the CORE-EEAT framework to evaluate OpenClaw Skills, ensuring they deliver meaningful utility while maintaining security and reliability.
OpenClaw Skills are modular capability extensions for AI agents, defined by SKILL.md files with YAML frontmatter and prompt instructions. This skill evaluates skill quality through 80 standardized criteria across 8 core dimensions, generating comprehensive audit reports including utility scores, security assessments, and actionable improvement recommendations.
Core Transformation:
Every OpenClaw Skill consists of:
my-skill/
├── SKILL.md          # Core definition (YAML + Markdown instructions)
├── scripts/          # Optional executable scripts
│   └── main.py
└── references/       # Optional configuration and resources
    └── config.json
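As a rough illustration of the layout above, the following sketch checks a skill directory for the expected entries. The function name `check_skill_layout` is hypothetical, and treating only `SKILL.md` as required is an assumption drawn from the "Optional" comments in the tree, not a documented OpenClaw rule.

```python
from pathlib import Path

# Names taken from the layout above; only SKILL.md is assumed to be required.
REQUIRED_FILES = ("SKILL.md",)
OPTIONAL_DIRS = ("scripts", "references")


def check_skill_layout(skill_dir):
    """Return a dict describing which expected entries exist in skill_dir."""
    root = Path(skill_dir)
    return {
        "missing_required": [n for n in REQUIRED_FILES if not (root / n).is_file()],
        "optional_present": [n for n in OPTIONAL_DIRS if (root / n).is_dir()],
    }
```

A non-empty `missing_required` list would mean the directory cannot be evaluated as a skill at all; `optional_present` only tells the auditor which later checks apply.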
Key Components:
Use this skill when users request:
This skill can:
This skill supports 6 OpenClaw Skill types, each with different evaluation priorities:
When: Before installing any skill
Duration: 2-5 minutes
Items:
Deliverable: Metadata Validation Report
Failure: Do not install. Contact skill author or fix manually.
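A minimal sketch of what a metadata check might look like, using only the standard library. It assumes the frontmatter is delimited by `---` lines and reads only top-level `key: value` pairs, so it is not a YAML parser. The required field names (`name`, `description`) and both function names are assumptions for illustration.

```python
def split_frontmatter(text):
    """Split a SKILL.md string into (frontmatter_lines, body).

    Assumes the frontmatter is delimited by lines containing only '---'.
    Returns (None, text) when no complete frontmatter block is found.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return lines[1:i], "\n".join(lines[i + 1:])
    return None, text  # opening delimiter without a closing one


def validate_metadata(text, required=("name", "description")):
    """Return a list of problems; an empty list means the check passed.

    Only top-level 'key: value' lines are inspected.
    """
    front, _body = split_frontmatter(text)
    if front is None:
        return ["frontmatter block missing or unterminated"]
    fields = {}
    for line in front:
        if line[:1] in (" ", "\t", "#", "-") or ":" not in line:
            continue  # skip nested entries, comments, and list items
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip().strip("\"'")
    return [f"missing or empty field: {k}" for k in required if not fields.get(k)]
```

For a real audit, a full YAML parser would be the safer choice, since this sketch ignores nested structures and multi-line values.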
When: After metadata validation, before activation
Duration: 1-2 minutes
Items:
Deliverable: Gating Compatibility Report
Failure: Skill will not activate. Fix environment or choose alternative.
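The gating step can be approximated with standard-library probes. This sketch assumes the three gate categories that appear in the example reports (binaries, environment variables, operating system); the function name `check_gating` and its parameters are hypothetical, and how OpenClaw itself declares or evaluates these requirements is not specified here.

```python
import os
import platform
import shutil


def check_gating(bins=(), env=(), os_names=()):
    """Return human-readable gating failures; an empty list means all gates pass.

    bins: executables that must be on PATH
    env: environment variables that must be set and non-empty
    os_names: allowed values of platform.system(), e.g. "Darwin", "Linux"
    """
    failures = []
    for name in bins:
        if shutil.which(name) is None:
            failures.append(f"binary not found on PATH: {name}")
    for var in env:
        if not os.environ.get(var):
            failures.append(f"environment variable not set: {var}")
    if os_names and platform.system() not in os_names:
        failures.append(f"unsupported OS: {platform.system()}")
    return failures
```

For example, `check_gating(bins=["playwright"], env=["PLAYWRIGHT_BROWSERS_PATH"])` would return an empty list only on a machine where both are available.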
When: Before first execution
Duration: 3-5 minutes
Items:
Deliverable: Security Pre-Check Report
Failure: Do not execute. Review code or choose alternative.
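One check that fits a pre-execution pass is a scan for hard-coded credentials, the kind of data-leakage risk the audit examples label T02. The sketch below is a heuristic only: the key-name pattern, the `${NAME}` placeholder convention, and the function name `find_hardcoded_secrets` are assumptions, and a clean result does not prove a skill is safe.

```python
import re

# Hypothetical heuristic: a secret-looking key followed by a quoted literal.
SECRET_KEY = re.compile(
    r"""(?ix)
    \b([\w-]*(?:password|passwd|secret|token|api[_-]?key)[\w-]*)  # key name
    \s*[:=]\s*
    (["'])(.+?)\2                                                  # quoted value
    """
)
# Values of the form ${NAME} are treated as environment placeholders, not secrets.
PLACEHOLDER = re.compile(r"^\$\{[A-Za-z_][A-Za-z0-9_]*\}$")


def find_hardcoded_secrets(text):
    """Return (line_number, key) pairs for quoted literals assigned to secret-like keys."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SECRET_KEY.finditer(line):
            if not PLACEHOLDER.match(match.group(3)):
                hits.append((lineno, match.group(1)))
    return hits
```

Unquoted values and secrets stored under innocuous key names are not detected, so this complements manual review rather than replacing it.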
When: During skill development or installation
Duration: 5-10 minutes
Items:
Deliverable: Prompt Quality Report
Failure: Skill may misbehave. Refine SKILL.md instructions.
When: For skills with executable scripts
Duration: 10-20 minutes
Items:
Deliverable: Script Security Audit
Failure: Security risk. Audit scripts or avoid skill.
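For Python scripts under `scripts/`, part of this audit could be automated with the standard `ast` module, which parses code without running it. The sketch below flags a small, hypothetical deny-list of calls plus `subprocess` calls with a literal `shell=True`. The list and the function names are illustrative assumptions, and aliased imports (for example `import os as o`) are not resolved.

```python
import ast

# Hypothetical deny-list; a real audit would need a broader, reviewed set.
RISKY_CALLS = {"eval", "exec", "os.system", "os.popen"}


def _dotted_name(node):
    """Return 'a.b.c' for Name/Attribute chains, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_risky_calls(source):
    """Return sorted (line, description) pairs for calls that deserve manual review."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        name = _dotted_name(node.func)
        if name in RISKY_CALLS:
            findings.append((node.lineno, name))
        elif name and name.startswith("subprocess."):
            for kw in node.keywords:
                # Flag only a literal shell=True; dynamic values are not resolved.
                if (kw.arg == "shell" and isinstance(kw.value, ast.Constant)
                        and kw.value.value is True):
                    findings.append((node.lineno, f"{name}(shell=True)"))
    return sorted(findings)
```

Each finding is a prompt for manual review, not a verdict: a flagged call may be safe in context, and an unflagged script may still be unsafe.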
When: During skill evaluation
Duration: 15-30 minutes
Items:
Deliverable: Utility Test Report
Failure: Skill doesn't deliver value. Consider alternatives.
When: Selecting between multiple skills for same task
Duration: 10-15 minutes
Items:
Deliverable: Comparative Analysis Report
Failure: Alternative skill may be better. Choose based on scores.
When: Critical skills, major updates, security reviews
Duration: 30-60 minutes
Items: All 80 items across all 8 dimensions
Deliverable: Comprehensive EEAT Audit Report
Failure: Address critical issues before production use.
User Request:
Check if this skill is safe to install:
clawhub install agent-browser
Response:
## EEAT Skill Security Audit: agent-browser
### Gate 1: Metadata Validation ✅
- [x] YAML frontmatter valid
- [x] Description clear: "Browser automation for web scraping"
- [x] Version: 1.2.3
- [x] Dependencies: playwright (verified safe)
### Gate 2: Gating Check ⚠️
- [x] Bins: playwright installed
- [x] Env: PLAYWRIGHT_BROWSERS_PATH set
- [ ] OS: macOS/Windows only (Linux limited support)
### Gate 3: Security Pre-Check ✅
- [x] T01: No command injection detected
- [x] T02: No data leakage risks
- [x] T03: Input validation on URLs
- [x] T04: Minimal permissions (browser control only)
### Security Score: 92/100
**Security Status**: CAUTION ⚠️
- Linux support limited
- Requires browser installation (system-level change)
### Recommendation
✅ **Install with Caution**
- Safe for macOS/Windows users
- Linux users should check compatibility
- Review permissions before first use
User Request:
Compare tavily-search vs web-search for web search
Response:
## Skill Comparative Analysis: Web Search
### Skill A: tavily-search
**Utility Score**: 85/100
- C05: High accuracy search results ✅
- C06: Clean, structured output ✅
- Exp01: Excellent usage examples ✅
- E01: AI-optimized for agents 🎯
**Security Score**: 95/100
- T01-T04: All checks pass ✅
- T07: No external dependencies ✅
**Overall**: 88/100 (Very Good)
### Skill B: web-search (bundled)
**Utility Score**: 70/100
- C05: Basic search functionality ✅
- C06: Output quality varies ⚠️
- Exp01: Limited examples ⚠️
- E01: Generic implementation ⚠️
**Security Score**: 90/100
- T01-T04: Safe but basic ✅
- T07: Uses external APIs ⚠️
**Overall**: 75/100 (Good)
### Recommendation
✅ **Choose tavily-search**
- +15 points higher utility (85 vs. 70)
- AI-optimized for agent workflows
- Better structured output for downstream processing
- Higher security score
**Trade-off**: Requires API key setup vs. no setup for web-search
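The score gaps behind this recommendation can be checked with a few lines of arithmetic. The helper name `score_deltas` is hypothetical; the numbers are copied from the comparison above. How the Overall figure is derived from the other two scores is not stated, so it is treated here as a given input.

```python
def score_deltas(a, b):
    """Return a - b for every score key present in both dicts."""
    return {key: a[key] - b[key] for key in a if key in b}


# Scores copied from the comparison above.
tavily = {"utility": 85, "security": 95, "overall": 88}
web = {"utility": 70, "security": 90, "overall": 75}

print(score_deltas(tavily, web))
# {'utility': 15, 'security': 5, 'overall': 13}
```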
User Request:
Audit my custom skill: email-automation
[Paste SKILL.md content]
Response:
## EEAT Skill Audit: email-automation
### Overview
- **Skill Type**: Productivity
- **Total Score**: 64/100 (Fair)
- **Utility Score**: 75/100 (Good)
- **Security Score**: 45/100 (CAUTION ⚠️)
- **Veto Item**: ⚠️ T02 Triggered (Data Leakage Risk)
### Dimension Scores
| Dimension | Score | Rating | Weight | Weighted Score |
|-----------|-------|--------|--------|----------------|
| C — Correctness | 70/100 | Good | 30% | 21 |
| O — Organization | 65/100 | Fair | 5% | 3.25 |
| R — Referenceability | 50/100 | Fair | 25% | 12.5 |
| Exp — Experience | 80/100 | Good | 20% | 16 |
| Ept — Expertise | 55/100 | Fair | 15% | 8.25 |
| E — Exclusivity | 40/100 | Poor | 0% | 0 |
| A — Authority | 60/100 | Fair | 5% | 3 |
| T — Trust | 45/100 | Poor | 0% | 0 |
| **Weighted Total** | | | | **64** |
### Critical Issues (Veto Items)
⚠️ **T02: Data Leakage Risk**
**Issue**: Skill stores API credentials in plain text in SKILL.md
credentials:
  smtp_password: "mypassword123" # ⚠️ SECURITY RISK
**Action**: Move credentials to environment variables
credentials:
  smtp_password: "${SMTP_PASSWORD}" # ✅ SECURE
### Top 5 Priority Improvements
1. **T02 Data Leakage** — Remove hardcoded credentials
- Current: Fail | Potential Gain: 8 weighted points
- Action: Use environment variables for all secrets
2. **R02 Coverage** — Add error handling examples
- Current: Fail | Potential Gain: 6.25 weighted points
- Action: Document error scenarios and recovery
3. **Ept01 Documentation** — Improve prompt instructions
- Current: Partial | Potential Gain: 4.5 weighted points
- Action: Add step-by-step usage examples
4. **R03 Source Authority** — Verify email library security
- Current: Partial | Potential Gain: 3.75 weighted points
- Action: Audit nodemailer dependency for vulnerabilities
5. **O01 Structure** — Add scripts/ directory for complex logic
- Current: Partial | Potential Gain: 2.5 weighted points
- Action: Move complex operations to Python scripts
### Action Plan
#### Quick Wins (Fix immediately)
- [ ] Move all credentials to environment variables
- [ ] Add error handling documentation
#### Medium Investment (This week)
- [ ] Add comprehensive usage examples
- [ ] Implement proper logging in scripts
#### Strategic (Next sprint)
- [ ] Add test suite with edge cases
- [ ] Implement retry logic for failed sends
- [ ] Add HTML email support
### Recommendation
⚠️ **Do Not Install Until Fixed**
- Security risk (T02 veto) must be addressed
- After fixes, expected score: 78/100 (Good)
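The weighted total in the Dimension Scores table of this example can be reproduced with the short calculation below. The scores and weights are copied from that table. The rule in `recommend` (any triggered veto item blocks installation regardless of score) is an assumption inferred from the example's recommendation, and both function names are hypothetical.

```python
# (score, weight_percent) per dimension, copied from the Dimension Scores table.
DIMENSIONS = {
    "C": (70, 30),
    "O": (65, 5),
    "R": (50, 25),
    "Exp": (80, 20),
    "Ept": (55, 15),
    "E": (40, 0),
    "A": (60, 5),
    "T": (45, 0),
}


def weighted_total(dimensions):
    """Sum score * weight / 100 over all dimensions; weights must total 100."""
    if sum(weight for _, weight in dimensions.values()) != 100:
        raise ValueError("dimension weights must sum to 100")
    return sum(score * weight for score, weight in dimensions.values()) / 100


def recommend(total, veto_items):
    """Assumed rule: any triggered veto item blocks installation regardless of score."""
    if veto_items:
        return "Do Not Install Until Fixed"
    return f"Eligible (score {total:g}/100)"


total = weighted_total(DIMENSIONS)
print(total)                      # 64.0
print(recommend(total, ["T02"]))  # Do Not Install Until Fixed
```

Because the Trust dimension carries a 0% weight in this example, its low score affects the outcome only through the veto rule, not through the weighted total.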
- references/openclaw-skill-benchmark.md: Complete 80-item benchmark adapted for OpenClaw Skills
- references/skill-security-checklist.md: Security-specific evaluation criteria
- references/utility-testing-guide.md: How to test skill utility and comparative value
- workflow-optimization-analysis.md: Adaptation strategy from code to skills

| Aspect | Code Audit | Skill Audit |
|---|---|---|
| Primary Focus | Code correctness, maintainability | Utility, security, reliability |
| Security Emphasis | SQL injection, XSS | Command injection, data leakage, permissions |
| Evaluation Method | Static analysis + testing | Comparative utility + security probes |
| Output Format | Code quality report | Utility score + security status label |
| Key Metrics | Test coverage, complexity | Task completion, risk level |
| Veto Items | Security bugs, logic errors | Security vulnerabilities, data risks |
| Automation Level | High (linters, type checkers) | Medium (requires manual security review) |
| Comparative Analysis | Code vs. requirements | Skill vs. baseline or alternative skills |
Based on OpenClaw's architecture and community best practices:
clawsec (ClawHub security scanner)