
Phy Post Forensics

Analyzes your social media posts' linguistic and structural features to identify patterns that drive your highest engagement and generate a personalized cont...
phy041 · Source
Uncategorized · clawhub · v1.0.3 · 1 version · 100000 · Key: not required

Overview

phy-post-forensics — "Why Did This Post Work?"

You post 50 times. 3 go viral. 47 die. Analytics tells you "what happened" (likes, views). This tool tells you why — which structural elements in your content drove engagement.

The Problem

Every analytics tool shows lagging indicators: impressions, likes, comments. None of them tell you:

  • Which hook type correlates with your best posts?
  • Do your top posts have more specific numbers?
  • Does sentence rhythm (mixing short + long) predict engagement?
  • Are your worst posts missing personal voice (first-person pronouns)?

This tool extracts 12 structural features per post, groups by performance tier, and outputs: "Your top posts share X, Y, Z. Your bottom posts all lack them."

Quick Start

# Analyze posts from JSON file
python3 ~/.claude/skills/phy-post-forensics/scripts/post_forensics.py --file my_posts.json

# Pipe from stdin
cat posts.json | python3 ~/.claude/skills/phy-post-forensics/scripts/post_forensics.py

# JSON output (for pipelines)
python3 ~/.claude/skills/phy-post-forensics/scripts/post_forensics.py --file posts.json --format json

Input Format

[
  {
    "text": "I built 101 OpenClaw skills in 30 days. Here's what happened...",
    "platform": "linkedin",
    "engagement_rate": 8.5,
    "impressions": 15000,
    "date": "2026-03-18"
  },
  {
    "text": "Another post...",
    "platform": "reddit",
    "engagement_rate": 1.2,
    "impressions": 500
  }
]

Required: text, engagement_rate. Optional: platform, impressions, date.
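Before running the tool, it can help to check that every post carries the two required fields. The following is a minimal sketch of such a check, not the tool's own validation logic; `validate_posts` and its error messages are hypothetical names introduced here for illustration.

```python
import json

# Field names taken from the input format above.
REQUIRED = ("text", "engagement_rate")

def validate_posts(posts):
    """Return (valid_posts, errors) for a list of post dicts."""
    valid, errors = [], []
    for i, post in enumerate(posts):
        missing = [k for k in REQUIRED if k not in post]
        if missing:
            errors.append(f"post {i}: missing {', '.join(missing)}")
            continue
        if not isinstance(post["text"], str) or not post["text"].strip():
            errors.append(f"post {i}: text must be a non-empty string")
            continue
        rate = post["engagement_rate"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(rate, bool) or not isinstance(rate, (int, float)):
            errors.append(f"post {i}: engagement_rate must be a number")
            continue
        valid.append(post)
    return valid, errors

sample = json.loads('[{"text": "Hello", "engagement_rate": 8.5}, {"text": "No rate"}]')
valid, errors = validate_posts(sample)
print(len(valid), errors)
```

Optional fields (platform, impressions, date) are passed through untouched, so a file that validates here keeps its full shape.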

The 12 Features Extracted

| # | Feature | What It Measures | Why It Matters |
|---|---------|------------------|----------------|
| 1 | Hook Type | First line classification: number_lead, question, contrarian, story_open, challenge, statement | Hook determines "stop scrolling" moment |
| 2 | Word Count | Total words | Platform sweet spots vary (LinkedIn ~100-200, Twitter ~50-100) |
| 3 | Sentence Count | Number of sentences | More sentences ≠ better, but structure matters |
| 4 | Sentence Length CV | Coefficient of variation of sentence lengths | High CV = mixed rhythm (human). Low CV = monotone (AI) |
| 5 | Question Count | Questions in body text | Questions drive comments and dwell time |
| 6 | Specific Numbers | Count of data points (%, $, metrics, years) | Posts with data get 3-4x more reach (LinkedIn 2026 data) |
| 7 | Personal Pronoun Density | I/my/we per 100 words | Personal voice = trust and engagement |
| 8 | List Formatting | Bullets or numbered lists present | Scannability drives dwell time |
| 9 | Paragraph Count | Number of visual breaks | Short paragraphs = mobile-friendly |
| 10 | CTA Type | Call-to-action classification: question, action, share, comment, none | CTAs drive comments, comments drive distribution |
| 11 | Sentiment | Positive / neutral / negative keyword analysis | Positive content drives more shares |
| 12 | Specificity Score | Proper nouns + numbers + tool names per 100 words | Specific > generic (3-4x reach difference) |
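To make a few of these features concrete, here is a rough sketch of how features 4, 6 and 7 could be computed with the standard library. It is an illustration under stated assumptions, not the script's actual implementation: the sentence splitter is naive, the CV uses the population standard deviation, and the function names and the number-matching regex are all choices made for this example.

```python
import re
import statistics

def split_sentences(text):
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def sentence_length_cv(text):
    """Coefficient of variation (std dev / mean) of sentence lengths in words."""
    lengths = [len(s.split()) for s in split_sentences(text)]
    if len(lengths) < 2:
        return 0.0
    mean = statistics.mean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0

# Pronoun set taken from the table row (I/my/we).
FIRST_PERSON = {"i", "my", "we"}

def pronoun_density(text):
    """First-person pronouns per 100 alphabetic words."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in FIRST_PERSON)
    return 100.0 * hits / len(words)

def count_numbers(text):
    """Count numeric tokens such as 101, 30, $5,000 or 12.5%."""
    return len(re.findall(r"\$?\d[\d,]*(?:\.\d+)?%?", text))

text = "I built 101 OpenClaw skills in 30 days. Here's what happened."
print(round(sentence_length_cv(text), 2), round(pronoun_density(text), 2), count_numbers(text))
```

A two-sentence post with one long and one short sentence already yields a noticeably non-zero CV, which is the "mixed rhythm" signal the table describes.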

How Analysis Works

  1. Extract 12 features from every post
  2. Tier posts by engagement: top 25%, middle 50%, bottom 25%
  3. Compare feature distributions across tiers
  4. Generate insights: which features differentiate top from bottom
  5. Output blueprint: actionable content template based on YOUR data
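Steps 2 and 3 can be sketched as follows. This is an assumption-laden illustration rather than the tool's code: the rounding rule for the 25% cut (`round`, with at least one post per outer tier) is a guess, `tier_posts` and `tier_mean` are hypothetical helpers, and the eight engagement rates are invented sample values.

```python
import statistics

def tier_posts(posts):
    """Split posts into top 25%, middle 50%, bottom 25% by engagement_rate."""
    if len(posts) < 3:
        raise ValueError("need at least 3 posts for a top/middle/bottom split")
    ranked = sorted(posts, key=lambda p: p["engagement_rate"], reverse=True)
    k = max(1, round(len(ranked) * 0.25))  # assumed rounding rule
    return {"top": ranked[:k], "middle": ranked[k:-k], "bottom": ranked[-k:]}

def tier_mean(tier, key):
    """Average of one numeric field across the posts in a tier."""
    return statistics.mean(p[key] for p in tier)

# Invented sample data: eight posts with made-up engagement rates.
rates = [12.1, 10.0, 5.0, 4.0, 3.0, 2.0, 0.5, 0.3]
posts = [{"text": f"post {i}", "engagement_rate": r} for i, r in enumerate(rates)]

tiers = tier_posts(posts)
top_avg = tier_mean(tiers["top"], "engagement_rate")
bottom_avg = tier_mean(tiers["bottom"], "engagement_rate")
spread = top_avg / bottom_avg
print(len(tiers["top"]), len(tiers["middle"]), len(tiers["bottom"]), round(spread, 1))
```

The same `tier_mean` call works for any extracted feature once it is stored on each post, which is how a per-tier comparison of feature averages could be built.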

Example Output

==================================================================
  phy-post-forensics — Content Forensics Report
==================================================================
  Posts analyzed : 8
  Top tier       : 2 posts (avg 11.05% engagement)
  Bottom tier    : 2 posts (avg 0.40% engagement)
  Spread         : Top posts get 27.6x more engagement
==================================================================

🔍  Key Insights (8 patterns found):

  1. 🔴 [HIGH] Specific Data Points
     Top posts: 5.5 numbers/post
     Bottom:    0.0 numbers/post
     → Include specific numbers — posts with data get 3-4x reach

  2. 🔴 [HIGH] Specificity Score
     Top posts: 18.9/100w
     Bottom:    2.6/100w
     → Name specific tools, companies, projects — not generic advice

  3. 🔴 [HIGH] Hook Type
     Top posts: Mostly 'contrarian' (50%)
     Bottom:    Mostly 'challenge' (50%)
     → Lead with contrarian hooks — they correlate with your best posts

📋 YOUR CONTENT BLUEPRINT:
  HOOK: Challenge conventional wisdom: 'Stop doing X. Here's why.'
  LENGTH: ~74 words, 14 sentences
  DATA: Include ~5 specific numbers/metrics
  VOICE: Personal — aim for 2+ first-person pronouns per 100 words
  CTA: End with a clear next step the reader can take

Research Basis

| Source | Key Finding | How We Use It |
|--------|-------------|---------------|
| Buffer 52M+ posts (2026) | Dwell time > likes; specificity = 3-4x reach | Specificity scoring + feature correlation |
| LinkedIn 360Brew data | Document posts = 596% more engagement than text; first 150 chars critical | Hook type classification |
| Reddit 1000-post study | Posting time = 730% diff; question titles underperform by 16% | Context notes in output (distribution vs content) |
| Stanford CS229 | Text features + sentiment predict Reddit post popularity | 12-feature extraction framework |
| LinkedIn 2026 benchmarks | 6.60% engagement for docs, 2-4% for text; 3-5 weekly optimal | Platform-aware recommendations |

Collecting Your Post Data

LinkedIn

Export from LinkedIn Analytics → Content tab → CSV export, then convert to JSON format.
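The CSV-to-JSON conversion could look roughly like the sketch below. The column names (`Post text`, `Engagement rate`, `Impressions`) are hypothetical placeholders, since the export's real headers are not given here; check them against the header row of your own file before relying on this.

```python
import csv
import io
import json

# Hypothetical column names: replace with the headers in your actual export.
TEXT_COL = "Post text"
RATE_COL = "Engagement rate"
IMPRESSIONS_COL = "Impressions"

def csv_to_posts(csv_file):
    """Convert rows from an open CSV file into the tool's input format."""
    posts = []
    for row in csv.DictReader(csv_file):
        text = (row.get(TEXT_COL) or "").strip()
        rate = (row.get(RATE_COL) or "").strip().rstrip("%")
        if not text or not rate:
            continue  # skip rows without the two required fields
        post = {"text": text, "platform": "linkedin", "engagement_rate": float(rate)}
        impressions = (row.get(IMPRESSIONS_COL) or "").replace(",", "").strip()
        if impressions.isdigit():
            post["impressions"] = int(impressions)
        posts.append(post)
    return posts

# In-memory sample standing in for a real export file.
sample = io.StringIO(
    "Post text,Impressions,Engagement rate\n"
    '"I built 101 skills in 30 days.","15,000",8.5%\n'
)
posts = csv_to_posts(sample)
print(json.dumps(posts, indent=2))
```

For a real file, open it with `open(path, newline="", encoding="utf-8")` and pass the file object to `csv_to_posts`, then write the result with `json.dump`.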

Reddit

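# Fetch a user's submissions from Reddit's public JSON listing.
# Reddit rate-limits generic clients, so send a descriptive User-Agent (placeholder value below).
curl -s -A "post-forensics-example/0.1" "https://www.reddit.com/user/USERNAME/submitted.json?limit=100" | python3 -c "
import json, sys
data = json.load(sys.stdin)
# Reddit exposes neither impressions nor an engagement rate, so upvote_ratio
# (as a percentage) and score serve as rough stand-ins here.
posts = [{'text': (p['data']['title'] + ' ' + p['data'].get('selftext', '')).strip(),
          'platform': 'reddit',
          'engagement_rate': p['data']['upvote_ratio'] * 100,
          'impressions': p['data']['score']}
         for p in data['data']['children']]
json.dump(posts, sys.stdout, indent=2)
"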

Manual

Create a JSON file with your posts and approximate engagement rates.

Technical Notes

  • Zero external dependencies — pure Python 3.7+ stdlib
  • Minimum 3 posts for meaningful analysis (8+ recommended)
  • Tier grouping: top 25%, middle 50%, bottom 25% by engagement_rate
  • Insight generation: only surfaces patterns with meaningful deltas (not noise)
  • JSON output: --format json for pipeline integration
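The notes do not say how a "meaningful delta" is defined. One plausible reading, sketched below purely as an assumption, is to keep a feature only when the top and bottom tier averages differ by a large enough relative margin. The `min_relative_delta` threshold of 0.5 and the function name are hypothetical; the two specificity figures echo the example report, while the word-count values are invented.

```python
def meaningful_deltas(top_means, bottom_means, min_relative_delta=0.5):
    """Keep features whose tier averages differ by at least the given relative margin."""
    insights = []
    for feature, top in top_means.items():
        bottom = bottom_means.get(feature)
        if bottom is None:
            continue
        scale = max(abs(top), abs(bottom))
        if scale == 0:
            continue  # both tiers are zero, nothing to compare
        delta = abs(top - bottom) / scale
        if delta >= min_relative_delta:
            insights.append((feature, top, bottom, round(delta, 2)))
    # Largest relative difference first.
    return sorted(insights, key=lambda item: item[3], reverse=True)

top_means = {"specific_numbers": 5.5, "specificity": 18.9, "word_count": 74}
bottom_means = {"specific_numbers": 0.0, "specificity": 2.6, "word_count": 70}

for feature, top, bottom, delta in meaningful_deltas(top_means, bottom_means):
    print(f"{feature}: top={top} bottom={bottom} relative_delta={delta}")
```

With these values the word-count difference falls below the threshold and is dropped, which is the kind of noise filtering the note describes.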

Companion Skills

| Skill | Relationship |
|-------|--------------|
| phy-content-humanizer-audit | Checks AI signature before posting (this tool = analyzes after posting) |
| phy-platform-rules-engine | Checks platform-specific rules (this tool = analyzes content quality) |
| phy-content-compound | Builds content atom library (this tool = informs which atoms work best) |

Author

Canlah AI — Run performance marketing without breaking your brand.

Version History

1 version in total

  • v1.0.3 (current)
    2026-05-21 15:53
