A command-line toolkit for detecting and rewriting AI-generated Chinese text. It turns stiff, robotic AI writing into natural, human-sounding prose.
v2.0 highlights: weighted 0-100 scoring, sentence-level analysis, sentence restructuring (merge/split), context-aware replacement, rhythm variation, vocabulary diversification, 7 style transforms, external pattern config (patterns_cn.json).
```bash
# Detect AI patterns (20+ categories, 0-100 score)
python scripts/detect_cn.py text.txt
python scripts/detect_cn.py text.txt -v   # verbose + worst sentences
python scripts/detect_cn.py text.txt -s   # score only
python scripts/detect_cn.py text.txt -j   # JSON output

# Humanize text
python scripts/humanize_cn.py text.txt -o clean.txt
python scripts/humanize_cn.py text.txt --scene social
python scripts/humanize_cn.py text.txt --scene tech -a   # aggressive mode
python scripts/humanize_cn.py text.txt --seed 42         # reproducible

# Apply writing styles
python scripts/style_cn.py text.txt --style zhihu -o zhihu.txt
python scripts/style_cn.py text.txt --style xiaohongshu
python scripts/style_cn.py --list

# Compare before/after
python scripts/compare_cn.py text.txt --scene tech -a
python scripts/compare_cn.py text.txt -o clean.txt
```
Weighted 0-100 score with 4 severity levels:
| Score | Level | Meaning |
|---|---|---|
| 0-24 | LOW | Likely human-written |
| 25-49 | MEDIUM | Some AI signals |
| 50-74 | HIGH | Probably AI-generated |
| 75-100 | VERY HIGH | Almost certainly AI |
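
The score bands above can be expressed as a small lookup. This is a minimal sketch, not the tool's own code: the function name `severity_level` is hypothetical, and treating fractional scores as belonging to the band below the next integer boundary is an assumption.

```python
def severity_level(score: float) -> str:
    """Map a 0-100 score to the level names used in the table above.

    Hypothetical helper: the detector's own implementation may differ.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score < 25:
        return "LOW"
    if score < 50:
        return "MEDIUM"
    if score < 75:
        return "HIGH"
    return "VERY HIGH"
```

For example, `severity_level(72)` falls in the 50-74 band and returns `"HIGH"`.
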
| Category | Examples |
|---|---|
| Three-Part Structure | 首先...其次...最后, 一方面...另一方面, 其一...其二...其三 |
| Mechanical Connectors | 值得注意的是, 综上所述, 不难发现, 归根结底, 由此可见 |
| Empty Grand Words | 赋能, 闭环, 数字化转型, 协同增效, 全方位, 多维度 |
| Category | Examples |
|---|---|
| AI High-Frequency Words | 助力, 彰显, 底层逻辑, 抓手, 触达, 沉淀, 复盘 |
| Filler Phrases | 值得一提的是, 众所周知, 毫无疑问 |
| Balanced Arguments | 虽然...但是...同时, 既有...也有...更有 |
| Template Sentences | 随着...的不断发展, 在当今...时代, 作为...的重要组成部分 |
| Category | Examples |
|---|---|
| Hedging Language | 在一定程度上, 某种程度上, 通常情况下 (>5 occurrences) |
| List Addiction | Excessive numbered/bulleted lists |
| Punctuation Overuse | Dense em dashes, semicolons |
| Excessive Rhetoric | Too many antithetical (对偶) or parallel (排比) sentences |
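
A threshold rule such as the hedging check can be sketched in a few lines. The phrase list comes from the table; the function names are hypothetical, and counting all phrases together against the threshold of 5 (rather than each phrase separately) is an assumption about how the rule is applied.

```python
# Hedging phrases taken from the table above.
HEDGES = ("在一定程度上", "某种程度上", "通常情况下")


def count_hedges(text: str, phrases=HEDGES) -> int:
    """Count non-overlapping occurrences of every hedging phrase."""
    return sum(text.count(phrase) for phrase in phrases)


def hedging_flagged(text: str, threshold: int = 5) -> bool:
    """Flag the text when hedges occur more than `threshold` times.

    Assumption: the count is pooled across all phrases.
    """
    return count_hedges(text) > threshold
```
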
| Category | Description |
|---|---|
| Uniform Paragraphs | Low CV in paragraph lengths |
| Low Burstiness | Monotonous sentence lengths |
| Emotional Flatness | Lack of emotional/personal expressions |
| Repetitive Starters | Same sentence starters >3 times |
| Low Entropy | Low character-level entropy (predictable text) |
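
To make the statistical signals concrete, here is a rough sketch of three of them: the coefficient of variation (CV) of paragraph lengths, the CV of sentence lengths as a stand-in for burstiness, and character-level Shannon entropy. All function names are hypothetical, and the sentence splitter and exact formulas are assumptions; the detector may compute these differently.

```python
import math
import re
from collections import Counter
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0.0 for degenerate input)."""
    if len(values) < 2 or mean(values) == 0:
        return 0.0
    return pstdev(values) / mean(values)


def paragraph_length_cv(text):
    """CV of paragraph lengths; a low value means uniform paragraphs."""
    lengths = [len(p.strip()) for p in text.split("\n") if p.strip()]
    return coefficient_of_variation(lengths)


def sentence_length_cv(text):
    """CV of sentence lengths; a low value suggests low burstiness."""
    # Assumption: sentences end at Chinese or ASCII terminal punctuation.
    sentences = [s.strip() for s in re.split(r"[。!?!?]+", text) if s.strip()]
    return coefficient_of_variation([len(s) for s in sentences])


def char_entropy(text):
    """Shannon entropy in bits per character, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    total = len(chars)
    counts = Counter(chars)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Text made of identical-length paragraphs yields a paragraph CV of 0, and text that repeats a single character yields an entropy of 0, which are the extreme cases of the "uniform" and "predictable" signals in the table.
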
With -v (verbose) mode, the detector identifies the most AI-like sentences:
── 最可疑句子 ──
1. [16分] 随着人工智能技术的不断发展,在当今数字化转型时代...
原因: 数字化转型, 深度融合, 模板: 随着.*?的(不断)?发展
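
The "原因" (reasons) line in the sample output suggests that each sentence is checked against word lists and regex templates. The sketch below reproduces that idea using only the two words and the one template visible in the sample; the function name and the way reasons are formatted are assumptions, and the per-sentence point value is not modelled because the weighting is not documented here.

```python
import re

# Template regex copied from the sample output above.
TEMPLATE = re.compile(r"随着.*?的(不断)?发展")
# Only the two words shown in the sample; the real lists are in patterns_cn.json.
FLAGGED_WORDS = ("数字化转型", "深度融合")


def sentence_reasons(sentence: str) -> list:
    """Return the reasons a sentence looks AI-like (hypothetical helper)."""
    reasons = [word for word in FLAGGED_WORDS if word in sentence]
    if TEMPLATE.search(sentence):
        reasons.append(f"模板: {TEMPLATE.pattern}")
    return reasons
```
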
| Scene | Casualness | Best For |
|---|---|---|
| general | 0.3 | Default, balanced |
| social | 0.7 | Social media, short posts |
| tech | 0.3 | Tech blogs, tutorials |
| formal | 0.1 | Formal articles, reports |
| chat | 0.8 | Conversations, messaging |
Aggressive mode (-a): adds +0.3 casualness, uses more colloquial expressions, and restructures sentences more heavily. Typical score reduction: 60-80 points on heavily AI-generated text.
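
As a worked example of how the scene value and the aggressive bonus might combine, the sketch below adds 0.3 to the scene's base casualness. The base values come from the scene table; the function name and the cap at 1.0 are assumptions, since the text does not say what happens when the sum exceeds 1.

```python
# Base casualness per scene, taken from the scene table.
SCENE_CASUALNESS = {
    "general": 0.3,
    "social": 0.7,
    "tech": 0.3,
    "formal": 0.1,
    "chat": 0.8,
}


def effective_casualness(scene: str = "general", aggressive: bool = False) -> float:
    """Scene casualness plus 0.3 in aggressive mode (hypothetical helper)."""
    value = SCENE_CASUALNESS[scene] + (0.3 if aggressive else 0.0)
    # Assumption: the result is capped at 1.0.
    return min(1.0, round(value, 2))
```

Under these assumptions, `--scene tech -a` would run at a casualness of 0.6, while `--scene chat -a` would hit the cap.
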
Use --seed N for reproducible results (same input + seed = same output).
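
Seeded reproducibility of this kind is usually achieved by routing every random choice through one seeded generator. The sketch below illustrates the principle with Python's `random.Random`; the function name and the replacement table are hypothetical, and the tool's internals may be organized differently.

```python
import random


def pick_replacements(words, alternatives, seed=None):
    """Replace each word that has alternatives with a randomly chosen one.

    A private Random instance keeps the choices reproducible for a given
    seed without touching the global random state.
    """
    rng = random.Random(seed)
    return [rng.choice(alternatives[w]) if w in alternatives else w for w in words]


# Hypothetical replacement table, for illustration only.
ALTERNATIVES = {"助力": ["帮助", "支持", "推动"]}
```

Calling `pick_replacements(["助力", "发展"], ALTERNATIVES, seed=42)` twice returns the same list both times; without a seed, the choice may differ between runs.
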
7 specialized Chinese writing styles:
| Style | Name | Description |
|---|---|---|
| casual | 口语化 | Like chatting with friends: natural, relaxed |
| zhihu | 知乎 | Rational, in-depth, personal opinions |
| xiaohongshu | 小红书 | Enthusiastic, emoji-rich, product-focused |
| wechat | 公众号 | Storytelling, engaging, relatable |
| academic | 学术 | Rigorous, precise, no colloquialisms |
| literary | 文艺 | Poetic, imagery-rich, metaphorical |
| weibo | 微博 | Short, opinionated, shareable |
python scripts/humanize_cn.py text.txt --style xiaohongshu -o xhs.txt
This first humanizes the text (removing AI patterns), then applies the style transform.
All patterns, replacements, and scoring weights live in scripts/patterns_cn.json. Edit this file to customize them.
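
Before editing the config, it can help to see what it contains. The sketch below loads the file and summarizes its top-level entries without assuming any particular key names. Both function names are hypothetical, and the summary assumes the top level is a JSON object.

```python
import json
from pathlib import Path


def load_patterns(path="scripts/patterns_cn.json"):
    """Read the pattern config as UTF-8 JSON."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def summarize_top_level(config):
    """Map each top-level key to the type name of its value.

    Assumption: the config's top level is a JSON object (a dict).
    """
    return {key: type(value).__name__ for key, value in config.items()}
```

Printing `summarize_top_level(load_patterns())` from the repository root gives a quick overview of the sections available for editing.
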
python scripts/detect_cn.py [file] [-j] [-s] [-v] [--sentences N]
| Flag | Description |
|---|---|
| -j | JSON output |
| -s | Score only (e.g. "72/100 (high)") |
| -v | Verbose: show worst sentences |
| --sentences N | Number of worst sentences to show (default: 5) |
python scripts/humanize_cn.py [file] [-o output] [--scene S] [--style S] [-a] [--seed N]
| Flag | Description |
|---|---|
| -o | Output file |
| --scene | general/social/tech/formal/chat |
| --style | casual/zhihu/xiaohongshu/wechat/academic/literary/weibo |
| -a | Aggressive mode |
| --seed | Random seed for reproducibility |
python scripts/style_cn.py [file] --style S [-o output] [--seed N] [--list]
python scripts/compare_cn.py [file] [-o output] [--scene S] [--style S] [-a]
Shows score diff, category changes, and metric comparison before/after humanization.
```bash
# 1. Check AI score
python scripts/detect_cn.py document.txt -v

# 2. Humanize with comparison
python scripts/compare_cn.py document.txt --scene tech -a -o clean.txt

# 3. Verify improvement
python scripts/detect_cn.py clean.txt -s

# 4. Optional: apply specific style
python scripts/style_cn.py clean.txt --style zhihu -o final.txt
```
```bash
# Scan all files
for f in *.txt; do
  echo "=== $f ==="
  python scripts/detect_cn.py "$f" -s
done

# Transform all markdown
for f in *.md; do
  python scripts/humanize_cn.py "$f" --scene tech -a -o "${f%.md}_clean.md"
done
```
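
The scan loop can also be driven from Python, which is convenient when the scores need further processing. This is a sketch under stated assumptions: it uses only the documented `-s` flag, the function names are hypothetical, and it assumes the score line is printed to standard output.

```python
import subprocess
import sys
from pathlib import Path


def detect_command(path, script="scripts/detect_cn.py"):
    """Build the score-only detector command for one file."""
    return [sys.executable, script, str(path), "-s"]


def scan_directory(directory=".", pattern="*.txt"):
    """Run the detector on every matching file and collect its output.

    Assumption: the score line is written to standard output.
    """
    results = {}
    for path in sorted(Path(directory).glob(pattern)):
        proc = subprocess.run(
            detect_command(path), capture_output=True, text=True, check=False
        )
        results[path.name] = proc.stdout.strip()
    return results
```

Run from the repository root, `scan_directory()` returns a dict that maps each file name to whatever the detector printed for it.
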