
Text

Transform, format, and process text with patterns for writing, data cleaning, localization, citations, and copywriting.
ivangdavila
Category: Content creation · clawhub · v1.0.0 · 1 version · 100000 · Key: not required
★ 2 stars · 1,671 downloads · 104 installs · #latest

Overview

Quick Reference

Task                                      Load
----------------------------------------  ---------------
Creative writing (voice, dialogue, POV)   writing.md
Data processing (CSV, regex, encoding)    data.md
Academic/citations (APA, MLA, Chicago)    academic.md
Marketing copy (headlines, CTA, email)    copy.md
Translation/localization                  localization.md

Universal Text Rules

Encoding

  • Always verify encoding first: file -bi document.txt (GNU/Linux; on macOS use file -bI)
  • Normalize line endings (CRLF → LF): tr -d '\r' < input.txt > output.txt
  • Remove BOM if present: sed -i '1s/^\xEF\xBB\xBF//' document.txt (GNU sed)
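As a minimal sketch of the encoding rules above, the following POSIX shell function strips a leading UTF-8 BOM and converts CRLF to LF on a stream. The name normalize_stream is hypothetical, and the BOM bytes are produced with printf so the sed expression does not rely on GNU-only \x escapes.

```sh
# normalize_stream: hypothetical helper (not part of the skill).
# Reads stdin, removes a UTF-8 BOM at the start of line 1,
# then deletes every carriage return.
normalize_stream() {
  bom=$(printf '\357\273\277')   # EF BB BF as raw bytes
  LC_ALL=C sed "1s/^${bom}//" | tr -d '\r'
}
```

Typical use would be normalize_stream < document.txt > clean.txt, which keeps the original file untouched.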

Whitespace

  • Collapse whitespace runs: sed -E 's/[[:space:]]+/ /g'
  • Trim leading/trailing: sed 's/^[[:space:]]*//;s/[[:space:]]*$//'
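The two whitespace rules can be combined in one pass. This is a sketch under the assumption that every whitespace run (tabs included) should become a single space; squeeze_ws is a hypothetical name.

```sh
# squeeze_ws: hypothetical helper (not part of the skill).
# Collapses each run of whitespace to one space, then trims the
# single space that may remain at either end of every line.
squeeze_ws() {
  sed -E 's/[[:space:]]+/ /g; s/^ //; s/ $//'
}
```

Because the collapse runs first, the trim steps only ever need to remove one space per side.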

Common Traps

  • Smart quotes (“ ”) break parsers → normalize to "
  • Em/en dashes (U+2014, U+2013) break ASCII → normalize to -
  • Zero-width chars invisible but break comparisons → strip them
  • String length ≠ byte length in UTF-8 ("café" = 4 chars, 5 bytes)
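The first three traps can be handled with byte-level substitutions. The sketch below assumes UTF-8 input and covers only U+201C/U+201D, U+2014/U+2013, and U+200B; other quote styles and zero-width code points would need their own entries. normalize_punct is a hypothetical name.

```sh
# normalize_punct: hypothetical helper (not part of the skill).
# Works on raw UTF-8 bytes, so it does not depend on the current locale.
normalize_punct() {
  ldq=$(printf '\342\200\234')      # U+201C left double quote
  rdq=$(printf '\342\200\235')      # U+201D right double quote
  emdash=$(printf '\342\200\224')   # U+2014 em dash
  endash=$(printf '\342\200\223')   # U+2013 en dash
  zwsp=$(printf '\342\200\213')     # U+200B zero-width space
  LC_ALL=C sed -e "s/${ldq}/\"/g" -e "s/${rdq}/\"/g" \
               -e "s/${emdash}/-/g" -e "s/${endash}/-/g" \
               -e "s/${zwsp}//g"
}
```

For the length trap, printf 'café' | wc -c counts 5 bytes, while wc -m counts 4 characters only when the locale is UTF-8.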

Format Detection

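# Detect encoding (GNU/Linux: file -i; macOS: file -I)
file -i document.txt

# Detect line endings (GNU cat; on macOS use cat -e)
head -1 document.txt | cat -A
# ^M$ at end = Windows (CRLF)
# $ only = Unix (LF)

# Detect delimiter (CSV/TSV): count each candidate in the header, most frequent first
head -1 file | tr -cd ',;\t|' | fold -w1 | sort | uniq -c | sort -rn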
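Delimiter detection can also be scripted. The sketch below counts each candidate in the header line and reports the most frequent one. It assumes the header has no quoted fields containing delimiter characters, and guess_delimiter is a hypothetical name.

```sh
# guess_delimiter: hypothetical helper (not part of the skill).
# Prints comma, semicolon, tab, pipe, or unknown for the file given as $1.
guess_delimiter() {
  header=$(head -n 1 "$1")
  best=unknown
  best_count=0
  for pair in 'comma:,' 'semicolon:;' 'tab:\t' 'pipe:|'; do
    name=${pair%%:*}        # text before the first colon
    set_chars=${pair#*:}    # text after the first colon (tr expands \t)
    count=$(printf '%s' "$header" | tr -cd "$set_chars" | wc -c | tr -d ' ')
    if [ "$count" -gt "$best_count" ]; then
      best=$name
      best_count=$count
    fi
  done
  printf '%s\n' "$best"
}
```

On a tie the earlier candidate in the list wins, so the order of the pairs acts as a priority.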

Quick Transformations

Task                 Command
-------------------  ----------------------------------------------------------
Lowercase            tr '[:upper:]' '[:lower:]'
Remove punctuation   tr -d '[:punct:]'
Count words          wc -w
Count unique lines   sort -u | wc -l
Find duplicates      sort | uniq -d
Extract emails       grep -oE '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
Extract URLs         grep -oE 'https?://[^][:space:]<>"{}\^[]+'
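The commands in the table compose into pipelines. As one illustration, the sketch below chains the email extraction, lowercase, and unique-line entries to list each address once. unique_emails is a hypothetical name, and treating addresses as case-insensitive is an assumption.

```sh
# unique_emails: hypothetical helper (not part of the skill).
# Reads text on stdin and prints each email address once, lowercased and sorted.
unique_emails() {
  grep -oE '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' \
    | tr '[:upper:]' '[:lower:]' \
    | sort -u
}
```

Lowercasing before sort -u is what lets Ann@Example.org and ann@example.org collapse into one entry.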

Before Processing Checklist

  • [ ] Encoding verified (UTF-8?)
  • [ ] Line endings normalized
  • [ ] Delimiter identified (for structured text)
  • [ ] Target format/style defined
  • [ ] Edge cases considered (empty, Unicode, special chars)
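Part of the checklist can be automated. The sketch below reports on three of the items (empty input, carriage returns, UTF-8 BOM) without modifying the file. preflight is a hypothetical name, and head -c is assumed to be available, as it is in GNU, BSD, and BusyBox userlands.

```sh
# preflight: hypothetical helper (not part of the skill).
# Prints a short report for the file given as $1; returns 1 if it is empty.
preflight() {
  f=$1
  [ -s "$f" ] || { echo "empty file"; return 1; }
  cr=$(printf '\r')
  if grep -q "$cr" "$f"; then
    echo "line-endings: CR found"
  else
    echo "line-endings: LF only"
  fi
  # Dump the first three bytes as hex and compare against EF BB BF.
  first3=$(head -c 3 "$f" | od -An -tx1 | tr -d ' \n')
  if [ "$first3" = "efbbbf" ]; then
    echo "bom: present"
  else
    echo "bom: absent"
  fi
}
```

The remaining items (delimiter, target format, edge cases) still need a human decision.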

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-29 03:15 · Safe

Security Scans

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
