
R

Avoid common R mistakes — vectorization traps, NA propagation, factor surprises, and indexing gotchas.
ivangdavila
Productivity clawhub v1.0.0 1 version 99842.5 Key: not required

Overview

Vectorization

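  • Element-by-element loops are often slow: prefer vectorized functions first; lapply(), vapply(), sapply(), or purrr::map() tidy up per-element work but are still loops internally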
  • Vectorized functions operate on whole vectors — sum(x) not for (i in x) total <- total + i
  • ifelse() is vectorized — if is not, use ifelse() for vector conditions
  • Column operations are usually faster than row operations: a data frame is a list of column vectors, and matrices are stored column-major
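
As a minimal sketch of the points above, the following compares an explicit loop with its vectorized equivalent. The vector x and all variable names are illustrative assumptions, not part of the original notes.

```r
# Hypothetical data for illustration.
x <- c(3, -1, 4, -1, 5)

# Element-by-element loop: works, but verbose.
total <- 0
for (i in x) total <- total + i

# Vectorized equivalent: one call over the whole vector.
total_vec <- sum(x)

# ifelse() evaluates the condition for every element of x.
sign_label <- ifelse(x >= 0, "non-negative", "negative")

# vapply() applies a function per element and checks the result type.
squares <- vapply(x, function(v) v^2, numeric(1))

print(total == total_vec)
print(sign_label)
print(squares)
```

vapply() is used here instead of sapply() because it fails loudly when a result has an unexpected type or length.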

Indexing Gotchas

  • R is 1-indexed — first element is x[1], not x[0]
  • x[0] returns empty vector — not error, silent bug
  • Negative index excludes — x[-1] removes first element
  • [[ extracts single element — [ returns subset (list stays list)
  • df[, 1] drops to vector — use df[, 1, drop = FALSE] to keep data frame
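
The indexing rules above can be checked with a short sketch like this one. The objects x, lst, and df are made up for the example.

```r
x <- c(10, 20, 30)

first <- x[1]            # first element (R is 1-indexed)
empty <- x[0]            # zero-length vector, no error raised
without_first <- x[-1]   # everything except the first element

lst <- list(a = 1, b = "two")
sub_list <- lst["a"]     # single bracket: still a list
element <- lst[["a"]]    # double bracket: the stored value itself

df <- data.frame(id = 1:3, name = c("a", "b", "c"))
col_vec <- df[, 1]                 # dropped to a plain vector
col_df <- df[, 1, drop = FALSE]    # stays a one-column data frame

print(length(empty))
print(class(sub_list))
print(class(element))
print(class(col_vec))
print(class(col_df))
```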

NA Handling

  • NA propagates — 1 + NA is NA, NA == NA is NA
  • Use is.na() to check — not x == NA
  • Many summary functions need na.rm = TRUE: mean(x) returns NA if any NA is present
  • na.omit() removes rows with any NA — may lose data unexpectedly
  • complete.cases() returns logical vector — rows without NA
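
A small sketch of the NA behaviour described above, using made-up data (x and df are illustrative names):

```r
x <- c(1, NA, 3)

wrong_check <- x == NA     # all NA: comparison with NA is never TRUE
right_check <- is.na(x)    # FALSE TRUE FALSE

mean_default <- mean(x)               # NA, because one value is missing
mean_clean <- mean(x, na.rm = TRUE)   # mean of the non-missing values

df <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))
keep <- complete.cases(df)    # TRUE only for rows with no NA at all
rows_left <- nrow(na.omit(df))

print(wrong_check)
print(right_check)
print(mean_clean)
print(keep)
print(rows_left)
```

Here na.omit() keeps only one of three rows, which is the kind of silent data loss worth checking for before modelling.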

Factor Traps

  • R before 4.0.0 converted strings to factors by default in data.frame() and read.csv(): pass stringsAsFactors = FALSE, or use R 4.0.0 or later where FALSE is the default
  • levels() shows categories — but factor values are integers internally
  • Assigning a value that is not in levels gives NA (with a warning): extend the levels first, e.g. factor(x, levels = c(levels(x), "new"))
  • as.numeric(factor) gives level indices — use as.numeric(as.character(factor)) for values
  • Dropping unused levels: droplevels() — or factor() again
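
The factor traps above can be reproduced with a sketch along these lines. The factors f, g, and h and their values are invented for the example.

```r
# Numbers stored as a factor: as.numeric() returns level indices.
f <- factor(c("10", "20", "10"))
codes <- as.numeric(f)                  # 1 2 1
values <- as.numeric(as.character(f))   # 10 20 10

# Assigning a value outside the levels yields NA (with a warning).
g <- factor(c("low", "high"))
h <- g
suppressWarnings(h[2] <- "medium")
became_na <- is.na(h[2])

# Extending the levels first makes the assignment work.
g2 <- factor(g, levels = c(levels(g), "medium"))
g2[2] <- "medium"

# Subsetting keeps unused levels until droplevels() is called.
only_low <- g2[g2 == "low"]
before <- levels(only_low)
after <- levels(droplevels(only_low))

print(codes)
print(values)
print(became_na)
print(before)
print(after)
```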

Recycling

  • Shorter vector recycled to match longer — c(1,2,3) + c(10,20) gives 11, 22, 13
  • No error if lengths aren't multiples — just warning, easy to miss
  • Single values recycle intentionally — x + 1 adds 1 to all elements
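
The recycling rules above can be illustrated as follows. add_strict() is a hypothetical helper, not a base R function; it is one possible way to turn a silent length mismatch into an error.

```r
# Lengths 3 and 2: the shorter vector is recycled and R only warns.
uneven <- suppressWarnings(c(1, 2, 3) + c(10, 20))
warned <- tryCatch({
  c(1, 2, 3) + c(10, 20)
  FALSE
}, warning = function(w) TRUE)

# Lengths 4 and 2: recycled cleanly, no warning at all.
even <- c(1, 2, 3, 4) + c(10, 20)

# Length 1: the intended, everyday use of recycling.
shifted <- c(1, 2, 3) + 1

# Hypothetical guard: allow equal lengths or a scalar, reject the rest.
add_strict <- function(a, b) {
  stopifnot(length(a) == length(b) || length(a) == 1 || length(b) == 1)
  a + b
}

print(uneven)
print(warned)
print(even)
print(shifted)
```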

Data Frames vs Tibbles

  • Tibble never converts strings to factors — safer defaults
  • Tibble never drops dimensions — df[, 1] stays tibble
  • Tibble prints better — shows type, doesn't flood console
  • as_tibble() to convert — from tibble or dplyr package
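
A rough comparison of the subsetting difference, assuming the tibble package is installed. The tibble part is skipped when it is not, and df is an illustrative data frame.

```r
df <- data.frame(id = 1:3, name = c("a", "b", "c"))

# Base data frame: single-column subsetting drops to a vector.
base_col <- df[, 1]
print(class(base_col))

# Tibble part runs only if the package is available (an assumption here).
if (requireNamespace("tibble", quietly = TRUE)) {
  tb <- tibble::as_tibble(df)
  tb_col <- tb[, 1]       # still a tibble
  tb_vec <- tb[["id"]]    # use [[ or $ to get the vector
  print(tibble::is_tibble(tb_col))
  print(class(tb_vec))
}
```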

Assignment

  • <- is idiomatic R — = works but avoided in style guides
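  • <<- assigns in an enclosing environment: it modifies the nearest existing variable of that name up the chain, or creates a global if none is found; usually a mistake outside closures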
  • -> right assignment exists — rarely used, confusing
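
A minimal sketch of how <- and <<- differ inside functions. The names counter, bump_local, bump_super, and make_counter are all hypothetical.

```r
counter <- 0

# <- inside a function creates a local copy; the outer value is untouched.
bump_local <- function() {
  counter <- counter + 1
  counter
}

# <<- modifies the variable found in the enclosing environment.
bump_super <- function() {
  counter <<- counter + 1
  counter
}

local_result <- bump_local()
after_local <- counter
bump_super()
after_super <- counter

# A closure is one of the few places where <<- is a reasonable choice.
make_counter <- function() {
  n <- 0
  function() {
    n <<- n + 1
    n
  }
}
next_id <- make_counter()
next_id()
second_id <- next_id()

print(c(local_result, after_local, after_super, second_id))
```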

Scope

  • Functions look up free variables in their enclosing environment (where the function was defined): can accidentally use a global variable
  • Local variable shadows global — same name hides outer variable
  • local() creates isolated scope — variables don't leak out
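
The scoping rules above, sketched with invented names (rate, add_tax, shadow, and scratch_value are illustrative only):

```r
rate <- 0.25

# rate is not an argument, so R finds it in the enclosing environment.
add_tax <- function(price) price * (1 + rate)
implicit <- add_tax(100)

# Passing the value explicitly removes the hidden dependency.
add_tax_explicit <- function(price, rate) price * (1 + rate)
explicit <- add_tax_explicit(100, 0.5)

# A local variable with the same name shadows the outer one.
shadow <- function() {
  rate <- 0.5
  rate
}
inner_rate <- shadow()
outer_rate <- rate

# local() evaluates in a fresh scope; scratch_value does not leak out.
doubled <- local({
  scratch_value <- 21
  scratch_value * 2
})
leaked <- exists("scratch_value")

print(c(implicit, explicit, inner_rate, outer_rate, doubled))
print(leaked)
```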

Common Mistakes

  • T and F can be overwritten — use TRUE and FALSE always
  • 1:length(x) fails on empty x — gives c(1, 0), use seq_along(x)
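  • sample(x) with a single number n >= 1 samples from 1:n: sample(5) and sample(c(5)) both give a permutation of 1:5, not 5; use x[sample.int(length(x))] when x may have length 1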
  • String splitting: strsplit() returns list — even for single string
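
These pitfalls can be reproduced with a short sketch. safe_sample() is a hypothetical helper, not part of base R, shown as one way to sidestep the length-1 behaviour of sample().

```r
# 1:length(x) on an empty vector counts down from 1 to 0.
x <- numeric(0)
bad_idx <- 1:length(x)
good_idx <- seq_along(x)

n_bad <- 0
for (i in 1:length(x)) n_bad <- n_bad + 1     # body runs twice
n_good <- 0
for (i in seq_along(x)) n_good <- n_good + 1  # body never runs

# sample() treats a single number n as "sample from 1:n".
set.seed(1)
perm <- sample(5)

# Hypothetical helper: always permutes the elements actually supplied.
safe_sample <- function(v) v[sample.int(length(v))]
single <- safe_sample(5)

# strsplit() returns a list, even for one input string.
parts <- strsplit("a,b,c", ",")
tokens <- parts[[1]]

# T is an ordinary variable and can be masked; TRUE cannot.
T <- 0
masked <- isTRUE(T)
rm(T)

print(bad_idx)
print(length(good_idx))
print(c(n_bad, n_good))
print(single)
print(tokens)
print(masked)
```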

Version history

1 version in total

  • v1.0.0 (current)
    2026-03-29 02:47 Safe Safe

Security scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
