Sam*_*Sam 3 r missing-data dplyr
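I want to replace missing values in columns `var1:var6` with 0, but only in rows that have fewer than 2 missing values across `var1:var6`. I then want to recompute the sum column (I'm happy to use `rowwise()` for that, as in my reprex).
I've tried a few approaches using `across()`, or `rowwise()` with `c_across()`, but am struggling to find a solution.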

    library(tidyverse)

    # Generate data
    set.seed(40)
    dat <- tibble(
      id = 1:6,
      var1 = sample(c(0:4, NA), 6, replace = TRUE),
      var2 = sample(c(0:4, NA), 6, replace = TRUE),
      var3 = sample(c(0:4, NA), 6, replace = TRUE),
      var4 = sample(c(0:4, NA), 6, replace = TRUE),
      var5 = sample(c(0:4, NA), 6, replace = TRUE),
      var6 = sample(c(0:4, NA), 6, replace = TRUE),
    )

    # Row-wise sum of var1:var6 (NA if any value in the row is missing)
    dat %>%
      rowwise() %>%
      mutate(sum = sum(c_across(var1:var6))) %>%
      ungroup()

This is the current tibble:

    > dat
    # A tibble: 6 × 8
         id  var1  var2  var3  var4  var5  var6   sum
      <int> <int> <int> <int> <int> <int> <int> <int>
    1     1     3     4     4    NA    NA     2    NA
    2     2    NA    NA     4     3     4     2    NA
    3     3     4     4     1     1     4     1    15
    4     4     1     2     4     4     4    NA    NA
    5     5     2     1     4     4    NA     2    NA
    6     6     1     3     1     0     0     4     9

I would like the output to look like this:

    > new_dat
    # A tibble: 6 × 8
         id  var1  var2  var3  var4  var5  var6   sum
      <int> <int> <int> <int> <int> <int> <int> <int>
    1     1     3     4     4    NA    NA     2    NA
    2     2    NA    NA     4     3     4     2    NA
    3     3     4     4     1     1     4     1    15
    4     4     1     2     4     4     4     0    15
    5     5     2     1     4     4     0     2    13
    6     6     1     3     1     0     0     4     9

You can use `across()` like this:

    dat %>%
      mutate(
        # Replace an NA with 0 only where the row has fewer than 2 NAs in var1:var6
        across(var1:var6,
               ~ replace(.x, is.na(.x) & rowSums(is.na(across(var1:var6))) < 2, 0)),
        # Recompute the row sum without rowwise()
        sum = rowSums(across(var1:var6))
      )

    # # A tibble: 6 × 8
    #      id  var1  var2  var3  var4  var5  var6   sum
    #   <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
    # 1     1     3     4     4    NA    NA     2    NA
    # 2     2    NA    NA     4     3     4     2    NA
    # 3     3     4     4     1     1     4     1    15
    # 4     4     1     2     4     4     4     0    15
    # 5     5     2     1     4     4     0     2    13
    # 6     6     1     3     1     0     0     4     9

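As a possible variation on the same idea, the per-row NA count can be computed once and stored in a temporary column before the replacement, so the condition does not depend on columns that are being rewritten. This is only a sketch: `n_missing` is a hypothetical helper name that does not appear in the question, and the data is typed in by hand from the tibble printed in the question so the block runs on its own.

```r
library(dplyr)

# Data copied by hand from the tibble printed in the question (integer columns)
dat <- tibble(
  id   = 1:6,
  var1 = c(3L, NA, 4L, 1L, 2L, 1L),
  var2 = c(4L, NA, 4L, 2L, 1L, 3L),
  var3 = c(4L, 4L, 1L, 4L, 4L, 1L),
  var4 = c(NA, 3L, 1L, 4L, 4L, 0L),
  var5 = c(NA, 4L, 4L, 4L, NA, 0L),
  var6 = c(2L, 2L, 1L, NA, 2L, 4L)
)

new_dat <- dat %>%
  mutate(
    # n_missing (hypothetical helper): number of NAs per row, counted once up front
    n_missing = rowSums(is.na(across(var1:var6))),
    # Fill an NA with 0L only in rows that had fewer than 2 NAs
    across(var1:var6, ~ if_else(is.na(.x) & n_missing < 2, 0L, .x)),
    # Recompute the row sum; rows that still contain NA stay NA
    sum = rowSums(across(var1:var6))
  ) %>%
  select(-n_missing)  # drop the helper column again

new_dat
```

Using `0L` with `if_else()` keeps `var1:var6` as integer columns, whereas `replace(.x, ..., 0)` coerces them to double, as the `<dbl>` headers in the output above show. `rowSums()` returns a double in either case, so `sum` is `<dbl>` rather than `<int>`.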