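I am working with a large dataset in which most of the data was entered twice. This means many variables are represented by a pair of columns: column.1, entered by one person, and column.2, the same data entered by a different person. I want to create a "master" column, named simply column, that takes its value from column.1 first and then, if column.1 is NA, from column.2.
Here is an example of what I am trying to do, using made-up data: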
mydata <- data.frame(name = c("Sarah","Ella","Carmen","Dinah","Billie"),
                     cheese.1 = c(1, 4, NA, 6, NA),
                     cheese.2 = c(1, 4, 3, 5, NA),
                     milk.1 = c(NA, 2, 0, 4, NA),
                     milk.2 = c(1, 2, 1, 4, 2),
                     tofu.1 = c("yum", "yum", NA, "gross", NA),
                     tofu.2 = c("gross", "yum", "yum", NA, "gross"))
For example, the code below shows what I want to do for a single pair of columns.
library(dplyr)  # provides %>% and mutate()
mydata %>% mutate(cheese = ifelse(is.na(cheese.1), cheese.2, cheese.1))
# OUTPUT:
    name cheese.1 cheese.2 milk.1 milk.2 tofu.1 tofu.2 cheese
1  Sarah        1        1     NA      1    yum  gross      1
2   Ella        4        4      2      2    yum    yum …
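One possible way to extend the single-pair example to every pair is sketched below. It rests on assumptions not stated above: every paired variable uses exactly the suffixes .1 and .2, both columns in a pair have the same type, and text columns are character vectors rather than factors (hence the stringsAsFactors = FALSE added here). The helper name stems is made up for illustration, and dplyr::coalesce() is swapped in for the ifelse() call because it takes the first non-NA value directly.

```r
library(dplyr)

# Same made-up data as above; stringsAsFactors = FALSE is an added assumption
# so that the tofu columns are character vectors on any R version.
mydata <- data.frame(name = c("Sarah", "Ella", "Carmen", "Dinah", "Billie"),
                     cheese.1 = c(1, 4, NA, 6, NA),
                     cheese.2 = c(1, 4, 3, 5, NA),
                     milk.1 = c(NA, 2, 0, 4, NA),
                     milk.2 = c(1, 2, 1, 4, 2),
                     tofu.1 = c("yum", "yum", NA, "gross", NA),
                     tofu.2 = c("gross", "yum", "yum", NA, "gross"),
                     stringsAsFactors = FALSE)

# Hypothetical helper: base names of the paired variables, found by taking
# every column name that ends in ".1" and dropping that suffix.
stems <- sub("\\.1$", "", grep("\\.1$", names(mydata), value = TRUE))

# For each pair, take the .1 value and fall back to the .2 value where .1 is NA.
for (stem in stems) {
  mydata[[stem]] <- coalesce(mydata[[paste0(stem, ".1")]],
                             mydata[[paste0(stem, ".2")]])
}
```

After the loop, mydata has three extra columns (cheese, milk, tofu). A row stays NA in a master column only when both entries are NA, as with Billie's cheese value.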