I have a data frame like this:
df <- data.frame(id = c("A", "A", "A", "A", "A", "A", "A", "A",
                        "B", "B", "B", "B", "B", "B"),
                 var1 = c("100", "200", "300", NA, NA, NA, NA, NA,
                          "100", "200", "300", NA, NA, NA),
                 var2 = c("100", NA, NA, "400", "500", "600", NA, NA,
                          NA, NA, NA, "400", NA, NA),
                 var3 = c("200", NA, NA, NA, NA, NA, "700", "800",
                          "500", NA, NA, NA, "500", "600"))
It looks like this:
id var1 var2 var3
 A  100  100  200
 A  200 <NA> <NA>
 A  300 <NA> <NA>
 A <NA>  400 <NA>
 A <NA>  500 <NA>
 A <NA>  600 <NA>
 A <NA> <NA>  700
 A <NA> <NA>  800
 B  100 <NA>  500
 B  200 <NA> <NA>
 B  300 <NA> <NA>
 B <NA>  400 <NA>
 B <NA> <NA>  500
 B <NA> <NA>  600
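Within each id group, I want to shift the values in every column upward so that they fill any missing values above them. Rows that end up entirely NA should be dropped. The result should look like this: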
id var1 var2 var3
 A  100  100  200
 A  200  400  700
 A  300  500  800
 A <NA>  600 <NA>
 B  100  400  500
 B  200 <NA>  500
 B  300 <NA>  600
I don't know how to do this. Any ideas?
Here is a rough idea using data.table; it can probably be improved:
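library(data.table)

# Helper function: move the non-NA values to the front and pad the
# tail with NA so the vector keeps its original length.
shift_up <- function(x) {
  n <- length(x)
  x <- x[!is.na(x)]
  length(x) <- n  # extending the length fills the new slots with NA
  x
}

setDT(df)  # convert df to a data.table by reference

# Apply the helper to every column within each id, then drop the rows
# that are NA in all three value columns.
df[, lapply(.SD, shift_up), by = id][!(is.na(var1) & is.na(var2) & is.na(var3))]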
   id var1 var2 var3
1:  A  100  100  200
2:  A  200  400  700
3:  A  300  500  800
4:  A <NA>  600 <NA>
5:  B  100  400  500
6:  B  200 <NA>  500
7:  B  300 <NA>  600
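For comparison, the same idea can be sketched in base R without data.table. This is only an illustrative variant, not part of the answer above: the helper here relies on `order(is.na(x))` (a stable sort that puts non-NA values first) instead of the `length<-` trick, and `value_cols`, `pieces`, and `result` are names chosen for this example. It also drops all-NA rows by counting non-NA cells, so the value columns do not have to be listed by hand.

```r
# Sample data from the question (kept as character columns).
df <- data.frame(
  id   = c(rep("A", 8), rep("B", 6)),
  var1 = c("100", "200", "300", NA, NA, NA, NA, NA,
           "100", "200", "300", NA, NA, NA),
  var2 = c("100", NA, NA, "400", "500", "600", NA, NA,
           NA, NA, NA, "400", NA, NA),
  var3 = c("200", NA, NA, NA, NA, NA, "700", "800",
           "500", NA, NA, NA, "500", "600"),
  stringsAsFactors = FALSE
)

# Move non-NA values to the front; order() is stable, so the
# relative order of the non-NA values is preserved.
shift_up <- function(x) x[order(is.na(x))]

# Every column except the grouping column is treated as a value column.
value_cols <- setdiff(names(df), "id")

# Handle each id group separately.
pieces <- lapply(split(df, df$id), function(g) {
  g[value_cols] <- lapply(g[value_cols], shift_up)
  # Keep only rows that still hold at least one non-NA value.
  g[rowSums(!is.na(g[value_cols])) > 0, , drop = FALSE]
})

# Stack the groups back together and reset the row names.
result <- do.call(rbind, pieces)
rownames(result) <- NULL
print(result)
```

Because the row filter is computed from `value_cols`, this sketch should keep working if more value columns are added, whereas the data.table one-liner names `var1`, `var2`, and `var3` explicitly.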