msc*_*lli 2 sorting r na dplyr
Consider the following example:
library(tibble)
library(dplyr)
set.seed(42)
# tibble() replaces the deprecated data_frame()
tbl <- tibble(id = letters[1:10], val = c(runif(5), NA, runif(4)))
tbl
# A tibble: 10 × 2
      id          val
   <chr>        <dbl>
1      a 0.9148060435
2      b 0.9370754133
3      c 0.2861395348
4      d 0.8304476261
5      e 0.6417455189
6      f           NA
7      g 0.5190959491
8      h 0.7365883146
9      i 0.1346665972
10     j 0.6569922904
I want to sort the tibble by val, putting the NA first:
tbl %>%
  arrange(val)
# A tibble: 10 × 2
      id          val
   <chr>        <dbl>
1      i 0.1346665972
2      c 0.2861395348
3      g 0.5190959491
4      e 0.6417455189
5      j 0.6569922904
6      h 0.7365883146
7      d 0.8304476261
8      a 0.9148060435
9      b 0.9370754133
10     f           NA
Unfortunately, arrange puts NAs last.
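A minimal sketch of that behaviour, using a small made-up tibble (`df` and its values are hypothetical, not part of the example above): as far as I can tell, arrange keeps missing values at the end even when the column is wrapped in desc().

```r
library(tibble)
library(dplyr)

# Hypothetical three-row tibble with one missing value.
df <- tibble(val = c(0.3, NA, 0.1))

# Ascending sort: the NA row ends up last.
asc <- arrange(df, val)
asc$val

# Descending sort: the NA row still ends up last.
dsc <- arrange(df, desc(val))
dsc$val
```

So reversing the sort direction alone does not move the NA to the top.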
The best hack I have found so far is to combine slice with good old order:
tbl %>%
  slice(order(.$val, na.last = FALSE))
# A tibble: 10 × 2
      id          val
   <chr>        <dbl>
1      f           NA
2      i 0.1346665972
3      c 0.2861395348
4      g 0.5190959491
5      e 0.6417455189
6      j 0.6569922904
7      h 0.7365883146
8      d 0.8304476261
9      a 0.9148060435
10     b 0.9370754133
What is the dplyr way to get the result above?
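We can arrange on a logical vector first, before arranging on the 'val' column. !is.na(val) is FALSE for the NA rows, and FALSE sorts before TRUE, so those rows come first: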
tbl %>%
  arrange(!is.na(val), val)
# A tibble: 10 × 2
# id val
# <chr> <dbl>
#1 f NA
#2 i 0.1346666
#3 c 0.2861395
#4 g 0.5190959
#5 e 0.6417455
#6 j 0.6569923
#7 h 0.7365883
#8 d 0.8304476
#9 a 0.9148060
#10 b 0.9370754
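The same logical key can be combined with other orderings. The sketch below first checks that FALSE sorts before TRUE, then puts the NA row first and sorts the remaining values in descending order; the name `na_first_desc` is just an illustrative choice.

```r
library(tibble)
library(dplyr)

# Logical values sort with FALSE before TRUE, which is why
# rows where !is.na(val) is FALSE (the NA rows) come first.
sort(c(TRUE, FALSE, TRUE))

# Rebuild the tibble from the question.
set.seed(42)
tbl <- tibble(id = letters[1:10], val = c(runif(5), NA, runif(4)))

# NA first, then the non-missing values from largest to smallest.
na_first_desc <- tbl %>%
  arrange(!is.na(val), desc(val))
na_first_desc
```

Because the logical key is the first sort key, the second key only orders rows within the non-missing group.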