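I would like to reorder the levels of a factor in one column, but within the groups defined by a grouping column.
A simple example data set: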
df <- structure(list(a_factor = structure(1:6, .Label = c("a", "b",
"c", "d", "e", "f"), class = "factor"), group = structure(c(1L,
1L, 1L, 2L, 2L, 2L), .Label = c("group1", "group2"), class = "factor"),
value = 1:6), class = "data.frame", row.names = c(NA, -6L
))
> df
  a_factor  group value
1        a group1     1
2        b group1     2
3        c group1     3
4        d group2     4
5        e group2     5
6        f group2     6
More precisely, how can I reorder the factor levels, e.g. by descending value where df$group == "group1", but by ascending value where df$group == "group2", preferably in dplyr?
The expected output might be:
> df
  a_factor  group value
1        c group1     3
2        b group1     2
3        a group1     1
4        d group2     4
5        e group2     5
6        f group2     6
That said, the question is more generally about how to approach this kind of problem in dplyr.
To reorder the factor levels, you can use forcats (part of the tidyverse) and do something like this:
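library(dplyr)    # provides %>% and mutate()
library(forcats)
# Multiply value by +1 in group1 and by -1 in group2, then order the levels by the result
df2 <- df %>% mutate(a_factor = fct_reorder(a_factor,
                    value*(-1 + 2 * (group=="group1"))))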
levels(df2$a_factor)
[1] "f" "e" "d" "a" "b" "c"
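The expression passed to fct_reorder() relies on R coercing a logical to 0/1. As a minimal sketch of what the sort key looks like for the example data (the names sign_by_group and sort_key are only illustrative and do not appear in the answer):

```r
# Same group and value columns as in the example data frame
group <- factor(c("group1", "group1", "group1", "group2", "group2", "group2"))
value <- 1:6

# (group == "group1") is TRUE/FALSE, coerced to 1/0 in arithmetic,
# so the multiplier is +1 for group1 and -1 for group2
sign_by_group <- -1 + 2 * (group == "group1")
sign_by_group
# [1]  1  1  1 -1 -1 -1

# fct_reorder() orders the levels by this key, smallest first
sort_key <- value * sign_by_group
sort_key
# [1]  1  2  3 -4 -5 -6
```

Because the group2 keys are negative, their levels sort first and in reverse order of value, which is why the levels come out as "f" "e" "d" "a" "b" "c".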
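Note that fct_reorder() only changes the order of the levels; it does not rearrange the rows of the data frame itself: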
df2
  a_factor  group value
1        a group1     1
2        b group1     2
3        c group1     3
4        d group2     4
5        e group2     5
6        f group2     6
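If the rows should follow the level order too, one option is to sort by the factor with arrange(). The sketch below also flips the sign of the multiplier, because the expression in the answer orders group1 ascending and group2 descending, whereas the question asks for the opposite. It rebuilds the example data with data.frame(), and df3 is just an illustrative name; treat it as one possible approach rather than the only one.

```r
library(dplyr)
library(forcats)

# Example data, equivalent to the structure() call in the question
df <- data.frame(
  a_factor = factor(c("a", "b", "c", "d", "e", "f")),
  group    = factor(rep(c("group1", "group2"), each = 3)),
  value    = 1:6
)

df3 <- df %>%
  # Key is -value in group1 (descending) and +value in group2 (ascending)
  mutate(a_factor = fct_reorder(a_factor, value * (1 - 2 * (group == "group1")))) %>%
  # arrange() on a factor sorts rows by level order, not alphabetically
  arrange(a_factor)

levels(df3$a_factor)
# [1] "c" "b" "a" "d" "e" "f"

df3
```

This works here because every group1 key is smaller than every group2 key, so the groups stay contiguous. With other data, arrange(group, a_factor) may be the safer way to keep rows grouped.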