I have a data frame with a lot of company information, grouped by an id variable. I want to sort by one of the variables separately within each id. Take this example:
df <- structure(list(id = c(110, 110, 110, 90, 90, 90, 90, 252, 252
), var1 = c(26, 21, 54, 10, 18, 9, 16, 54, 39), var2 = c(234,
12, 43, 32, 21, 19, 16, 34, 44)), .Names = c("id", "var1", "var2"
), row.names = c(NA, -9L), class = "data.frame")
Which looks like this
df
id var1 var2
1 110 26 234
2 110 21 12
3 110 54 43
4 90 10 32
5 90 18 21
6 90 9 19
7 90 16 16
8 252 54 34
9 252 39 44
Now, I want to sort the data frame according to var1 by the vector id. The easiest solution I can think of is using the apply function, like this:
> apply(df, 2, sort)
id var1 var2
[1,] 90 9 12
[2,] 90 10 16
[3,] 90 16 19
[4,] 90 18 21
[5,] 110 21 32
[6,] 110 26 34
[7,] 110 39 43
[8,] 252 54 44
[9,] 252 54 234
However, this is not the output I am looking for. The correct output should be:
id var1 var2
1 110 21 12
2 110 26 234
3 110 54 43
4 90 9 19
5 90 10 32
6 90 16 16
7 90 18 21
8 252 39 44
9 252 54 34
That is, group by id, sort by the var1 column within each group, and keep the original order of the id column.
Any idea how to sort like this?
Another base R option, using order and match:
df[with(df, order(match(id, unique(id)), var1, var2)), ]
# id var1 var2
#2 110 21 12
#1 110 26 234
#3 110 54 43
#6 90 9 19
#4 90 10 32
#7 90 16 16
#5 90 18 21
#9 252 39 44
#8 252 54 34
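To see why this works, it can help to look at the sort key on its own. The following is a minimal sketch that rebuilds the df from the question with data.frame(); the names key and sorted are only illustrative and do not appear in the answer.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  id   = c(110, 110, 110, 90, 90, 90, 90, 252, 252),
  var1 = c(26, 21, 54, 10, 18, 9, 16, 54, 39),
  var2 = c(234, 12, 43, 32, 21, 19, 16, 34, 44)
)

# unique(df$id) lists the ids in order of first appearance: 110, 90, 252.
# match() maps every id to its position in that list.
key <- match(df$id, unique(df$id))
key
# [1] 1 1 1 2 2 2 2 3 3

# Ordering by key first keeps the groups in their original order;
# var1 then decides the order of rows inside each group.
sorted <- df[order(key, df$var1), ]
sorted$var1
# [1] 21 26 54  9 10 16 18 39 54
```

The var2 argument in the answer only matters as a tie-breaker when two rows in the same group share a var1 value, which does not happen in this example.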
Note: as Moody_Mudskipper points out, there is no need for tidyverse; this can also be done easily in base R:
df[order(ordered(df$id, unique(df$id)), df$var1), ]
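The ordered() call does the same job as match() in a slightly different way. As a rough sketch, assuming the df from the question (rebuilt here so the snippet runs on its own; f and res are illustrative names):

```r
# Rebuild the example data frame from the question
df <- data.frame(
  id   = c(110, 110, 110, 90, 90, 90, 90, 252, 252),
  var1 = c(26, 21, 54, 10, 18, 9, 16, 54, 39),
  var2 = c(234, 12, 43, 32, 21, 19, 16, 34, 44)
)

# Passing unique(df$id) as the levels gives an ordered factor whose
# levels follow first appearance rather than numeric order.
f <- ordered(df$id, unique(df$id))
levels(f)
# [1] "110" "90"  "252"

# order() ranks a factor by its level positions, so 110 sorts before 90.
as.integer(f)
# [1] 1 1 1 2 2 2 2 3 3

res <- df[order(f, df$var1), ]
res$id
# [1] 110 110 110  90  90  90  90 252 252
```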
A one-line tidyverse solution without any temporary variables:
library(tidyverse)
df %>% arrange(ordered(id, unique(id)), var1)
# id var1 var2
# 1 110 21 12
# 2 110 26 234
# 3 110 54 43
# 4 90 9 19
# 5 90 10 32
# 6 90 16 16
# 7 90 18 21
# 8 252 39 44
# 9 252 54 34
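Explanation of why apply(df, 2, sort) does not work
What that call does is sort each column independently. apply runs over the specified dimension (2 in this case, which corresponds to columns) and applies the function (sort in this case).
apply then tries to simplify the result, in this case into a matrix. So you get a matrix (not a data.frame) in which every column has been sorted on its own. For example, this row from the apply call:
# [1,] 90 9 12
does not even exist in the data.frame.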
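This can be checked directly. The snippet below is a small sketch that rebuilds the df from the question and inspects the result of the apply call; m is an illustrative name.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  id   = c(110, 110, 110, 90, 90, 90, 90, 252, 252),
  var1 = c(26, 21, 54, 10, 18, 9, 16, 54, 39),
  var2 = c(234, 12, 43, 32, 21, 19, 16, 34, 44)
)

m <- apply(df, 2, sort)

# apply() has simplified the three sorted columns into a matrix
is.matrix(m)
# [1] TRUE
is.data.frame(m)
# [1] FALSE

# The first row of m pairs the smallest value of each column:
# id = 90, var1 = 9, var2 = 12
m[1, ]

# No row of the original data frame holds that combination
any(df$id == 90 & df$var1 == 9 & df$var2 == 12)
# [1] FALSE
```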