R: sort a ddply summary by group sum

Lil*_*eco 6 pivot-table r plyr

I have a data.frame like this:

x <- data.frame(Category=factor(c("One", "One", "Four", "Two","Two",
"Three", "Two", "Four","Three")),
City=factor(c("D","A","B","B","A","D","A","C","C")),
Frequency=c(10,1,5,2,14,8,20,3,5))

  Category City Frequency
1      One    D        10
2      One    A         1
3     Four    B         5
4      Two    B         2
5      Two    A        14
6    Three    D         8
7      Two    A        20
8     Four    C         3
9    Three    C         5

I want to make a pivot table with sum(Frequency), and used the ddply function like this:

library(plyr)
ddply(x, .(Category, City), summarize, Total = sum(Frequency))
  Category City Total
1     Four    B     5
2     Four    C     3
3      One    A     1
4      One    D    10
5    Three    C     5
6    Three    D     8
7      Two    A    34
8      Two    B     2

But I need this result sorted by the total within each Category group. Something like this:

Category City Frequency
1      Two    A        34
2      Two    B         2
3    Three    D         8
4    Three    C         5
5      One    D        10
6      One    A         1
7     Four    B         5
8     Four    C         3

I have looked around and tried sort, order, and arrange, but nothing seems to do what I need. How can I do this in R?

Dav*_*urg 5

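This is a nice question, and I can't think of a more direct way of doing this than creating a total size index and then sorting by it. Here is a possible data.table approach that uses the setorder function, which sorts your data by reference: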

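library(data.table)
# Sum Frequency per Category/City pair (setDT converts x to a data.table by reference)
Res <- setDT(x)[, .(Total = sum(Frequency)), by = .(Category, City)]
# Add the per-Category total as `size`, then sort by reference
setorder(Res[, size := sum(Total), by = Category], -size, -Total, Category)[]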
#    Category City Total size
# 1:      Two    A    34   36
# 2:      Two    B     2   36
# 3:    Three    D     8   13
# 4:    Three    C     5   13
# 5:      One    D    10   11
# 6:      One    A     1   11
# 7:     Four    B     5    8
# 8:     Four    C     3    8

Or, if you are deep in the Hadleyverse, we can get a similar result with the newer dplyr package (as suggested by @akrun):

library(dplyr)
x %>% 
  group_by(Category, City) %>% 
  summarise(Total = sum(Frequency)) %>% 
  mutate(size = sum(Total)) %>% 
  ungroup() %>%
  arrange(-size, -Total, Category)
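
One case the example data does not exercise is two categories with the same group total. With the key order -size, -Total, Category, rows of tied categories can interleave, because Total is compared before Category. The following is a minimal base R sketch of that effect, using hypothetical data (the data frame `d` and its values are made up for illustration and are not from the question):

```r
# Hypothetical data: categories "A" and "B" both sum to 10.
d <- data.frame(
  Category = c("A", "A", "B", "B"),
  Total    = c(9, 1, 6, 4),
  stringsAsFactors = FALSE
)

# Per-row group total, same idea as the `size` column above.
size <- ave(d$Total, d$Category, FUN = sum)

# Key order -size, -Total: the tied groups are mixed together.
interleaved <- d[order(-size, -d$Total), "Category"]
interleaved
# "A" "B" "B" "A"

# Key order -size, Category, -Total: each group stays contiguous.
contiguous <- d[order(-size, d$Category, -d$Total), "Category"]
contiguous
# "A" "A" "B" "B"
```

If ties between group totals are possible in your data, placing Category ahead of Total in the sort keys is one way to keep each group together.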


Bro*_*ieG 5

Here is a base R version, where DF is the result of your ddply call:

with(DF, DF[order(-ave(Total, Category, FUN=sum), Category, -Total), ])

Produces:

  Category City Total
7      Two    A    34
8      Two    B     2
6    Three    D     8
5    Three    C     5
4      One    D    10
3      One    A     1
1     Four    B     5
2     Four    C     3

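The logic is basically the same as David's: compute the sum of Total for each Category, assign that number to every row within the Category (we do this with ave(..., FUN=sum)), and then sort by that number plus some tie-breakers to make sure the result comes out as expected.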
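
To make the intermediate step visible, here is a small sketch that splits the one-liner into two parts. `DF` is typed in by hand to mirror the ddply output from the question, so the example runs without plyr; the names `group_size` and `sorted` are introduced here only for illustration.

```r
# DF mirrors the ddply output shown in the question (entered manually).
DF <- data.frame(
  Category = factor(c("Four", "Four", "One", "One", "Three", "Three", "Two", "Two")),
  City     = factor(c("B", "C", "A", "D", "C", "D", "A", "B")),
  Total    = c(5, 3, 1, 10, 5, 8, 34, 2)
)

# ave() returns one value per row: the sum of Total within that row's Category.
group_size <- ave(DF$Total, DF$Category, FUN = sum)
group_size
# 8 8 11 11 13 13 36 36

# Sort by group total (descending), then Category so tied groups stay
# contiguous, then Total (descending) within each group.
sorted <- DF[order(-group_size, DF$Category, -DF$Total), ]
sorted
```

Because `ave()` repeats the group sum on every row of the group, the result can be passed straight to `order()` without adding a helper column to `DF`.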