Count occurrences of a value on melted data using dplyr

gia*_*iac 1 r melt reshape2 dplyr

After melting my data I want to do something simple, the equivalent of `table`, but using dplyr.

My data looks like this:

   cluster   21:30   21:45
4        c   alone   alone
6        b       %       %
12       e partner partner
14       b partner partner
20       b   alone   alone
22       c partner partner

With `table` I can simply do

table(dta$cluster)
   a b c d e 
   2 8 5 1 4 

How can I get the same result using `melt` and `summarise`?

 library(dplyr)
 library(reshape2)

 dta %>% 
 melt(id.vars = 'cluster')  %>% 
 group_by(cluster) %>% 
 summarise( n() ) 
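For reference, here is a minimal sketch of why the pipeline above does not reproduce `table(dta$cluster)`. It assumes `dta` is rebuilt compactly from the `dput` output at the end of the question; the helper names `labs`, `t2130`, `t2145` and `melted_counts` are introduced only for illustration. After `melt`, every original row appears once per melted column (two here), so `n()` counts melted rows rather than original rows.

```r
library(dplyr)
library(reshape2)

# Compact rebuild of `dta` from the dput() in the question (assumed equivalent).
labs  <- c(".", "alone", "children", "nuclear", "partner", "*", "%")
t2130 <- c(2, 7, 5, 5, 2, 5, 4, 7, 1, 5, 4, 5, 4, 5, 5, 4, 5, 5, 5, 4)
t2145 <- replace(t2130, 19, 2)  # the two columns differ only in row 19
dta <- data.frame(
  cluster = factor(letters[1:5][c(3, 2, 5, 2, 2, 3, 5, 3, 1, 3,
                                  1, 2, 5, 3, 2, 2, 2, 2, 4, 5)]),
  `21:30` = factor(labs[t2130], levels = labs),
  `21:45` = factor(labs[t2145], levels = labs),
  check.names = FALSE
)

# Count rows per cluster AFTER melting.
melted_counts <- dta %>%
  melt(id.vars = "cluster") %>%
  group_by(cluster) %>%
  summarise(n = n())

# Counts on the original data, for comparison.
original_counts <- as.vector(table(dta$cluster))

melted_counts$n   # values: 4, 16, 10, 2, 8
original_counts   # values: 2, 8, 5, 1, 4
```

The melted counts are exactly twice the `table` counts because two time columns were melted, so the count has to be taken on the unmelted data.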

After melting the data, what I need is the `table` of cluster.

So I need the counts to come out correctly for this data.frame

 dta %>% 
 melt(id.vars = 'cluster')

The expected output is this:

      cluster variable   value n_cluster
1        a    21:30       .         2
2        a    21:30 nuclear         2
3        a    21:45       .         2
4        a    21:45 nuclear         2
5        b    21:30       %         8
6        b    21:30 partner         8
7        b    21:30   alone         8
8        b    21:30 partner         8
9        b    21:30 partner         8
10       b    21:30 nuclear         8
11       b    21:30 partner         8
12       b    21:30 partner         8
13       b    21:45       %         8
14       b    21:45 partner         8
15       b    21:45   alone         8
16       b    21:45 partner         8
17       b    21:45 partner         8
18       b    21:45 nuclear         8
19       b    21:45 partner         8
20       b    21:45 partner         8
21       c    21:30   alone         5
22       c    21:30 partner         5
23       c    21:30       %         5
24       c    21:30 partner         5
25       c    21:30 partner         5
26       c    21:45   alone         5
27       c    21:45 partner         5
28       c    21:45       %         5
29       c    21:45 partner         5
30       c    21:45 partner         5
31       d    21:30 partner         1
32       d    21:45   alone         1
33       e    21:30 partner         4
34       e    21:30 nuclear         4
35       e    21:30 nuclear         4
36       e    21:30 nuclear         4
37       e    21:45 partner         4
38       e    21:45 nuclear         4
39       e    21:45 nuclear         4
40       e    21:45 nuclear         4

Any ideas?

dta = structure(list(cluster = structure(c(3L, 2L, 5L, 2L, 2L, 3L, 
5L, 3L, 1L, 3L, 1L, 2L, 5L, 3L, 2L, 2L, 2L, 2L, 4L, 5L), .Label = c("a", 
"b", "c", "d", "e"), class = "factor"), `21:30` = structure(c(2L, 
7L, 5L, 5L, 2L, 5L, 4L, 7L, 1L, 5L, 4L, 5L, 4L, 5L, 5L, 4L, 5L, 
5L, 5L, 4L), .Label = c(".", "alone", "children", "nuclear", 
"partner", "*", "%"), class = "factor"), `21:45` = structure(c(2L, 
7L, 5L, 5L, 2L, 5L, 4L, 7L, 1L, 5L, 4L, 5L, 4L, 5L, 5L, 4L, 5L, 
5L, 2L, 4L), .Label = c(".", "alone", "children", "nuclear", 
"partner", "*", "%"), class = "factor")), .Names = c("cluster", 
"21:30", "21:45"), row.names = c("4", "6", "12", "14", "20", 
"22", "23", "28", "30", "32", "36", "38", "40", "42", "44", "48", 
"50", "56", "57", "60"), class = "data.frame")

Dav*_*urg 6

I can't seem to find a good dupe for this, but a simple dplyr idiom is to use `count`:

count(dta, cluster)
# Source: local data frame [5 x 2]
# 
#   cluster n
# 1       a 2
# 2       b 8
# 3       c 5
# 4       d 1
# 5       e 4
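As a hedged illustration of what `count` does here: it is roughly shorthand for `group_by` followed by `summarise(n = n())`. The sketch below assumes `dta` is rebuilt compactly from the question's `dput` output; `labs`, `t2130`, `t2145`, `via_count` and `via_summarise` are names made up for this example.

```r
library(dplyr)

# Compact rebuild of `dta` from the dput() in the question (assumed equivalent).
labs  <- c(".", "alone", "children", "nuclear", "partner", "*", "%")
t2130 <- c(2, 7, 5, 5, 2, 5, 4, 7, 1, 5, 4, 5, 4, 5, 5, 4, 5, 5, 5, 4)
t2145 <- replace(t2130, 19, 2)  # the two columns differ only in row 19
dta <- data.frame(
  cluster = factor(letters[1:5][c(3, 2, 5, 2, 2, 3, 5, 3, 1, 3,
                                  1, 2, 5, 3, 2, 2, 2, 2, 4, 5)]),
  `21:30` = factor(labs[t2130], levels = labs),
  `21:45` = factor(labs[t2145], levels = labs),
  check.names = FALSE
)

# The idiom from the answer.
via_count <- count(dta, cluster)

# The longer spelling: group, then count the rows in each group.
via_summarise <- dta %>%
  group_by(cluster) %>%
  summarise(n = n())

via_count$n       # values: 2, 8, 5, 1, 4
via_summarise$n   # same values
```

Both forms count rows of the unmelted `dta`, which is why they agree with `table(dta$cluster)`.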

Per your new desired output, you can join this result to your melted data set:

dta %>% 
  melt(id.vars = 'cluster')  %>% 
  left_join(., count(dta, cluster), by = "cluster") %>%  # explicit key avoids the "Joining by" message
  arrange(cluster)
#    cluster variable   value n
# 1        a    21:30       . 2
# 2        a    21:30 nuclear 2
# 3        a    21:45       . 2
# 4        a    21:45 nuclear 2
# 5        b    21:30       % 8
# 6        b    21:30 partner 8
# 7        b    21:30   alone 8
#...
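A possible variation, not part of the answer above: instead of joining, the count can be attached before melting with dplyr's `add_count`, then carried through `melt` as an extra id variable. This is a sketch under assumptions: `dta` is rebuilt compactly from the question's `dput` output, the `name` argument of `add_count` requires a reasonably recent dplyr, and `labs`, `t2130`, `t2145` and `result` are illustrative names. The column is called `n_cluster` to match the expected output in the question.

```r
library(dplyr)
library(reshape2)

# Compact rebuild of `dta` from the dput() in the question (assumed equivalent).
labs  <- c(".", "alone", "children", "nuclear", "partner", "*", "%")
t2130 <- c(2, 7, 5, 5, 2, 5, 4, 7, 1, 5, 4, 5, 4, 5, 5, 4, 5, 5, 5, 4)
t2145 <- replace(t2130, 19, 2)  # the two columns differ only in row 19
dta <- data.frame(
  cluster = factor(letters[1:5][c(3, 2, 5, 2, 2, 3, 5, 3, 1, 3,
                                  1, 2, 5, 3, 2, 2, 2, 2, 4, 5)]),
  `21:30` = factor(labs[t2130], levels = labs),
  `21:45` = factor(labs[t2145], levels = labs),
  check.names = FALSE
)

result <- dta %>%
  # Attach the per-cluster row count while the data is still unmelted.
  add_count(cluster, name = "n_cluster") %>%
  # Keep the count as an id variable so it is repeated on every melted row.
  melt(id.vars = c("cluster", "n_cluster")) %>%
  # Reorder columns to match the layout of the expected output.
  select(cluster, variable, value, n_cluster) %>%
  arrange(cluster)

head(result)
```

Counting before the melt sidesteps the doubling that `n()` would show on the melted rows, at the cost of listing `n_cluster` among the id variables.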

  • Thanks for sparing me a night with Excel! ;) (just kidding!) (2 upvotes)