Performing an rbind-like operation in dplyr or tidyr?

jal*_*pic 7 r rbind dplyr tidyr

Given the following data, I'm interested in how many unique partners each fruit has.

My df:

       fruit1 fruit2
    1   guava   kiwi
    2   lemon   pear
    3    pear  apple
    4   guava   kiwi
    5    pear  guava
    6   apple   kiwi
    7  banana  lemon
    8   lemon   kiwi
    9   apple banana
    10  lemon  guava

I'm trying to get to grips with dplyr and tidyr. For this, I thought n_distinct() from dplyr would be a good fit. I did the following:

rbind(df %>% select(fruita = fruit1, fruitb = fruit2),
      df %>% select(fruita = fruit2, fruitb = fruit1)) %>%
  group_by(fruita) %>%
  summarise(Partners = n_distinct(fruitb)) %>%
  arrange(desc(Partners))

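This essentially appends a copy of the 10 rows underneath the original, with the order of the two fruit columns swapped in the bottom half. Then, for each fruit in the new first column, I use n_distinct() to count how many unique partner fruits it has in the new second column.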

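This works fine, but given how elegant dplyr and tidyr are, I'm wondering whether there is a more efficient way to do this, and in particular whether there is a way to perform the rbind step using one of these packages?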

The final data looks like this:

  fruita Partners
1  lemon        4
2  apple        3
3  guava        3
4   pear        3
5   kiwi        3
6 banana        2

Data to reproduce:

structure(list(fruit1 = structure(c(3L, 4L, 5L, 3L, 5L, 1L, 2L, 
4L, 1L, 4L), .Label = c("apple", "banana", "guava", "lemon", 
"pear"), class = "factor"), fruit2 = structure(c(4L, 6L, 1L, 
4L, 3L, 4L, 5L, 4L, 2L, 3L), .Label = c("apple", "banana", "guava", 
"kiwi", "lemon", "pear"), class = "factor")), .Names = c("fruit1", 
"fruit2"), class = "data.frame", row.names = c(NA, -10L))

akr*_*run 7

Not sure if this helps:

df %>% 
do(data.frame(fruita=unlist(.), fruitb=unlist(.[,2:1]))) %>%
group_by(fruita) %>% 
summarise(Partners=n_distinct(fruitb)) %>% 
arrange(desc(Partners))
#Source: local data frame [6 x 2]

#    fruita Partners
#  1  lemon        4
#  2  apple        3
#  3  guava        3
#  4   pear        3
#  5   kiwi        3
#  6 banana        2

  • On reflection, I noticed that the `do` line is unnecessary. This should also work: `data.frame(fruita = unlist(df), fruitb = unlist(df[, 2:1])) %>%` etc. (2 upvotes)
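
On the rbind part of the question specifically: dplyr provides `bind_rows()`, which stacks data frames by column name and can stand in for `rbind()` in the original pipeline. Below is a minimal sketch, not a definitive answer. It assumes the two columns are plain character vectors; the question's data stores them as factors with different level sets, and how `bind_rows()` treats mismatched factor levels may depend on the dplyr version.

```r
library(dplyr)

# The question's data, rebuilt as character columns (an assumption made
# here to sidestep the differing factor levels of fruit1 and fruit2).
df <- data.frame(
  fruit1 = c("guava", "lemon", "pear", "guava", "pear",
             "apple", "banana", "lemon", "apple", "lemon"),
  fruit2 = c("kiwi", "pear", "apple", "kiwi", "guava",
             "kiwi", "lemon", "kiwi", "banana", "guava"),
  stringsAsFactors = FALSE
)

# Stack the original pairs on top of the mirrored pairs with bind_rows(),
# then count the distinct partners of each fruit.
partners <- bind_rows(
  df %>% select(fruita = fruit1, fruitb = fruit2),
  df %>% select(fruita = fruit2, fruitb = fruit1)
) %>%
  group_by(fruita) %>%
  summarise(Partners = n_distinct(fruitb)) %>%
  arrange(desc(Partners))

partners
```

The counts match the table in the question (lemon 4, banana 2, the rest 3). With character columns the tied fruits are ordered alphabetically, so the rows with 3 partners may appear in a different order than in the factor-based output.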