tidyr::expand() for a single column across groups

Art*_*lov 5 r dplyr tidyr

tidyr::expand() returns all possible combinations of values from multiple columns. I'm looking for a slightly different behavior, where all the values are in a single column and the combinations are to be taken across groups.

For example, let the data be defined as follows:

library( tidyverse )
X <- bind_rows( tibble(Group = "Group1", Value = LETTERS[1:3]),
                tibble(Group = "Group2", Value = letters[4:5]) )

We want all combinations of values from Group1 with values from Group2. My current clunky solution is to separate the values across multiple columns:

Y <- X %>% group_by(Group) %>% do(vals = .$Value) %>% spread(Group, vals)
# # A tibble: 1 x 2
#   Group1    Group2   
#   <list>    <list>   
# 1 <chr [3]> <chr [2]>

followed by a double unnest operation:

Y %>% unnest( .preserve = Group2 ) %>% unnest
# # A tibble: 6 x 2
#   Group1 Group2
#   <chr>  <chr> 
# 1 A      d     
# 2 A      e     
# 3 B      d     
# 4 B      e     
# 5 C      d     
# 6 C      e     

This is the desired output, but as you can imagine, this solution doesn't generalize well: as the number of groups increases, so does the number of unnest operations that we have to perform.

Is there a more elegant solution?

Hen*_*rik 5

Since the OP seems happy to use base R, I'm upgrading my comment to an answer:

expand.grid(split(X$Value, X$Group))
#   Group1 Group2
# 1      A      d
# 2      B      d
# 3      C      d
# 4      A      e
# 5      B      e
# 6      C      e

As the OP noted, expand.grid converts character vectors to factors. To prevent this, use stringsAsFactors = FALSE.
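As a minimal sketch of that option, the snippet below rebuilds the question's data with base R only (the name `res` is just illustrative) and checks that the resulting columns stay character:

```r
# Same data as in the question, built without the tidyverse
X <- data.frame(Group = rep(c("Group1", "Group2"), times = c(3, 2)),
                Value = c(LETTERS[1:3], letters[4:5]),
                stringsAsFactors = FALSE)

# split() yields one character vector per group; expand.grid() crosses them.
# stringsAsFactors = FALSE keeps the columns as character instead of factor.
res <- expand.grid(split(X$Value, X$Group), stringsAsFactors = FALSE)
sapply(res, class)
```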

The tidyverse equivalent is purrr::cross_df, which does not coerce strings to factors:

cross_df(split(X$Value, X$Group))
# A tibble: 6 x 2
# Group1 Group2
# <chr>  <chr> 
# 1 A      d     
# 2 B      d     
# 3 C      d     
# 4 A      e     
# 5 B      e     
# 6 C      e
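Note that `cross_df()` was deprecated in purrr 1.0.0 in favour of `tidyr::expand_grid()`. Below is a hedged sketch of that route, assuming a tidyr version that provides `expand_grid()` (1.0.0 or later) and splicing the split list with `!!!`. The third group and the names `X3` and `res` are made up for illustration, to suggest that the call does not change as groups are added:

```r
library(tidyr)

# The question's data plus a hypothetical third group
X3 <- data.frame(Group = rep(c("Group1", "Group2", "Group3"), times = c(3, 2, 2)),
                 Value = c(LETTERS[1:3], letters[4:5], "x", "y"),
                 stringsAsFactors = FALSE)

# split() gives a named list with one character vector per group;
# !!! splices its elements into expand_grid() as separate named arguments
res <- expand_grid(!!!split(X3$Value, X3$Group))
dim(res)  # 3 * 2 * 2 = 12 rows, one column per group
```

Unlike `expand.grid()`, `expand_grid()` varies the first column slowest, so the row order follows the layout of the desired output in the question.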