Cross join in dplyr in R

Question

library(dplyr)
cust_time<-data.frame(cid=c("c1","c2","c3","c4","c5"),ts=c(2,7,11,13,17))
#I want to do a cross join of the table with itself, preferably in dplyr; otherwise the base package is OK
#But without renaming the column headers
#Currently I have to create a renamed duplicate of cust_time to do this.
cust_time.1<-rename(cust_time,cid1=cid,ts1=ts)
merge(cust_time,cust_time.1,by=NULL)

#Later I will want to do cross join within the grouped region
cust_time <-mutate(cust_time,ts.bucket=ts%/%10)
#If using duplicate tables, not sure, how to do the below
#group_by(cust_time,ts.bucket) %>%
#do cross join within this bucket

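Basically, I want to do a cross self-join on a table. Since I could not find a way to do it in dplyr, I used the base package, but that requires me to rename all the columns. Later I also want to be able to do the cross join at the group level, and that is where I am stuck.
Any help is appreciated.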

Answer 1

Gre*_*gor 17

As of dplyr version 1.0, you can do a cross join by specifying by = character():

cust_time %>% full_join(cust_time, by = character())
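
On newer installations a dedicated function is available: dplyr 1.1.0 added cross_join() and deprecated by = character() in its favor. A minimal sketch, assuming dplyr 1.1.0 or later is installed; columns that share a name are disambiguated with the default .x/.y suffixes, so no renaming is needed:

```r
library(dplyr)

# Same sample data as in the question.
cust_time <- data.frame(cid = c("c1", "c2", "c3", "c4", "c5"),
                        ts  = c(2, 7, 11, 13, 17))

# Pair every row of cust_time with every row of itself (5 x 5 = 25 rows).
crossed <- cross_join(cust_time, cust_time)
```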

Answer 2

att*_*ool 11

You just need a dummy column to join on:

cust_time$k <- 1
cust_time %>% 
  inner_join(cust_time, by='k') %>%
  select(-k)

Or, if you don't want to modify the original data frame:

cust_time %>%
  mutate(k = 1) %>%
  replicate(2, ., simplify=FALSE) %>%
  Reduce(function(a, b) inner_join(a, b, by='k'), .) %>%
  select(-k)
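
The same key-column idea may also cover the grouped case from the question: joining the table to itself on the bucket column pairs each row only with the rows in the same bucket. This is a sketch, assuming the ts.bucket column defined in the question; the suffix values are an illustrative choice. dplyr 1.1.0 or later may warn about a many-to-many relationship here, which is expected for this kind of join.

```r
library(dplyr)

# Sample data from the question, with the bucket column added.
cust_time <- data.frame(cid = c("c1", "c2", "c3", "c4", "c5"),
                        ts  = c(2, 7, 11, 13, 17)) %>%
  mutate(ts.bucket = ts %/% 10)

# Joining on ts.bucket gives a cross join within each bucket:
# bucket 0 has 2 rows (2 x 2 pairs), bucket 1 has 3 rows (3 x 3 pairs).
# The suffix keeps the left-hand names and appends "1" to the right-hand ones.
within_bucket <- cust_time %>%
  inner_join(cust_time, by = "ts.bucket", suffix = c("", "1"))
```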

Answer 3

Cur*_* F. 5

Here is a fully dplyr-compatible solution. It shares many of the same ideas as attitude_stool's solution, but has the advantage of being only one line.

require(magrittr)  # for the %<>% operator

# one line:
(cust_time %<>% mutate(foo = 1)) %>% 
        full_join(cust_time, by = 'foo') %>% 
        select(-foo)

Without modifying the original data: `cust_time %>% mutate(foo=1) %>% full_join(.,., by="foo") %>% select(-foo)` (2 upvotes)
