How do I join a data frame to itself within a dplyr chain?

cra*_*lly 5 r dplyr magrittr

Occasionally I need to join a data frame to a (usually modified) version of itself within a dplyr chain. Something like this:

df  <- data.frame(
     id = c(1,2,3)
   , status = c('foo','bar','meh')
   , spouseid = c(4,3,2)
)


df %>% 
  filter( status == 'foo' | status == 'bar') %>% 
  # join the filtered table to itself using the dot as the right-hand side
  left_join(., by = c('id' = 'spouseid'))

When I try that, I get: Error in is.data.frame(y) : argument "y" is missing, with no default.

cra*_*lly 6

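The problem is that using the dot as an argument only changes where the left-hand side (lhs) is placed, so the call as written above passes the lhs to left_join() just once, as x, and y is never supplied. To use the lhs on both the left and right side of the join, use the dot twice: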

df %>% 
  filter( status == 'foo' | status == 'bar') %>% 
  # the first dot is x argument and the second dot is the y argument
  left_join(
      x = . 
    , y = . 
    , by = c('id' = 'spouseid')
  )

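This way you pass the lhs to both arguments of left_join() explicitly, rather than relying on magrittr's implicit placement of the lhs as you normally would.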
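The question mentions joining to a modified version of the data frame. A minimal sketch of that case, assuming the modification can be written as a nested call that takes the dot: here the second dot is wrapped in select(), and the column name spouse_status is a hypothetical name chosen for illustration. The filter is changed to 'bar' and 'meh' so that the join actually finds matches in this toy data.

```r
library(dplyr)

df <- data.frame(
  id = c(1, 2, 3),
  status = c('foo', 'bar', 'meh'),
  spouseid = c(4, 3, 2)
)

result <- df %>%
  # assumption: a different filter than in the question, so rows match
  filter(status %in% c('bar', 'meh')) %>%
  left_join(
    x = .,
    # nested dot: a modified copy of the same filtered data;
    # spouse_status is a hypothetical column name
    y = select(., spouseid, spouse_status = status),
    by = c('id' = 'spouseid')
  )

print(result)
```

Because the first dot is a direct argument of left_join(), magrittr does not insert the lhs a second time, and the nested dot inside select() still refers to the same filtered data frame.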

  • Doesn't seem to work: df <- data.frame(Team = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "D", "D"), Date = c("2016-05-10", "2016-05-10", "2016-05-10", "2016-05-10", "2016-05-12", "2016-05-12", "2016-05-12", "2016-05-15", "2016-05-15", "2016-05-30", "2016-05-30"), Points = c(1,4,3,2,1,5,6,1,2,3,9) ) df %>% left_join(., . %>% distinct(Team, Date) ) %>% mutate(Date_Lagged = lag(Date)))
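A likely reason the snippet in the comment above fails: in magrittr, a pipeline that starts with a bare dot, such as `. %>% distinct(Team, Date)`, builds a function rather than a data frame, so left_join() receives a function as y. A minimal sketch of one way around this, assuming the intent was simply to join against the distinct Team/Date pairs, is to pass the dot to distinct() as a nested call. The explicit by argument is an addition not present in the comment.

```r
library(dplyr)

df <- data.frame(
  Team = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "D", "D"),
  Date = c("2016-05-10", "2016-05-10", "2016-05-10", "2016-05-10",
           "2016-05-12", "2016-05-12", "2016-05-12",
           "2016-05-15", "2016-05-15", "2016-05-30", "2016-05-30"),
  Points = c(1, 4, 3, 2, 1, 5, 6, 1, 2, 3, 9)
)

# A pipeline that starts with a bare dot is a function, not a data frame
f <- . %>% distinct(Team, Date)
print(is.function(f))

result <- df %>%
  # nested dot: distinct() is evaluated on the lhs and returns a data frame
  left_join(., distinct(., Team, Date), by = c("Team", "Date")) %>%
  mutate(Date_Lagged = lag(Date))

print(result)
```

This only shows how to get a data frame into the y argument; whether the lagged column matches what the commenter wanted is not something the comment makes clear.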