do() is superseded! The alternative is to use across(), nest_by() and summarise(); how do I do that?

Das*_*asr 6 r dplyr purrr tidyverse

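I'm doing something very simple. Given a data frame with the start and end dates of particular periods, I want to expand each period (one row per factor level) into its full sequence binned by week, and then output everything into a single large data frame.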

For example:

library(tidyverse)
library(lubridate)

# Dataset
  start_dates = ymd_hms(c("2019-05-08 00:00:00",
                          "2020-01-17 00:00:00",
                          "2020-03-03 00:00:00",
                          "2020-05-28 00:00:00",
                          "2020-12-10 00:00:00",
                          "2021-05-07 00:00:00",
                          "2022-01-04 00:00:00"), tz = "UTC")
  
  end_dates = ymd_hms(c( "2019-10-24 00:00:00",
                         "2020-03-03 00:00:00", 
                         "2020-05-28 00:00:00",
                         "2020-12-10 00:00:00",
                         "2021-05-07 00:00:00",
                         "2022-01-04 00:00:00",
                         "2022-01-19 00:00:00"), tz = "UTC") 
  
  # paste0() has no `sep` argument, so the stray `sep = ""` is dropped
  df1 = data.frame(studying = paste0("period", 1:7), start_dates, end_dates)

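Someone suggested that I use do(), which currently works fine, but I dislike relying on things that have been superseded. I also have an approach that uses map2(). However, the documentation (https://dplyr.tidyverse.org/reference/do.html) says that nest_by(), across() and summarise() can do the same job as do(). How would I go about getting the same result with them? I have tried many things, but I can't seem to get it to work.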

# do() way to do it
df1 %>% 
  group_by(studying) %>% 
  do(data.frame(week=seq(.$start_dates,.$end_dates,by="1 week")))
# transmute() way to do it
df1 %>% 
  transmute(weeks = map2(start_dates, end_dates, seq, by = "1 week"), studying) %>% 
  unnest(cols = c(weeks))

Tim*_*Fan 3

As the documentation for ?do shows, we can now use summarise() and replace `.` with across():

library(tidyverse)
library(lubridate)

df1 %>% 
  group_by(studying) %>% 
  summarise(week = seq(across()$start_dates,
                       across()$end_dates,
                       by = "1 week"))
#> `summarise()` has grouped output by 'studying'. You can override using the
#> `.groups` argument.
#> # A tibble: 134 x 2
#> # Groups:   studying [7]
#>    studying week               
#>    <chr>    <dttm>             
#>  1 period1  2019-05-08 00:00:00
#>  2 period1  2019-05-15 00:00:00
#>  3 period1  2019-05-22 00:00:00
#>  4 period1  2019-05-29 00:00:00
#>  5 period1  2019-06-05 00:00:00
#>  6 period1  2019-06-12 00:00:00
#>  7 period1  2019-06-19 00:00:00
#>  8 period1  2019-06-26 00:00:00
#>  9 period1  2019-07-03 00:00:00
#> 10 period1  2019-07-10 00:00:00
#> # … with 124 more rows

Created on 2022-01-19 by the reprex package (v0.3.0)
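For readers on a newer dplyr, here is a minimal sketch of the same idea without an empty across() call. It assumes dplyr 1.1.0 or later, where reframe() is intended for results with several rows per group and multi-row summarise() results are deprecated. The names weeks_reframe and weeks_nested are made up for illustration, and df1 is rebuilt from the question so the block runs on its own. The second variant uses nest_by(), which the question asked about.

```r
library(dplyr)
library(lubridate)

# Data from the question, rebuilt so this sketch is self-contained
start_dates <- ymd_hms(c("2019-05-08 00:00:00", "2020-01-17 00:00:00",
                         "2020-03-03 00:00:00", "2020-05-28 00:00:00",
                         "2020-12-10 00:00:00", "2021-05-07 00:00:00",
                         "2022-01-04 00:00:00"), tz = "UTC")
end_dates <- ymd_hms(c("2019-10-24 00:00:00", "2020-03-03 00:00:00",
                       "2020-05-28 00:00:00", "2020-12-10 00:00:00",
                       "2021-05-07 00:00:00", "2022-01-04 00:00:00",
                       "2022-01-19 00:00:00"), tz = "UTC")
df1 <- data.frame(studying = paste0("period", 1:7), start_dates, end_dates)

# Variant 1 (assumes dplyr >= 1.1.0): group_by() + reframe().
# Each group holds exactly one row, so the columns can be used directly.
weeks_reframe <- df1 %>%
  group_by(studying) %>%
  reframe(week = seq(start_dates, end_dates, by = "1 week"))

# Variant 2 (assumes dplyr >= 1.1.0): nest_by() stores each group's remaining
# columns in a list-column called `data`, and the result is processed rowwise.
weeks_nested <- df1 %>%
  nest_by(studying) %>%
  reframe(week = seq(data$start_dates, data$end_dates, by = "1 week"))
```

Both variants should return an ungrouped tibble with the columns studying and week. On dplyr versions before 1.1.0, reframe() is not available and summarise() would be used in its place, as in the answer above.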
