tidyr unnest: prefix column names with the nested column's name during unnesting

JWi*_*man 5 r unnest tidyr

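When running unnest on a data.frame, is there a way to add the name of the nested item to the individual columns it contains (as a suffix or prefix)? Or does the renaming have to be done manually via rename?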

This is particularly relevant when unnesting multiple groups that contain columns with the same names.

In the example below, the base aggregate command handles this nicely (e.g. Petal.Length.mn), but I can't find an option that makes unnest do the same.

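I'm using nest with purrr::map because I want the flexibility to mix functions, e.g. compute the mean and standard deviation of several variables and also run a test to see how they differ.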


library(dplyr, warn.conflicts = FALSE)

msd_c <- function(x) c(mn = mean(x), sd = sd(x))
msd_df <- function(x) bind_rows(c(mn = mean(x), sd = sd(x)))

aggregate(cbind(Petal.Length, Petal.Width) ~ Species, 
          data = iris, FUN = msd_c)
#>      Species Petal.Length.mn Petal.Length.sd Petal.Width.mn Petal.Width.sd
#> 1     setosa       1.4620000       0.1736640      0.2460000      0.1053856
#> 2 versicolor       4.2600000       0.4699110      1.3260000      0.1977527
#> 3  virginica       5.5520000       0.5518947      2.0260000      0.2746501

iris %>% 
  select(Petal.Length:Species) %>% 
  group_by(Species) %>% 
  tidyr::nest() %>% 
  mutate(
    Petal.Length = purrr::map(data, ~ msd_df(.$Petal.Length)),
    Petal.Width = purrr::map(data, ~ msd_df(.$Petal.Width)),
    Correlation = purrr::map(data, ~ broom::tidy(cor.test(.$Petal.Length, .$Petal.Width))),
  ) %>% 
  select(-data) %>% 
  tidyr::unnest(c(Petal.Length, Petal.Width, Correlation), names_repair = tidyr::tidyr_legacy)
#> # A tibble: 3 x 13
#> # Groups:   Species [3]
#>   Species    mn    sd   mn1   sd1 estimate statistic  p.value parameter conf.low
#>   <fct>   <dbl> <dbl> <dbl> <dbl>    <dbl>     <dbl>    <dbl>     <int>    <dbl>
#> 1 setosa   1.46 0.174 0.246 0.105    0.332      2.44 1.86e- 2        48   0.0587
#> 2 versic~  4.26 0.470 1.33  0.198    0.787      8.83 1.27e-11        48   0.651 
#> 3 virgin~  5.55 0.552 2.03  0.275    0.322      2.36 2.25e- 2        48   0.0481
#> # ... with 3 more variables: conf.high <dbl>, method <chr>, alternative <chr>

Created on 2020-05-20 by the reprex package (v0.3.0)

JWi*_*man 11

The answer is somewhat obvious: use the names_sep option instead of the names_repair option. As the nest help page says about names_sep:

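If a string, the inner and outer names will be used together. In nest(), the names of the new outer columns will be formed by pasting together the outer and the inner column names, separated by names_sep. In unnest(), the new inner names will have the outer names (+ names_sep) automatically stripped. This makes names_sep roughly symmetric between nesting and unnesting.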


library(dplyr, warn.conflicts = FALSE)

msd_c <- function(x) c(mn = mean(x), sd = sd(x))
msd_df <- function(x) bind_rows(c(mn = mean(x), sd = sd(x)))

iris %>% 
  select(Petal.Length:Species) %>% 
  group_by(Species) %>% 
  tidyr::nest() %>% 
  mutate(
    Petal.Length = purrr::map(data, ~ msd_df(.$Petal.Length)),
    Petal.Width = purrr::map(data, ~ msd_df(.$Petal.Width)),
    Correlation = purrr::map(data, ~ broom::tidy(cor.test(.$Petal.Length, .$Petal.Width))),
  ) %>% 
  select(-data) %>% 
  tidyr::unnest(c(Petal.Length, Petal.Width, Correlation), names_sep = ".")
#> # A tibble: 3 x 13
#> # Groups:   Species [3]
#>   Species Petal.Length.mn Petal.Length.sd Petal.Width.mn Petal.Width.sd
#>   <fct>             <dbl>           <dbl>          <dbl>          <dbl>
#> 1 setosa             1.46           0.174          0.246          0.105
#> 2 versic~            4.26           0.470          1.33           0.198
#> 3 virgin~            5.55           0.552          2.03           0.275
#> # ... with 8 more variables: Correlation.estimate <dbl>,
#> #   Correlation.statistic <dbl>, Correlation.p.value <dbl>,
#> #   Correlation.parameter <int>, Correlation.conf.low <dbl>,
#> #   Correlation.conf.high <dbl>, Correlation.method <chr>,
#> #   Correlation.alternative <chr>

Created on 2020-06-10 by the reprex package (v0.3.0)
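
To see the effect in isolation, here is a minimal sketch on a toy tibble. The data frame df and the columns id, a and b are made up for illustration and are not part of the iris example; only unnest() and its names_sep argument come from the code above.

```r
library(tidyr)

# Hypothetical toy data: two list-columns whose inner tibbles
# share the same column names (mn, sd), as in the iris example
df <- tibble(
  id = 1:2,
  a  = list(tibble(mn = 1, sd = 0.1), tibble(mn = 2, sd = 0.2)),
  b  = list(tibble(mn = 3, sd = 0.3), tibble(mn = 4, sd = 0.4))
)

# names_sep prefixes every inner column with the name of the
# list-column it came from, so the duplicated names no longer clash
flat <- unnest(df, c(a, b), names_sep = "_")

names(flat)
#> [1] "id"   "a_mn" "a_sd" "b_mn" "b_sd"
```

Any string works as the separator; "." reproduces the aggregate-style names (Petal.Length.mn), while "_" may be easier to handle in later dplyr code.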