Tidyverse: Reduce variables by group

dig*_*395 5 r tidyverse

I have a data frame that looks like this:

ID  pick1      pick2     pick3
1   NA         21/11/29  21/11/30
2   21/11/28   21/11/29  NA
3   21/11/28   NA        21/11/30   
4   NA         21/11/29  21/11/30

Each participant (ID) could pick 2 dates out of 3 options. Now I want to summarise the picked dates to get a tibble like this:

ID  date1      date2
1   21/11/29   21/11/30
2   21/11/28   21/11/29
3   21/11/28   21/11/30   
4   21/11/29   21/11/30

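However, I could not get this to work using only tidyverse functions. I have only just started using this library and could not find a solution to my problem online.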

Ice*_*can 5

You can use data.table to apply the same approach as in @akrun's answer: reshape to long, then back to wide. The syntax is a bit more concise.

df1 <- structure(list(ID = 1:4, pick1 = c(NA, "21/11/28", "21/11/28", 
NA), pick2 = c("21/11/29", "21/11/29", NA, "21/11/29"), pick3 = c("21/11/30", 
NA, "21/11/30", "21/11/30")), class = "data.frame",
 row.names = c(NA, 
-4L))

library(data.table)
setDT(df1)

dcast(
  melt(df1, 'ID', na.rm = TRUE),
  ID ~ paste0('date', rowid(ID)))
#>    ID    date1    date2
#> 1:  1 21/11/29 21/11/30
#> 2:  2 21/11/28 21/11/29
#> 3:  3 21/11/28 21/11/30
#> 4:  4 21/11/29 21/11/30

Created on 2021-11-29 by the reprex package (v2.0.1)


akr*_*run 4

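One option is with rowwise: group by row, sort the picks with na.last set to TRUE, keep the sorted output in a list, unnest it into multiple columns with unnest_wider, and select only the columns that contain at least one non-NA element.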

library(dplyr)
library(tidyr)
library(stringr)

df1 %>%
  rowwise() %>%
  # sort each row's picks, pushing NA to the end
  transmute(ID, date = list(sort(c_across(starts_with('pick')),
      na.last = TRUE))) %>%
  ungroup() %>%
  # spread the list column into one column per element;
  # names_sep lets tidyr name the unnamed elements (recent versions may require it)
  unnest_wider(date, names_sep = '') %>%
  rename_with(~ str_c('date', seq_along(.)), -ID) %>%
  # drop columns that are entirely NA
  select(where(~ any(!is.na(.))))

-output

# A tibble: 4 × 3
     ID date1    date2   
  <int> <chr>    <chr>   
1     1 21/11/29 21/11/30
2     2 21/11/28 21/11/29
3     3 21/11/28 21/11/30
4     4 21/11/29 21/11/30

Or reshape to 'long' format with pivot_longer, dropping the NAs, and then reshape it back to 'wide' format.

library(stringr)
df1 %>%
  pivot_longer(cols = -ID, values_drop_na = TRUE) %>%
  group_by(ID) %>%
  mutate(name = str_c('date', row_number())) %>%
  ungroup() %>%
  pivot_wider(names_from = name, values_from = value)

-output

# A tibble: 4 × 3
     ID date1    date2   
  <int> <chr>    <chr>   
1     1 21/11/29 21/11/30
2     2 21/11/28 21/11/29
3     3 21/11/28 21/11/30
4     4 21/11/29 21/11/30
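
To see why row_number() produces the date1/date2 labels, it can help to stop the pipeline before pivot_wider and inspect the intermediate long format. This is only a minimal sketch: the name `long` is introduced here for illustration, and df1 is rebuilt with data.frame() so the snippet runs on its own.

```r
library(dplyr)
library(tidyr)

# Same values as df1 in this answer, rebuilt so the sketch is self-contained
df1 <- data.frame(
  ID = 1:4,
  pick1 = c(NA, "21/11/28", "21/11/28", NA),
  pick2 = c("21/11/29", "21/11/29", NA, "21/11/29"),
  pick3 = c("21/11/30", NA, "21/11/30", "21/11/30")
)

# `long` is a hypothetical name for the intermediate result
long <- df1 %>%
  # one row per (ID, pick) pair; rows whose value is NA are dropped
  pivot_longer(cols = -ID, values_drop_na = TRUE) %>%
  group_by(ID) %>%
  # number the remaining picks within each ID: date1, date2
  mutate(name = paste0("date", row_number())) %>%
  ungroup()

long
```

Each ID is left with exactly two rows after the NAs are dropped, so pivot_wider only ever creates the columns date1 and date2.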

data

df1 <- structure(list(ID = 1:4, pick1 = c(NA, "21/11/28", "21/11/28", 
NA), pick2 = c("21/11/29", "21/11/29", NA, "21/11/29"), pick3 = c("21/11/30", 
NA, "21/11/30", "21/11/30")), class = "data.frame",
 row.names = c(NA, 
-4L))