Maë*_*aël 4 r data-manipulation dataframe dplyr
Suppose I have a dataset like this:

dat <- tibble(id = 1:4,
              col1 = c(0, 1, 1, 0),
              col2 = c(1, 0, 1, 0),
              col3 = c(1, 1, 0, 1))

> dat
# A tibble: 4 × 4
     id  col1  col2  col3
  <int> <dbl> <dbl> <dbl>
1     1     0     1     1
2     2     1     0     1
3     3     1     1     0
4     4     0     0     1

For each unique id, I want to split multiple 1s across multiple rows, i.e. the expected output is:

# A tibble: 7 × 4
     id  col1  col2  col3
  <dbl> <dbl> <dbl> <dbl>
1     1     0     1     0
2     1     0     0     1
3     2     1     0     0
4     2     0     0     1
5     3     1     0     0
6     3     0     1     0
7     4     0     0     1

For the first id (id = 1), col2 and col3 are both 1, so I want a separate row for each of them. It is somewhat like one-hot encoding of the rows.

With the help of Ritchie Sacramento and RobertoT:

library(tidyverse)

dat <- tibble(id = 1:4,
              col1 = c(0, 1, 1, 0),
              col2 = c(1, 0, 1, 0),
              col3 = c(1, 1, 0, 1))

dat %>%
  # Long format: one row per (id, column) pair, in columns `name` and `value`
  pivot_longer(-id) %>%
  # Keep only the cells that hold a 1
  filter(value != 0) %>%
  # Give every remaining 1 its own row key so pivot_wider() does not merge them
  mutate(rows = row_number()) %>%
  # Back to wide format; cells with no value are filled with 0
  pivot_wider(values_fill = 0,
              names_sort = TRUE) %>%
  select(-rows)

# A tibble: 7 × 4
     id  col1  col2  col3
  <int> <dbl> <dbl> <dbl>
1     1     0     1     0
2     1     0     0     1
3     2     1     0     0
4     2     0     0     1
5     3     1     0     0
6     3     0     1     0
7     4     0     0     1
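
As a cross-check on the reshaping idea, the same result can probably be reached in base R without the tidyverse. This is only a minimal sketch, not part of the original answer: it assumes the indicator columns contain only 0 and 1, and the names `m`, `idx`, `out`, and `res` are made up for illustration.

```r
dat <- data.frame(id = 1:4,
                  col1 = c(0, 1, 1, 0),
                  col2 = c(1, 0, 1, 0),
                  col3 = c(1, 1, 0, 1))

# Indicator columns as a numeric matrix (everything except id)
m <- as.matrix(dat[-1])

# (row, col) position of every 1, ordered by row and then by column
idx <- which(m == 1, arr.ind = TRUE)
idx <- idx[order(idx[, "row"], idx[, "col"]), , drop = FALSE]

# One output row per 1: all zeros except the column that held the 1
out <- matrix(0, nrow = nrow(idx), ncol = ncol(m),
              dimnames = list(NULL, colnames(m)))
out[cbind(seq_len(nrow(idx)), idx[, "col"])] <- 1

# Reattach the id of the source row for each output row
res <- data.frame(id = dat$id[idx[, "row"]], out)
res
```

The matrix indexing with `cbind(row, col)` sets exactly one cell per output row, which is what makes each row look one-hot encoded.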