My data currently looks like this, where the Number_Code column is assigned based on each distinct Side_Effect:
Session_ID Side_Effect Number_Code
1 anxious 1
1 dizzy 2
1 relaxed 3
3 dizzy 2
7 nauseous 4
7 anxious 1
I know I can do:
mutate(rn = str_c('side_effect_', row_number())) %>%
pivot_wider(names_from = rn, values_from = Side_Effect)
This creates new column names and puts each side effect into its own column, like this:
session Number_Code side_effect_1 side_effect_2 side_effect_3
1 1 anxious NA NA
1 2 NA dizzy NA
1 3 NA NA relaxed
3 2 dizzy NA NA
7 4 nauseous NA NA
7 1 NA anxious NA
But I need to widen the data on both Side_Effect and Number_Code, with the two sets of columns alternating like this:
session side_effect_1 number_code_1 side_effect_2 number_code_2 side_effect_3 number_code_3
1 anxious 1 dizzy 2 relaxed 3
3 dizzy 2 NA NA NA NA
7 nauseous 4 anxious 1 NA NA
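I saw another post where the data was widened on two variables, but all of the columns for the second variable came after all of the columns for the first. Is there a way to make them alternate like this? Thanks!!
pivot_wider can take multiple values_from columns, so after creating a per-group sequence, call pivot_wider with values_from naming the columns of interest: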
library(dplyr)
library(tidyr)
df1 %>%
group_by(Session_ID) %>%
mutate(rn = row_number()) %>%
ungroup %>%
pivot_wider(names_from = rn, values_from = c(Side_Effect, Number_Code))
# A tibble: 3 x 7
# Session_ID Side_Effect_1 Side_Effect_2 Side_Effect_3 Number_Code_1 Number_Code_2 Number_Code_3
# <int> <chr> <chr> <chr> <int> <int> <int>
#1 1 anxious dizzy relaxed 1 2 3
#2 3 dizzy <NA> <NA> 2 NA NA
#3 7 nauseous anxious <NA> 4 1 NA
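The per-group sequence is what makes the widening work, so it may help to look at it on its own. The following is a minimal sketch that rebuilds the same data as df1 with data.frame() and inspects the rn column; the name with_rn is only illustrative.

```r
library(dplyr)

# Same values as the df1 used in this answer
df1 <- data.frame(
  Session_ID  = c(1L, 1L, 1L, 3L, 7L, 7L),
  Side_Effect = c("anxious", "dizzy", "relaxed", "dizzy", "nauseous", "anxious"),
  Number_Code = c(1L, 2L, 3L, 2L, 4L, 1L)
)

# rn restarts at 1 for every Session_ID, so it can serve as the column suffix
with_rn <- df1 %>%
  group_by(Session_ID) %>%
  mutate(rn = row_number()) %>%
  ungroup()

with_rn$rn
# [1] 1 2 3 1 1 2
```

Without the group_by step, row_number() would count 1 to 6 across the whole table and every row would end up in its own column.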
If we need to rearrange the column order, we can select the columns after ordering them by the numeric part of their names:
df1 %>%
group_by(Session_ID) %>%
mutate(rn = row_number()) %>%
ungroup %>%
pivot_wider(names_from = rn, values_from = c(Side_Effect, Number_Code)) %>%
select(Session_ID, names(.)[-1][order(readr::parse_number(names(.)[-1]))] )
# A tibble: 3 x 7
# Session_ID Side_Effect_1 Number_Code_1 Side_Effect_2 Number_Code_2 Side_Effect_3 Number_Code_3
# <int> <chr> <int> <chr> <int> <chr> <int>
#1 1 anxious 1 dizzy 2 relaxed 3
#2 3 dizzy 2 <NA> NA <NA> NA
#3 7 nauseous 4 anxious 1 <NA> NA
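To see why ordering on the numeric part produces the alternating layout, here is a small sketch that works on the column names alone. It uses base R's sub() as a stand-in for readr::parse_number() (an assumption made to keep the example dependency-free), and the names nms, idx and reordered are illustrative.

```r
# Column names in the block order that pivot_wider returns by default
nms <- c("Side_Effect_1", "Side_Effect_2", "Side_Effect_3",
         "Number_Code_1", "Number_Code_2", "Number_Code_3")

# Numeric suffix of each name (base R stand-in for readr::parse_number)
idx <- as.integer(sub(".*_", "", nms))

# order() leaves ties in their original order, so within each suffix
# Side_Effect_k stays ahead of Number_Code_k
reordered <- nms[order(idx)]
reordered
# [1] "Side_Effect_1" "Number_Code_1" "Side_Effect_2" "Number_Code_2"
# [5] "Side_Effect_3" "Number_Code_3"
```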
df1 <- structure(list(Session_ID = c(1L, 1L, 1L, 3L, 7L, 7L),
Side_Effect = c("anxious",
"dizzy", "relaxed", "dizzy", "nauseous", "anxious"), Number_Code = c(1L,
2L, 3L, 2L, 4L, 1L)), class = "data.frame", row.names = c(NA,
-6L))
As of tidyr 1.2.0, this is easy to do with the names_vary argument, where "slowest" gives the alternating order and "fastest" (the default) gives the block order.
# Alternating

dat |>
  mutate(rn = row_number(), .by = Session_ID) |>
  pivot_wider(
    names_from = rn,
    values_from = c(Side_Effect, Number_Code),
    names_vary = "slowest"
  )

# A tibble: 3 × 7
#   Session_ID Side_Effect_1 Number_Code_1 Side_Effect_2 Number_Code_2 Side_Effect_3 Number_Code_3
#        <dbl> <chr>                 <dbl> <chr>                 <dbl> <chr>                 <dbl>
# 1          1 anxious                   1 dizzy                     2 relaxed                   3
# 2          3 dizzy                     2 NA                       NA NA                       NA
# 3          7 nauseous                  4 anxious                   1 NA                       NA

# Block

dat |>
  mutate(rn = row_number(), .by = Session_ID) |>
  pivot_wider(
    names_from = rn,
    values_from = c(Side_Effect, Number_Code),
    names_vary = "fastest"
  )

# A tibble: 3 × 7
#   Session_ID Side_Effect_1 Side_Effect_2 Side_Effect_3 Number_Code_1 Number_Code_2 Number_Code_3
#        <dbl> <chr>         <chr>         <chr>                 <dbl>         <dbl>         <dbl>
# 1          1 anxious       dizzy         relaxed                   1             2             3
# 2          3 dizzy         NA            NA                        2            NA            NA
# 3          7 nauseous      anxious       NA                        4             1            NA

The order of first appearance follows the order of the columns given in values_from, so if Number_Code needs to appear before Side_Effect, the argument should be values_from = c(Number_Code, Side_Effect).
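As a quick check of that last point, the sketch below swaps the two columns in values_from and looks only at the resulting column names. It assumes tidyr 1.2.0 or later, reuses the values of df1 from the earlier answer rather than dat, and uses group_by instead of .by so that it does not depend on a recent dplyr; the name wide is illustrative.

```r
library(dplyr)
library(tidyr)

# Same values as the df1 defined in the earlier answer
df1 <- data.frame(
  Session_ID  = c(1L, 1L, 1L, 3L, 7L, 7L),
  Side_Effect = c("anxious", "dizzy", "relaxed", "dizzy", "nauseous", "anxious"),
  Number_Code = c(1L, 2L, 3L, 2L, 4L, 1L)
)

wide <- df1 %>%
  group_by(Session_ID) %>%
  mutate(rn = row_number()) %>%
  ungroup() %>%
  pivot_wider(
    names_from  = rn,
    values_from = c(Number_Code, Side_Effect),  # Number_Code listed first
    names_vary  = "slowest"                     # alternate within each rn
  )

names(wide)
# [1] "Session_ID"    "Number_Code_1" "Side_Effect_1" "Number_Code_2"
# [5] "Side_Effect_2" "Number_Code_3" "Side_Effect_3"
```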