I have the following data:
> data <- data.frame(unique=1:9, grouping=rep(c('a', 'b', 'c'), each=3), value=sample(1:30, 9))
> data
  unique grouping value
1      1        a    15
2      2        a    21
3      3        a    26
4      4        b     8
5      5        b     6
6      6        b     4
7      7        c    17
8      8        c     1
9      9        c     3
I want to create a table that looks like this:
   a  b  c
1 15  8 17
2 21  6  1
3 26  4  3
I used tidyr::spread, but it did not give the right result:
> data %>% spread(grouping, value)
  unique  a  b  c
1      1 15 NA NA
2      2 21 NA NA
3      3 26 NA NA
4      4 NA  8 NA
5      5 NA  6 NA
6      6 NA  4 NA
7      7 NA NA 17
8      8 NA NA  1
9      9 NA NA  3
Or:
> data %>% select(grouping, value) %>% spread(grouping, value)
Error: Duplicate identifiers for rows (1, 2, 3), (4, 5, 6), (7, 8, 9)
Is there a way to do this when one group (c) has a different length from the others?
akr*_*run 10
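We need to create a sequence column to avoid the "Duplicate identifiers for rows" error. spread needs the remaining columns to identify each output row uniquely, and a row number within each group does that.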
library(tidyr)
library(dplyr)
data %>%
  group_by(grouping) %>%
  mutate(id = row_number()) %>%
  select(-unique) %>%
  spread(grouping, value) %>%
  select(-id)
#      a     b     c
#  (int) (int) (int)
#1    15     8    17
#2    21     6     1
#3    26     4     3
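For the unequal-length case raised in the question, here is a minimal sketch of the same idea using pivot_wider, which newer tidyr releases (1.0.0 and later) provide as the successor to spread. The data below are hypothetical, not the question's: group c is deliberately given only two rows, and the values are fixed rather than drawn with sample() so the result is reproducible.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: group "c" has two rows, "a" and "b" have three.
data <- data.frame(
  unique   = 1:8,
  grouping = c("a", "a", "a", "b", "b", "b", "c", "c"),
  value    = c(15, 21, 26, 8, 6, 4, 17, 1)
)

wide <- data %>%
  group_by(grouping) %>%
  mutate(id = row_number()) %>%  # position of each row within its group
  ungroup() %>%
  select(-unique) %>%            # drop the column that keeps every row distinct
  pivot_wider(names_from = grouping, values_from = value) %>%
  select(-id)                    # the helper column is no longer needed

wide
```

Because the rows are matched on the within-group id, the shorter group should simply be padded with NA in the positions it lacks, so differing group lengths do not need special handling.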