Sha*_*ath 1 pivot r dataframe melt tidyr
Following up on my previous post: I now have one more column, ID, which I also need to use when converting rows to columns.
NUM <- c(1,2,3,1,2,3,1,2,3,1)
ID <- c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48")
Type <- c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D")
Points <- c(9.2,60.8,22.9,1012.7,18.7,11.1,67.2,63.1,16.7,58.4)
df1 <- data.frame(ID,NUM,Type,Points)
df1:
+------+-----+------+--------+
| ID | NUM | Type | Points |
+------+-----+------+--------+
| DJ45 | 1 | A | 9.2 |
| DJ45 | 2 | F | 60.8 |
| DJ45 | 3 | C | 22.9 |
| DJ46 | 1 | B | 1012.7 |
| DJ46 | 2 | D | 18.7 |
| DJ46 | 3 | A | 11.1 |
| DJ47 | 1 | E | 67.2 |
| DJ47 | 2 | C | 63.1 |
| DJ47 | 3 | F | 16.7 |
| DJ48 | 1 | D | 58.4 |
+------+-----+------+--------+
The output I want is:
+------+-----+------+--------+------+------+------+------+
| ID | NUM | A | B | C | D | E | F |
+------+-----+------+--------+------+------+------+------+
| DJ45 | 1 | 9.2 | N/A | N/A | N/A | N/A | N/A |
| DJ45 | 2 | N/A | N/A | N/A | N/A | N/A | 60.8 |
| DJ45 | 3 | N/A | N/A | 22.9 | N/A | N/A | N/A |
| DJ46 | 1 | N/A | 1012.7 | N/A | N/A | N/A | N/A |
| DJ46 | 2 | N/A | N/A | N/A | 18.7 | N/A | N/A |
| DJ46 | 3 | 11.1 | N/A | N/A | N/A | N/A | N/A |
| DJ47 | 1 | N/A | N/A | N/A | N/A | 67.2 | N/A |
| DJ47 | 2 | N/A | N/A | 63.1 | N/A | N/A | N/A |
| DJ47 | 3 | N/A | N/A | N/A | N/A | N/A | 16.7 |
| DJ48 | 1 | N/A | N/A | N/A | 58.4 | N/A | N/A |
+------+-----+------+--------+------+------+------+------+
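I used the spread function in R but got an error about duplicate identifiers. I assume this is because I now have two identifying columns (ID and NUM) instead of the single column (NUM) I had before. Please let me know how I can do this.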
Without knowing what you tried, I would suggest:
library(tidyr)
spread(df1, Type, Points)
# ID NUM A B C D E F
# 1 DJ45 1 9.2 NA NA NA NA NA
# 2 DJ45 2 NA NA NA NA NA 60.8
# 3 DJ45 3 NA NA 22.9 NA NA NA
# 4 DJ46 1 NA 1012.7 NA NA NA NA
# 5 DJ46 2 NA NA NA 18.7 NA NA
# 6 DJ46 3 11.1 NA NA NA NA NA
# 7 DJ47 1 NA NA NA NA 67.2 NA
# 8 DJ47 2 NA NA 63.1 NA NA NA
# 9 DJ47 3 NA NA NA NA NA 16.7
# 10 DJ48 1 NA NA NA 58.4 NA NA
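In current tidyr, spread() is superseded by pivot_wider(). As a minimal sketch, assuming tidyr 1.1.0 or later (needed for the names_sort argument), the same reshaping of the sample data could be written like this; the name `wide` is only illustrative:

```r
library(tidyr)

# Same sample data as in the question
df1 <- data.frame(
  ID     = c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48"),
  NUM    = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1),
  Type   = c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D"),
  Points = c(9.2, 60.8, 22.9, 1012.7, 18.7, 11.1, 67.2, 63.1, 16.7, 58.4)
)

# One new column per Type, filled with Points; ID and NUM identify the rows.
# names_sort = TRUE orders the new columns alphabetically, as spread() does.
wide <- pivot_wider(df1, names_from = Type, values_from = Points,
                    names_sort = TRUE)
print(wide)
```

Note that pivot_wider() behaves differently from spread() when identifiers are duplicated: instead of stopping with an error, it issues a warning and returns list-columns.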
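If you get an error about duplicate identifiers, it is because, in your actual data, some combination of "ID" and "NUM" appears more than once for the same "Type" (which is not the case in your sample data).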
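To check whether this applies to your data, you could count the rows per ID/NUM/Type combination and keep those that occur more than once. The sketch below is only an illustration: `dat` is a hypothetical stand-in for your actual data, built here by duplicating the first three rows of the sample data.

```r
library(dplyr)

# Sample data from the question
df1 <- data.frame(
  ID     = c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48"),
  NUM    = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1),
  Type   = c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D"),
  Points = c(9.2, 60.8, 22.9, 1012.7, 18.7, 11.1, 67.2, 63.1, 16.7, 58.4)
)

# Hypothetical "actual" data containing duplicated identifiers
dat <- rbind(df1, df1[1:3, ])

# Count rows per identifier combination; anything with n > 1 would
# make spread() fail with the duplicate-identifiers error
dupes <- dat %>%
  count(ID, NUM, Type) %>%
  filter(n > 1)
print(dupes)
```

If `dupes` has zero rows, the identifiers are unique and spread() should work without an extra column.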
If that is the case, you need to add another column to make the rows unique.
Adding dplyr to the chain, it might look like this:
library(dplyr)
library(tidyr)

df1 %>%
  group_by(ID, NUM) %>%
  mutate(id2 = sequence(n())) %>%  # running index within each ID/NUM group
  spread(Type, Points)
A demonstration of the error I am assuming:
df2 <- rbind(df1, df1[1:3, ]) ## Duplicate the first three rows
spread(df2, Type, Points)
# Error: Duplicate identifiers for rows (1, 11), (3, 13), (2, 12)
library(dplyr)
df2 %>%
  group_by(ID, NUM) %>%
  mutate(id2 = sequence(n())) %>%
  spread(Type, Points)
# Source: local data frame [13 x 9]
#
# ID NUM id2 A B C D E F
# 1 DJ45 1 1 9.2 NA NA NA NA NA
# 2 DJ45 1 2 9.2 NA NA NA NA NA
# 3 DJ45 2 1 NA NA NA NA NA 60.8
# 4 DJ45 2 2 NA NA NA NA NA 60.8
# 5 DJ45 3 1 NA NA 22.9 NA NA NA
# 6 DJ45 3 2 NA NA 22.9 NA NA NA
# 7 DJ46 1 1 NA 1012.7 NA NA NA NA
# 8 DJ46 2 1 NA NA NA 18.7 NA NA
# 9 DJ46 3 1 11.1 NA NA NA NA NA
# 10 DJ47 1 1 NA NA NA NA 67.2 NA
# 11 DJ47 2 1 NA NA 63.1 NA NA NA
# 12 DJ47 3 1 NA NA NA NA NA 16.7
# 13 DJ48 1 1 NA NA NA 58.4 NA NA
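The result above is still grouped and still carries the helper column id2. If you only want the columns from the desired output, one possible follow-up, sketched here on the same duplicated data, is to ungroup and drop id2 after spreading; the name `wide` is only illustrative:

```r
library(dplyr)
library(tidyr)

# Sample data from the question, with the first three rows duplicated
df1 <- data.frame(
  ID     = c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48"),
  NUM    = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1),
  Type   = c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D"),
  Points = c(9.2, 60.8, 22.9, 1012.7, 18.7, 11.1, 67.2, 63.1, 16.7, 58.4)
)
df2 <- rbind(df1, df1[1:3, ])

wide <- df2 %>%
  group_by(ID, NUM) %>%
  mutate(id2 = sequence(n())) %>%  # helper index that makes each row unique
  ungroup() %>%                    # remove the grouping before reshaping
  spread(Type, Points) %>%
  select(-id2)                     # drop the helper once the data is wide
print(wide)
```

The duplicated ID/NUM combinations remain as separate rows; only the helper column is removed.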