itp*_*sen 2 r data-manipulation dataframe tidyr tidyverse
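I'm trying to go from long to wide format in R. I'd like to use pivot_wider() to widen all columns (except the columns that uniquely identify the observations). Here is a minimal working example: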
library("tidyr")

set.seed(12345)

sampleSize <- 10
timepoints <- 3
raters <- 2

data_long <- data.frame(ID = rep(1:sampleSize, each = timepoints * raters),
                        # each time point appears once per rater within an ID, matching the printed data
                        time = rep(1:timepoints, each = raters, times = sampleSize),
                        rater = rep(c("a","b"), times = sampleSize * timepoints),
                        v1 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        v2 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        v3 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        v100 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vA = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vB = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vC = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vZZ = sample.int(99, sampleSize * timepoints * raters, replace = TRUE))

Here are the data:

> tibble(data_long)
# A tibble: 60 x 11
      ID  time rater    v1    v2    v3  v100    vA    vB    vC   vZZ
   <int> <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
 1     1     1 a        14    56    30    75    66    22     8    73
 2     1     1 b        90    44    99     8    36    72     1    78
 3     1     2 a        92    35    93    46     4    68    39    52
 4     1     2 b        51    91    50    67    43    72    99    74
 5     1     3 a        80    34    31    31    21    52     7    23
 6     1     3 b        24    86    25    86    20    43    74    89
 7     2     1 a        58    51    48    60     6    56    66    37
 8     2     1 b        96    95    76     1    78     2    65     3
 9     2     2 a        88    26    92    86     7    37    84    15
10     2     2 b        93    55    25    62    27    39    73    85
# ... with 50 more rows

In this example, three columns together uniquely identify every observation: ID, time, and rater. I'd like to widen every other column by rater (i.e., leaving the ID and time columns as they are). My expected output is:
# A tibble: 30 x 18
      ID  time  v1_a  v1_b  v2_a  v2_b  v3_a  v3_b v100_a v100_b  vA_a  vA_b  vB_a  vB_b  vC_a  vC_b vZZ_a vZZ_b
   <int> <int> <int> <int> <int> <int> <int> <int>  <int>  <int> <int> <int> <int> <int> <int> <int> <int> <int>
 1     1     1    14    90    56    44    30    99     75      8    66    36    22    72     8     1    73    78
 2     1     2    92    51    35    91    93    50     46     67     4    43    68    72    39    99    52    74
 3     1     3    80    24    34    86    31    25     31     86    21    20    52    43     7    74    23    89
 4     2     1    58    96    51    95    48    76     60      1     6    78    56     2    66    65    37     3
 5     2     2    88    93    26    55    92    25     86     62     7    27    37    39    84    73    15    85
 6     2     3    75     2    23    55    28     8     66     74    65    92    58    10    91    65     7    44
 7     3     1    86    94     7    87    78    85     38     87    36    49    89    83    33    34    32    38
 8     3     2    10    75    12    15    21    18     56     77    54    17    61    92    18    50    98    27
 9     3     3    38    81    46    90    20    47     88     15    33    95    66    19    12    27    84    52
10     4     1    32    38    88    68    77    71     10     81    21    54    33    16    90    41    29    72
# ... with 20 more rows

I can widen any given columns using the following syntax:

data_long %>%
  pivot_wider(names_from = rater, values_from = c(v1, v2))

Thus, I could widen all columns by manually typing every column name into a vector:

data_long %>%
  pivot_wider(names_from = rater, values_from = c(v1, v2, v3, v100, vA, vB, vC, vZZ))

However, this becomes unwieldy if I have many columns. Another approach is to widen columns by specifying a range of columns:
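
data_long %>%
  pivot_wider(names_from = rater, values_from = v1:vZZ)

However, this approach does not work well if the columns to be widened are not all in one range, for example if the ID columns are interspersed throughout the data frame (although it is possible to specify multiple ranges).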
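
Is there a way to use pivot_wider() to widen all columns except for whichever columns I specify with id_cols as uniquely identifying each observation (i.e., ID and time)? I'd like the solution to scale to cases where I have many columns (and thus do not want to specify the variable names or ranges of variables to widen).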
Since we know that the first 3 columns should stay fixed, use - on those column names in values_from:
library(dplyr)
library(tidyr)
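# Negate the names of the first three columns (ID, time, rater) so that every other column is widened
data_long %>%
  pivot_wider(names_from = rater, values_from = -names(.)[1:3])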
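To see which columns a negated selection like this picks up, the same kind of expression can be tried with dplyr::select(), which follows the same tidyselect rules that values_from uses. This is only a small sketch: toy_df, fixed_cols and value_cols are made-up names for illustration and are not part of the question's data.

```r
library(dplyr)

# Hypothetical toy data with the same layout: three fixed columns, then values
toy_df <- data.frame(
  ID    = c(1, 1, 2, 2),
  time  = c(1, 1, 1, 1),
  rater = c("a", "b", "a", "b"),
  v1    = c(10, 20, 30, 40),
  vZZ   = c(5, 6, 7, 8)
)

# Names of the first three columns, which should not be widened
fixed_cols <- names(toy_df)[1:3]

# Negating the selection keeps every column except the fixed ones
value_cols <- toy_df %>%
  select(-all_of(fixed_cols)) %>%
  names()

value_cols
```

Here value_cols should contain only the value columns (v1 and vZZ), which are the columns that would be passed on to values_from.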
Or, if we have already created an object holding the ID column names:
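id_cols <- c("ID", "time")
# Exclude the ID columns and rater (already used in names_from), so rater itself is not widened
data_long %>%
  pivot_wider(names_from = rater, values_from = -all_of(c(id_cols, "rater")))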
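A possible variation, sketched here on a small made-up data frame rather than taken from the answer, is to pass the identifier columns to the id_cols argument explicitly and negate both them and the names column in values_from. The names toy_long, key_cols and toy_wide are assumptions for illustration only.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: 2 IDs x 2 time points x 2 raters, two value columns
toy_long <- data.frame(
  ID    = rep(1:2, each = 4),
  time  = rep(1:2, each = 2, times = 2),
  rater = rep(c("a", "b"), times = 4),
  v1    = 1:8,
  vA    = 11:18
)

# Columns that identify a row of the wide result
key_cols <- c("ID", "time")

toy_wide <- toy_long %>%
  pivot_wider(
    id_cols     = all_of(key_cols),
    names_from  = rater,
    # widen everything except the identifiers and the names column
    values_from = -all_of(c(key_cols, "rater"))
  )

toy_wide
```

On this toy input the result should have one row per ID and time combination (4 rows), with the columns ID, time, v1_a, v1_b, vA_a and vA_b. Stating id_cols explicitly is a design choice: it keeps the intent visible in the call instead of relying on the default of "every column not used elsewhere".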