Pivot all columns wider (except the ID columns) using pivot_wider() in R/tidyverse

itp*_*sen 2 r data-manipulation dataframe tidyr tidyverse

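I'm trying to reshape data from long to wide format in R. I want to widen all columns (except the columns that uniquely identify observations) using pivot_wider(). Here is a minimal working example: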

library("tidyr")

set.seed(12345)

sampleSize <- 10
timepoints <- 3
raters <- 2

data_long <- data.frame(ID = rep(1:sampleSize, each = timepoints * raters),
                        time = rep(1:timepoints, times = sampleSize * raters),
                        rater = rep(c("a","b"), times = sampleSize * timepoints),
                        v1 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        v2 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        v3 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        v100 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vA = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vB = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vC = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vZZ = sample.int(99, sampleSize * timepoints * raters, replace = TRUE))

Here is the data:

> tibble(data_long)
# A tibble: 60 x 11
      ID  time rater    v1    v2    v3  v100    vA    vB    vC   vZZ
   <int> <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
 1     1     1 a        14    56    30    75    66    22     8    73
 2     1     1 b        90    44    99     8    36    72     1    78
 3     1     2 a        92    35    93    46     4    68    39    52
 4     1     2 b        51    91    50    67    43    72    99    74
 5     1     3 a        80    34    31    31    21    52     7    23
 6     1     3 b        24    86    25    86    20    43    74    89
 7     2     1 a        58    51    48    60     6    56    66    37
 8     2     1 b        96    95    76     1    78     2    65     3
 9     2     2 a        88    26    92    86     7    37    84    15
10     2     2 b        93    55    25    62    27    39    73    85
# ... with 50 more rows

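In this example, three columns uniquely identify every observation: ID, time, and rater. I'd like to widen every other column by rater (i.e., excluding the ID and time columns). My expected output is: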

# A tibble: 30 x 18
      ID  time  v1_a  v1_b  v2_a  v2_b  v3_a  v3_b v100_a v100_b  vA_a  vA_b  vB_a  vB_b  vC_a  vC_b vZZ_a vZZ_b
   <int> <int> <int> <int> <int> <int> <int> <int>  <int>  <int> <int> <int> <int> <int> <int> <int> <int> <int>
 1     1     1    14    90    56    44    30    99     75      8    66    36    22    72     8     1    73    78
 2     1     2    92    51    35    91    93    50     46     67     4    43    68    72    39    99    52    74
 3     1     3    80    24    34    86    31    25     31     86    21    20    52    43     7    74    23    89
 4     2     1    58    96    51    95    48    76     60      1     6    78    56     2    66    65    37     3
 5     2     2    88    93    26    55    92    25     86     62     7    27    37    39    84    73    15    85
 6     2     3    75     2    23    55    28     8     66     74    65    92    58    10    91    65     7    44
 7     3     1    86    94     7    87    78    85     38     87    36    49    89    83    33    34    32    38
 8     3     2    10    75    12    15    21    18     56     77    54    17    61    92    18    50    98    27
 9     3     3    38    81    46    90    20    47     88     15    33    95    66    19    12    27    84    52
10     4     1    32    38    88    68    77    71     10     81    21    54    33    16    90    41    29    72
# ... with 20 more rows

I can widen any given columns using the following syntax:

data_long %>% 
  pivot_wider(names_from = rater, values_from = c(v1, v2))

Thus, I could widen all columns by manually typing every one of them into a vector:

data_long %>% 
  pivot_wider(names_from = rater, values_from = c(v1, v2, v3, v100, vA, vB, vC, vZZ))

However, this becomes unwieldy if I have many columns. Another approach is to widen columns by specifying a range of columns:

data_long %>% 
  pivot_wider(names_from = rater, values_from = v1:vZZ)

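However, this approach doesn't work well if the columns to widen are not all in one contiguous range, for example if the ID columns are interspersed throughout the data frame (though multiple ranges can be specified).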
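For illustration, here is a minimal sketch of combining several ranges. It assumes a hypothetical toy data frame (`toy`, not the data above) in which the `time` column sits between the value columns:

```r
library(tidyr)

# Hypothetical toy data: the identifying column `time` sits between value columns
toy <- data.frame(
  ID    = rep(1:2, each = 4),
  rater = rep(c("a", "b"), times = 4),
  v1    = 1:8,
  v2    = 11:18,
  time  = rep(rep(1:2, each = 2), times = 2),
  v3    = 21:28,
  v4    = 31:38
)

# Combine two column ranges with c() so that `time` is skipped
wide_ranges <- pivot_wider(
  toy,
  names_from  = rater,
  values_from = c(v1:v2, v3:v4)
)

names(wide_ranges)
```

This runs, but every additional gap in the column order needs another range, which is what makes it tedious for wide data frames.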

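Is there a way to use pivot_wider() to widen all columns except those I specify via id_cols (i.e., ID and time) as uniquely identifying each observation? I'd like the solution to scale to cases where I have many columns (and thus don't want to specify the variable names or ranges of variables to widen).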

akr*_*run 6

Since we know the first 3 columns should stay fixed, use - on those column names in values_from

library(dplyr)
library(tidyr)
data_long %>% 
   pivot_wider(names_from = rater, values_from = -names(.)[1:3])
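To see what this negative selection does on a small input, here is a sketch using a hypothetical toy data frame (`toy` is an assumption, not part of the question) whose first three columns are ID, time and rater:

```r
library(tidyr)

# Hypothetical toy data: ID, time and rater are the first three columns
toy <- data.frame(
  ID    = rep(1:2, each = 4),
  time  = rep(rep(1:2, each = 2), times = 2),
  rater = rep(c("a", "b"), times = 4),
  v1    = 1:8,
  v2    = 11:18
)

# names(.)[1:3] is c("ID", "time", "rater"); the minus sign selects everything else
toy_wide <- toy %>%
  pivot_wider(names_from = rater, values_from = -names(.)[1:3])

names(toy_wide)
```

Because the selection is positional, it relies on the identifying columns really being the first three columns of the data frame.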

Or, if we have already created an object

id_cols <- c("ID", "time")
data_long %>%
    # also exclude the names_from column (rater) so it is not pivoted as a value
    pivot_wider(names_from = rater, values_from = -all_of(c(id_cols, "rater")))
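The name-based variant does not depend on column order, so it should also cover the case where the ID columns are interspersed among the value columns. A minimal sketch, assuming a hypothetical helper `widen_all_except()` and a hypothetical toy data frame (neither is part of tidyr or of the question):

```r
library(tidyr)

# Hypothetical helper: widen every column except the ID columns and the
# column that supplies the new name suffixes
widen_all_except <- function(data, id_cols, names_from) {
  pivot_wider(
    data,
    id_cols     = all_of(id_cols),
    names_from  = all_of(names_from),
    values_from = -all_of(c(id_cols, names_from))
  )
}

# Hypothetical toy data with the identifying columns scattered between values
toy_mixed <- data.frame(
  v1    = 1:8,
  ID    = rep(1:2, each = 4),
  rater = rep(c("a", "b"), times = 4),
  v2    = 11:18,
  time  = rep(rep(1:2, each = 2), times = 2)
)

wide_mixed <- widen_all_except(toy_mixed, id_cols = c("ID", "time"), names_from = "rater")

names(wide_mixed)
```

Passing the column names as character vectors through all_of() keeps the selection explicit, and an error is raised if one of the named columns is missing from the data.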