Converting rows to columns in R

Nic*_*123 5 r dplyr data.table

I have this sample dataset, which I would like to convert into a different format:

Type <- c("AGE", "AGE", "REGION", "REGION", "REGION", "DRIVERS", "DRIVERS")
Level <- c("18-25", "26-70", "London", "Southampton", "Newcastle", "1", "2")
Estimate <- c(1.5,1,2,3,1,2,2.5)

df_before <- data.frame(Type, Level, Estimate)


     Type       Level Estimate
1     AGE       18-25      1.5
2     AGE       26-70      1.0
3  REGION      London      2.0
4  REGION Southampton      3.0
5  REGION   Newcastle      1.0
6 DRIVERS           1      2.0
7 DRIVERS           2      2.5

Basically, I want to convert the dataset into the following format. I have tried the dcast() function, but it does not seem to work.

    AGE Estimate_AGE      REGION Estimate_REGION DRIVERS Estimate_DRIVERS
1 18-25          1.5      London               2       1              2.0
2 26-70          1.0 Southampton               3       2              2.5
3  <NA>           NA   Newcastle               1    <NA>               NA

Ony*_*mbu 6

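library(dplyr)
library(tidyr)

df_before %>%
  group_by(Type) %>%
  # number the rows within each Type; make Estimate character so it can share a column with Level
  mutate(id = row_number(), Estimate = as.character(Estimate)) %>%
  pivot_longer(-c(Type, id)) %>%
  # id_cols is named explicitly; recent tidyr versions do not accept it positionally
  pivot_wider(id_cols = id, names_from = c(Type, name)) %>%
  type.convert(as.is = TRUE)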

# A tibble: 3 x 7
     id AGE_Level AGE_Estimate REGION_Level REGION_Estimate DRIVERS_Level DRIVERS_Estimate
  <int> <chr>            <dbl> <chr>                  <int>         <int>            <dbl>
1     1 18-25              1.5 London                     2             1              2  
2     2 26-70              1   Southampton                3             2              2.5
3     3 NA                NA   Newcastle                  1            NA             NA  

With data.table:

library(data.table)
setDT(df_before)

dcast(melt(df_before, 'Type'), rowid(Type, variable)~Type + variable)

Note that you will get warnings because of the type mismatch between Level and Estimate. You can use reshape2::melt to avoid this.
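As an alternative way to sidestep the type-mismatch warning, here is a minimal sketch that stays within data.table: it converts Estimate to character before melting, so both measured columns already share a type. The names dt, long, wide and the explicit id column are illustrative choices, not part of the answer above.

```r
library(data.table)

# Rebuild the question's data as a data.table (dt is an illustrative name)
dt <- data.table(
  Type     = c("AGE", "AGE", "REGION", "REGION", "REGION", "DRIVERS", "DRIVERS"),
  Level    = c("18-25", "26-70", "London", "Southampton", "Newcastle", "1", "2"),
  Estimate = c(1.5, 1, 2, 3, 1, 2, 2.5)
)

# Make both measured columns character so melt() has nothing to coerce
dt[, Estimate := as.character(Estimate)]

long <- melt(dt, id.vars = "Type")

# Row index within each Type/variable pair, stored as an ordinary column
long[, id := rowid(Type, variable)]

# One row per id, one column per Type/variable combination
wide <- dcast(long, id ~ Type + variable, value.var = "value")
```

All value columns in wide stay character here; convert the Estimate columns back to numeric afterwards if needed.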

In any case, your data frame is not in a standard format.

In base R >= 4.1 (the version that introduced the native pipe |>):

transform(df_before, id = ave(Estimate, Type, FUN = seq_along)) |>
  reshape(v.names = c('Level', 'Estimate'), dir = 'wide', timevar = 'Type', sep = "_")

 id Level_AGE Estimate_AGE Level_REGION Estimate_REGION Level_DRIVERS Estimate_DRIVERS
1  1     18-25          1.5       London               2             1              2.0
2  2     26-70          1.0  Southampton               3             2              2.5
5  3      <NA>           NA    Newcastle               1          <NA>               NA

In base R < 4.1:

reshape(transform(df_before, id = ave(Estimate, Type, FUN = seq_along)),
       v.names = c('Level', 'Estimate'), dir = 'wide', timevar = 'Type', sep = "_")
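The reshape() output above names the columns Level_AGE, Estimate_AGE and so on, while the question asks for AGE, Estimate_AGE, and so on, with no id column. A minimal base R sketch of that last step, assuming the same sample data; the name wide, the integer id, and the renaming via sub() are illustrative additions, not part of the original answer:

```r
Type <- c("AGE", "AGE", "REGION", "REGION", "REGION", "DRIVERS", "DRIVERS")
Level <- c("18-25", "26-70", "London", "Southampton", "Newcastle", "1", "2")
Estimate <- c(1.5, 1, 2, 3, 1, 2, 2.5)
df_before <- data.frame(Type, Level, Estimate, stringsAsFactors = FALSE)

# Integer row index within each Type
df_before$id <- ave(seq_along(df_before$Type), df_before$Type, FUN = seq_along)

# Same reshape as above, with direction and idvar spelled out
wide <- reshape(df_before, v.names = c("Level", "Estimate"), direction = "wide",
                timevar = "Type", idvar = "id", sep = "_")

# Strip the "Level_" prefix so those columns are named after the Type itself
names(wide) <- sub("^Level_", "", names(wide))

# Drop the helper id column and reset the row names
wide$id <- NULL
rownames(wide) <- NULL
```

After these steps, names(wide) follows the column order requested in the question.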