Split one column into multiple variables with unique column names in R

Question

This is what I would like my data frame to look like:

record    color    size    height    weight
1         blue     large             heavy
1         red                        
2         green    small   tall      thin


However, the data (df) currently looks like this:

record    vars
1         color = "blue", size = "large"
2         color = "green", size = "small"
2         height = "tall", weight = "thin"
1         color = "red", weight = "heavy"


Code for df:

df <- structure(list(record = c(1L, 2L, 2L, 1L),
                     vars = structure(c(1L, 2L, 4L, 3L),
                                      .Label = c("color = \"blue\", size = \"large\"",
                                                 "color = \"green\", size = \"small\"",
                                                 "color = \"red\", weight = \"heavy\"",
                                                 "height = \"tall\", weight = \"thin\""),
                                      class = "factor")),
                class = "data.frame", row.names = c(NA, -4L))


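For each record, I want to split the vars column on the "," delimiter and create a new column for each variable name given in the string. If a particular variable has more than one value for a record, the record should be repeated.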

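I know that to do this with the tidyverse I need dplyr::group_by and tidyr::separate, but it is not clear to me how to get the new variable names into the "into" argument of separate. Do I need some kind of regular expression that identifies any text before the equals sign "=" as a new variable name for "into"? Any suggestions are very welcome!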

df %>%
  group_by(record) %>%
  separate(col = vars, into = c(regex expression?? / character vector?), sep = ",")


Answer 1

Ice*_*can 6

Since the column is already written almost as R code defining a list, you can parse and evaluate each entry and then use unnest_wider:

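library(tidyverse)

df %>% 
  # wrap each string in list(...) and evaluate it, giving one named list per row;
  # parse_expr comes from rlang, which library(tidyverse) does not attach
  mutate(vars = map(as.character(vars), ~ eval(rlang::parse_expr(paste('list(', .x, ')'))))) %>% 
  # spread each named list into one column per name
  unnest_wider(vars)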

#   record color size  height weight
#    <int> <chr> <chr> <chr>  <chr> 
# 1      1 blue  large NA     NA    
# 2      2 green small NA     NA    
# 3      2 NA    NA    tall   thin  
# 4      1 red   NA    NA     heavy 

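If evaluating strings from the data as code is a concern, the same information can be reached with plain string splitting. The following is only a sketch, not part of the answer above. It assumes that no value contains a comma or an equals sign, and the names long, idx and result are made up for illustration. It also goes one step further and numbers repeated values within a record, so that a record is repeated only when a variable has more than one value, which is the layout the question asks for.

```r
library(dplyr)
library(tidyr)

# Reconstruction of the question's data (vars as character instead of factor)
df <- data.frame(
  record = c(1L, 2L, 2L, 1L),
  vars = c(
    'color = "blue", size = "large"',
    'color = "green", size = "small"',
    'height = "tall", weight = "thin"',
    'color = "red", weight = "heavy"'
  ),
  stringsAsFactors = FALSE
)

# Long format: one row per name = "value" pair
long <- df %>%
  mutate(vars = as.character(vars)) %>%            # no-op here; matters if vars is a factor
  separate_rows(vars, sep = ",\\s*") %>%           # split the pairs on commas
  separate(vars, into = c("name", "value"), sep = "\\s*=\\s*") %>%
  mutate(value = gsub('"', "", value, fixed = TRUE))  # drop the literal quotes

# Wide format: idx counts the 1st, 2nd, ... value of a variable within a record,
# so a second color for record 1 lands on a second row for that record
result <- long %>%
  group_by(record, name) %>%
  mutate(idx = row_number()) %>%
  ungroup() %>%
  pivot_wider(id_cols = c(record, idx), names_from = name, values_from = value) %>%
  arrange(record, idx) %>%
  select(-idx)

print(result)
```

Using a plain row number as the identifier instead of record and idx (hypothetically, mutate(row = row_number()) before separate_rows) should keep one output row per input row, like the unnest_wider result.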
