Data frame from a character vector where variable names and their data are stored together

Question

I have a situation like this:

foo <- data.frame("vars" = c("animal: mouse | wks: 12 | site: cage | PI: 78",
                            "animal: dog | wks: 32 | GI: 0.2",
                            "animal: cat | wks: 8 | site: wild | PI: 13"))

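Here the variable names and their associated data are stored together in a single string, as in the example above. Specifically, each variable_name/its_data unit is separated from the next by "|", and within a unit the data follows the ":". Not every string contains every variable.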

I would like to end up with a data frame like this:

  animal  wks  site  PI   GI
  mouse   12   cage  78   NA
    dog   32   <NA>  NA  0.2
    cat    8   wild  13   NA

Answer 1

akr*_*run 7

We can use read.dcf from base R:

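# Turn each " | " separator into a newline so that every row becomes a DCF
# record of "field: value" lines; a blank line separates one record from the next.
dcf_text <- paste(gsub("\\s+\\|\\s+", "\n", foo$vars), collapse = "\n\n")
out <- type.convert(as.data.frame(read.dcf(textConnection(dcf_text))),
                    as.is = TRUE)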

Output:

> out
  animal wks site PI  GI
1  mouse  12 cage 78  NA
2    dog  32 <NA> NA 0.2
3    cat   8 wild 13  NA
> str(out)
'data.frame':   3 obs. of  5 variables:
 $ animal: chr  "mouse" "dog" "cat"
 $ wks   : int  12 32 8
 $ site  : chr  "cage" NA "wild"
 $ PI    : int  78 NA 13
 $ GI    : num  NA 0.2 NA
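
To make the read.dcf approach easier to follow, here is a small sketch that breaks it into steps and prints the intermediate text. It uses only base R and the example data from the question; the names records, dcf_text, con and m are introduced here for illustration and are not part of the answer.

```r
# Same example data as in the question.
foo <- data.frame(vars = c("animal: mouse | wks: 12 | site: cage | PI: 78",
                           "animal: dog | wks: 32 | GI: 0.2",
                           "animal: cat | wks: 8 | site: wild | PI: 13"))

# Step 1: each " | " becomes a newline, so one input row turns into
# one block of "field: value" lines.
records <- gsub("\\s+\\|\\s+", "\n", foo$vars)

# Step 2: a blank line between blocks marks the record boundary.
dcf_text <- paste(records, collapse = "\n\n")
cat(dcf_text, "\n")

# Step 3: read.dcf() returns a character matrix with one row per record
# and NA wherever a record lacks a field.
con <- textConnection(dcf_text)
m <- read.dcf(con)
close(con)

dim(m)       # 3 rows, 5 columns
colnames(m)  # "animal" "wks" "site" "PI" "GI"
```

Everything in m is still character at this point, which is why the answer wraps the result in type.convert(..., as.is = TRUE).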

Answer 2

Tar*_*Jae 5

Here is a solution using dplyr and tidyr (with stringr for trimming whitespace):

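library(dplyr)
library(tidyr)
library(stringr)  # provides str_trim()

tibble(foo) %>%
  mutate(row = row_number()) %>%                    # remember the source row
  separate_rows(vars, sep = "\\|") %>%              # one "name: value" pair per row
  separate(vars, c("a", "b"), sep = ":") %>%        # split name from value
  mutate(across(c(a, b), str_trim)) %>%             # drop surrounding whitespace
  pivot_wider(names_from = a, values_from = b) %>%  # one column per variable name
  type.convert(as.is = TRUE) %>%                    # infer int/num/chr column types
  select(-row)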

  animal   wks site     PI    GI
  <chr>  <int> <chr> <int> <dbl>
1 mouse     12 cage     78  NA  
2 dog       32 NA       NA   0.2
3 cat        8 wild     13  NA
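
To see what the splitting steps do to a single input string, the following base R sketch mirrors them without any packages. It is an illustration under the assumption that neither names nor values contain "|" or ":"; the names x, pairs, kv and long are hypothetical and chosen only for this example.

```r
x <- "animal: dog | wks: 32 | GI: 0.2"

# Split on the literal "|" to get one "name: value" piece per element.
pairs <- strsplit(x, "|", fixed = TRUE)[[1]]

# Split each piece on ":" and trim the whitespace around name and value.
kv   <- strsplit(pairs, ":", fixed = TRUE)
keys <- trimws(vapply(kv, `[`, character(1), 1))
vals <- trimws(vapply(kv, `[`, character(1), 2))

# Long format: one row per name/value pair, as before the pivot to wide.
long <- data.frame(a = keys, b = vals, stringsAsFactors = FALSE)
long
```

Pivoting this long table to wide format, with a row identifier per input string, is what fills in NA for variables a string does not mention.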
