Adding a column to a table using data from another table

use*_*857 11 r dataframe

I have a table as follows:

Table1 <- data.frame(
    "Random" = c("A", "B", "C"), 
    "Genes" = c("Apple", "Candy", "Toothpaste"), 
    "Extra" = c("Up", "", "Down"), 
    "Desc" = c("Healthy,Red,Fruit", "Sweet,Cavities,Sugar,Fruity", "Minty,Dentist")
)

giving:

  Random      Genes Extra                       Desc
1      A      Apple    Up          Healthy,Red,Fruit
2      B      Candy       Sweet,Cavities,Sugar,Fruity
3      C Toothpaste  Down              Minty,Dentist

I have another table that contains the descriptions, and I would like to add a Genes column to it. For example, Table2 would be:

Table2 <- data.frame(
    "Col1" = c(1, 2, 3, 4, 5, 6), 
    "Desc" = c("Sweet", "Sugar", "Dentist", "Red", "Fruit", "Fruity")
)

giving:

  Col1    Desc
1    1   Sweet
2    2   Sugar
3    3 Dentist
4    4     Red
5    5   Fruit
6    6  Fruity

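I would like to add another column to Table2 called "Genes". It should match each "Desc" value in Table2 against the comma-separated "Desc" values in Table1 and pull in the corresponding Genes from Table1, to get: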

  Col1    Desc    Gene
1    1   Sweet    Candy
2    2   Sugar    Candy
3    3 Dentist    Toothpaste
4    4     Red    Apple
5    5   Fruit    Apple
6    6  Fruity    Candy

akr*_*run 8

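You could try cSplit from splitstackshape to split the 'Desc' column in 'Table1' and convert the dataset from 'wide' to 'long' format. The output will be a data.table. We can then use data.table methods: set the key column to 'Desc' (setkey), join with 'Table2', and finally drop the unwanted columns from the output, either by selecting the columns to keep or by assigning the unneeded columns to NULL with :=.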

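library(splitstackshape)
library(data.table)  # for setkey(); may not be attached by splitstackshape alone
# cSplit(..., 'long') yields one row per Desc term; Table2[2:1] puts Desc first
# so that it is the column joined against the key.
setkey(cSplit(Table1, 'Desc', ',', 'long'),Desc)[Table2[2:1]][
                   ,c(5,4,2), with=FALSE]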
#  Col1    Desc      Genes
#1:    1   Sweet      Candy
#2:    2   Sugar      Candy
#3:    3 Dentist Toothpaste
#4:    4     Red      Apple
#5:    5   Fruit      Apple
#6:    6  Fruity      Candy
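The one-liner in this answer packs three steps together: reshape Table1 to long format, join on Desc, and keep only the wanted columns. As a rough illustration of those steps taken one at a time, here is a base R sketch of the same split-then-join idea. Note that it deliberately swaps cSplit and the data.table join for strsplit, stack and merge so that it runs without extra packages, so it is not the answer's own method. The names pieces, long and joined are made up for this sketch.

```r
# Same sample data as in the question.
Table1 <- data.frame(
    Random = c("A", "B", "C"),
    Genes  = c("Apple", "Candy", "Toothpaste"),
    Extra  = c("Up", "", "Down"),
    Desc   = c("Healthy,Red,Fruit", "Sweet,Cavities,Sugar,Fruity", "Minty,Dentist"),
    stringsAsFactors = FALSE
)
Table2 <- data.frame(
    Col1 = c(1, 2, 3, 4, 5, 6),
    Desc = c("Sweet", "Sugar", "Dentist", "Red", "Fruit", "Fruity"),
    stringsAsFactors = FALSE
)

# Step 1: split each comma-separated Desc and stack the pieces into long
# format, one row per (Desc, Genes) pair. (Assumed helper names.)
pieces <- setNames(strsplit(Table1$Desc, ",", fixed = TRUE), Table1$Genes)
long <- stack(pieces)                 # columns: values, ind
names(long) <- c("Desc", "Genes")
long$Genes <- as.character(long$Genes)

# Step 2: left join, so every Table2 row is kept even without a match.
joined <- merge(Table2, long, by = "Desc", all.x = TRUE)

# Step 3: restore Table2's row order and keep only the wanted columns.
joined <- joined[order(joined$Col1), c("Col1", "Desc", "Genes")]
rownames(joined) <- NULL
joined
```

Because the join is on exact Desc values, "Fruit" and "Fruity" are treated as different terms.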


Jth*_*rpe 5

Here is an approach in base R that uses an intermediate linking table:

# create an intermediate data.frame with all the key (Desc) / value (Gene) pairs
df  <-  NULL
for(i in seq(nrow(Table1)))
    df  <-  rbind(df,
                  data.frame(Gene =Table1$Genes[i],
                            Desc =strsplit(as.character(Table1$Desc)[i],',')[[1]]))
df 
#>         Gene     Desc
#> 1      Apple  Healthy
#> 2      Apple      Red
#> 3      Apple    Fruit
#> 4      Candy    Sweet
#> 5      Candy Cavities
#> 6      Candy    Sugar
#> 7      Candy   Fruity
#> 8 Toothpaste    Minty
#> 9 Toothpaste  Dentist

Now link to it in the usual way:

Table2$Gene  <-  df$Gene[match(Table2$Desc,df$Desc)]
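The loop in this answer grows df with rbind on every iteration, which is fine for three rows but can get slow on larger tables. A possible loop-free sketch builds the same linking table in one step with strsplit, lengths and rep, then uses the same match lookup. It assumes R 3.2.0 or later (for lengths), and the name parts is introduced here for illustration only.

```r
# Same sample data as in the question (only the columns used here).
Table1 <- data.frame(
    Genes = c("Apple", "Candy", "Toothpaste"),
    Desc  = c("Healthy,Red,Fruit", "Sweet,Cavities,Sugar,Fruity", "Minty,Dentist"),
    stringsAsFactors = FALSE
)
Table2 <- data.frame(
    Col1 = c(1, 2, 3, 4, 5, 6),
    Desc = c("Sweet", "Sugar", "Dentist", "Red", "Fruit", "Fruity"),
    stringsAsFactors = FALSE
)

# Split every Desc string once; parts is a list with one character vector per row.
parts <- strsplit(Table1$Desc, ",", fixed = TRUE)

# Repeat each gene once per term it owns, and flatten the terms alongside it.
df <- data.frame(
    Gene = rep(Table1$Genes, lengths(parts)),
    Desc = unlist(parts),
    stringsAsFactors = FALSE
)

# Same lookup as in the answer: position of each Table2$Desc within df$Desc.
Table2$Gene <- df$Gene[match(Table2$Desc, df$Desc)]
Table2
```

match returns the position of the first hit and NA when there is none, so a Table2 term that is absent from Table1 ends up with NA in Gene, and a term listed under several genes gets the first one.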