How do I convert a list of vectors of unequal length into a binary data frame?

mon*_*nic 2 r dplyr

How can I convert a list of vectors of unequal length into a binary data frame? My data set looks like this:

> gene_annot
[[1]]
 [1] "lipid binding"           "catalytic activity"      "hydrolase activity"      "lipid metabolic process"
 [5] "cytosol"                 "organelle"               "mitochondrion"           "signaling"              
 [9] "extracellular region"    "extracellular space"    

[[2]]
[1] "extracellular region" "extracellular space"  "organelle"           

[[3]]
[1] "extracellular region" "extracellular space" 

[[4]]
logical(0)

[[5]]
[1] "organelle"                          "nucleus"                            "nucleoplasm"                       
[4] "immune system process"              "defense response to other organism" "protein folding" 

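I want to create one column per label, where each cell holds a binary variable indicating whether that label occurs in the corresponding list element (row). How can I do this in R? For example, I expect a data frame like this: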

> gene_annot_binary
lipid binding extracellular region
            1                    1
            0                    1
            0                    1
            0                    0
            0                    0

Ony*_*mbu 5

Transpose the following:

table(stack(setNames(gene_annot, seq_along(gene_annot))))
                                    ind
values                               1 2 3 4 5
  catalytic activity                 1 0 0 0 0
  cytosol                            1 0 0 0 0
  defense response to other organism 0 0 0 0 1
  extracellular region               1 1 1 0 0
  extracellular space                1 1 1 0 0
  hydrolase activity                 1 0 0 0 0
  immune system process              0 0 0 0 1
  lipid binding                      1 0 0 0 0
  lipid metabolic process            1 0 0 0 0
  mitochondrion                      1 0 0 0 0
  nucleoplasm                        0 0 0 0 1
  nucleus                            0 0 0 0 1
  organelle                          1 1 0 0 1
  protein folding                    0 0 0 0 1
  signaling                          1 0 0 0 0

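Applying t(.) to the result above gives you what you need. I did not transpose it here because the transposed table is too wide to display.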
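As a minimal sketch of that transposition, the snippet below uses a shortened, made-up `gene_annot` (not the asker's full data) so that the result fits on screen. The names `tab` and `gene_annot_binary` are chosen here for illustration only.

```r
# Shortened sample list (an assumption for illustration); element 4 is empty,
# like in the question.
gene_annot <- list(
  c("lipid binding", "extracellular region", "organelle"),
  c("extracellular region", "organelle"),
  "extracellular region",
  logical(0),
  c("organelle", "nucleus")
)

# Term-by-element contingency table, built as in the answer.
tab <- table(stack(setNames(gene_annot, seq_along(gene_annot))))

# t() swaps the dimensions: one row per list element, one column per term.
gene_annot_binary <- t(tab)

# Look at two of the columns requested in the question.
gene_annot_binary[, c("lipid binding", "extracellular region")]
```

Note that table() counts occurrences, so a term repeated within a single list element would produce a value greater than 1 rather than a strict 0/1 flag.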

Edit:

If you need a data frame:

as.data.frame.matrix(table(stack(setNames(gene_annot, seq_along(gene_annot)))[2:1]))

  catalytic activity cytosol defense response to other organism extracellular region extracellular space
1                  1       1                                  0                    1                   1
2                  0       0                                  0                    1                   1
3                  0       0                                  0                    1                   1
4                  0       0                                  0                    0                   0
5                  0       0                                  1                    0                   0
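For comparison, here is a hedged alternative sketch that is not part of the answer above: it builds the indicator columns directly with %in%, which guarantees strict 0/1 values even if a term is repeated within one element. The sample list and the names `terms`, `mat` and `gene_annot_binary` are assumptions made for illustration.

```r
# Shortened sample list (an assumption for illustration); element 4 is empty.
gene_annot <- list(
  c("lipid binding", "extracellular region", "organelle"),
  c("extracellular region", "organelle"),
  "extracellular region",
  logical(0),
  c("organelle", "nucleus")
)

# All distinct terms, sorted so the column order is predictable.
terms <- sort(unique(unlist(gene_annot)))

# One 0/1 row per list element: 1 if the term occurs in that element.
mat <- do.call(rbind, lapply(gene_annot, function(x) as.integer(terms %in% x)))
colnames(mat) <- terms

# Convert the integer matrix to a data frame.
gene_annot_binary <- as.data.frame(mat)
gene_annot_binary
```

The stack()/table() one-liner is shorter; this version simply makes the membership test explicit.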