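I have several character vectors, one per gene, each containing the names of the species in which that gene occurs. I made an UpSetR plot showing the number of species shared between genes. Now I want to do the reverse: plot the number of genes shared between species, but I don't know how to do it.
My example: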
gene1 <- c("Panda", "Dog", "Chicken")
gene2 <- c("Human", "Panda", "Dog")
gene3 <- c("Human", "Panda", "Chicken")
...#About 20+ genes with 100+ species each
An example of the result I would like to get:
Panda <- c("gene1", "gene2", "gene3")
Dog <- c("gene1", "gene2")
Human <- c("gene2", "gene3")
Chicken <- c("gene1", "gene3")
...
I know it is conceptually easy, but the logistics are more complicated. Can anyone give me a clue?
Thanks!
You can use unstack from base R:
unstack(stack(mget(ls(pattern="gene"))),ind~values)
$Chicken
[1] "gene1" "gene3"
$Dog
[1] "gene1" "gene2"
$Human
[1] "gene2" "gene3"
$Panda
[1] "gene1" "gene2" "gene3"
Finally, you can use the list2env function to add the elements of this list to the environment as separate objects.
Breakdown:
l <- mget(ls(pattern = "gene"))       # collect every object whose name contains "gene" into a named list
m <- unstack(stack(l), ind ~ values)  # stack into long form, then regroup the gene names by species
m
$Chicken
[1] "gene1" "gene3"
$Dog
[1] "gene1" "gene2"
$Human
[1] "gene2" "gene3"
$Panda
[1] "gene1" "gene2" "gene3"
list2env(m,.GlobalEnv)
Dog
[1] "gene1" "gene2"
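As a variant of the same idea, here is a minimal sketch that keeps the genes in a named list instead of collecting them from the workspace with ls(). The names genes, long and by_species are made up for illustration, and split() is swapped in for unstack() because unstack() returns a data frame rather than a list when every species happens to have the same number of genes.

```r
# Hypothetical input: the three example genes stored in one named list
genes <- list(
  gene1 = c("Panda", "Dog", "Chicken"),
  gene2 = c("Human", "Panda", "Dog"),
  gene3 = c("Human", "Panda", "Chicken")
)

# stack() builds a long data frame with columns `values` (species) and `ind` (gene)
long <- stack(genes)

# split() groups the gene names by species and always returns a list
by_species <- split(as.character(long$ind), long$values)

# Number of genes found in each species
lengths(by_species)
```

If the goal is the reverse UpSetR plot, a list shaped like by_species should be usable in the same way as the original gene list, for example through UpSetR's fromList().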