I have an R data frame as follows:
df <- data.frame(
FDR = c (0.009, 0.007, 0.007),
Probe_ID = c("1555272_at", "1557203_at", "1557384_at"),
Gene.Symbol = c("RSPH10B2///RSPH10B","PABPC1L2B///PABPC1L2A","LOC100506639///ZNF131"),
Gene.ID = c("728194///222967","645974///340529","100506639///7690"))
df
FDR Probe_ID Gene.Symbol Gene.ID
1 0.009 1555272_at RSPH10B2///RSPH10B 728194///222967
2 0.007 1557203_at PABPC1L2B///PABPC1L2A 645974///340529
3 0.007 1557384_at LOC100506639///ZNF131 100506639///7690
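I want to split the rows of the data frame on the pattern /// found in the values of the df$Gene.Symbol column (the Gene.ID column uses the same separator). The resulting data frame should look like this: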
FDR Probe_ID Gene.symbol Gene.ID
0.009 15111_at RSPH10B2 728194
0.009 15111_at RSPH10B 222967
0.007 15222_at PABPC1L2B 645974
0.007 15222_at PABPC1L2A 340529
0.007 15333_at LOC100506639 100506639
0.007 15333_at ZNF131 7690
I tried the following code, but it does not work and produces columns with duplicated elements:
s <- strsplit(gsub("///","",df$Gene.symbol),", ",fixed = TRUE)
res <- data.frame(Id = rep(df$Gene.symbol, lengths(s)), result = unlist(s))
ans <- merge(annotated,res)
Thanks in advance!
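A solution using tidyr (loaded alongside dplyr, since separate_rows() is a tidyr function):
library(dplyr)
library(tidyr)
# Split both columns on "///" in parallel, producing one row per symbol/ID pair
df %>%
  separate_rows(Gene.Symbol, Gene.ID, sep = "///")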
# A tibble: 6 x 4
FDR Probe_ID Gene.Symbol Gene.ID
<dbl> <chr> <chr> <chr>
1 0.009 1555272_at RSPH10B2 728194
2 0.009 1555272_at RSPH10B 222967
3 0.007 1557203_at PABPC1L2B 645974
4 0.007 1557203_at PABPC1L2A 340529
5 0.007 1557384_at LOC100506639 100506639
6 0.007 1557384_at ZNF131 7690
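For comparison, here is a minimal base R sketch of the same reshaping, in case adding a package dependency is not an option. It assumes that every row has the same number of ///-separated pieces in Gene.Symbol and Gene.ID; the names symbols, ids, row_idx, and long_df are made up for illustration. It also hints at why the attempt in the question likely fails: that code removes the /// separator with gsub() before splitting, splits on ", " (which never occurs in the data), and refers to df$Gene.symbol, whereas the column is named Gene.Symbol (R names are case-sensitive).

```r
# Example data copied from the question
df <- data.frame(
  FDR = c(0.009, 0.007, 0.007),
  Probe_ID = c("1555272_at", "1557203_at", "1557384_at"),
  Gene.Symbol = c("RSPH10B2///RSPH10B", "PABPC1L2B///PABPC1L2A", "LOC100506639///ZNF131"),
  Gene.ID = c("728194///222967", "645974///340529", "100506639///7690"),
  stringsAsFactors = FALSE
)

# Split each column on the literal separator (fixed = TRUE disables regex matching)
symbols <- strsplit(df$Gene.Symbol, "///", fixed = TRUE)
ids <- strsplit(df$Gene.ID, "///", fixed = TRUE)

# Assumption: both columns split into the same number of pieces in every row
stopifnot(identical(lengths(symbols), lengths(ids)))

# Repeat each original row index once per piece, then rebuild the data frame
row_idx <- rep(seq_len(nrow(df)), lengths(symbols))
long_df <- data.frame(
  FDR = df$FDR[row_idx],
  Probe_ID = df$Probe_ID[row_idx],
  Gene.Symbol = unlist(symbols),
  Gene.ID = unlist(ids),
  stringsAsFactors = FALSE
)
long_df
```

The stopifnot() check is there so that a row with mismatched piece counts stops with an error instead of silently pairing a symbol with the wrong ID.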