I am working with NCBI Reference Sequence accession numbers, such as those in the variable a:
a <- c("NM_020506.1","NM_020519.1","NM_001030297.2","NM_010281.2","NM_011419.3", "NM_053155.2")
To retrieve information with the biomaRt package I need to remove the version suffix (.1, .2 and so on) that follows each accession number. I normally do this with the following code:
b <- sub("..*", "", a)
# [1] "" "" "" "" "" ""
But as you can see, this is not the right approach for this variable. Can anyone help me with this?
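The empty strings come from the pattern itself: in a regular expression an unescaped `.` matches any character, so `"..*"` matches the whole string from its first character onward. A minimal sketch of one possible fix, assuming the version suffix is always a dot followed by digits at the end of the accession, is to escape the dot:

```r
a <- c("NM_020506.1", "NM_020519.1", "NM_001030297.2",
       "NM_010281.2", "NM_011419.3", "NM_053155.2")

# "\\." matches a literal dot; "[0-9]+$" restricts the match to
# trailing digits, so only the version suffix is removed.
b <- sub("\\.[0-9]+$", "", a)
b
```

If the accessions can never contain another dot, `sub("\\..*", "", a)` gives the same result here.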
I have two data.frames, one with only characters and the other with characters and values.
df1 = data.frame(x=c('a', 'b', 'c', 'd', 'e'))
df2 = data.frame(x=c('a', 'b', 'c'),y = c(0,1,0))
merge(df1, df2)
x y
1 a 0
2 b 1
3 c 0
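I want to merge df1 and df2. The characters a, b and c merge fine and get the values 0, 1, 0, but d and e get nothing. I want d and e in the merged table as well, with the value 0. So for every row that is missing from the df2 data.frame, a 0 must be placed in the df1 table, like: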
x y
1 a 0
2 b 1
3 c 0
4 d 0
5 e 0
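A minimal sketch of one way to get this table, assuming a left join followed by an NA replacement is acceptable: `merge()` with `all.x = TRUE` keeps every row of df1, and the unmatched rows receive NA in y, which can then be set to 0. The name `merged` is only illustrative.

```r
df1 <- data.frame(x = c("a", "b", "c", "d", "e"))
df2 <- data.frame(x = c("a", "b", "c"), y = c(0, 1, 0))

# all.x = TRUE keeps rows of df1 that have no partner in df2 (y becomes NA)
merged <- merge(df1, df2, by = "x", all.x = TRUE)

# Replace the NAs introduced by the join with 0
merged$y[is.na(merged$y)] <- 0
merged
```

Note that this also turns any NA that was already present in df2$y into 0, which may or may not be wanted.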
In the example below I have two data sets (the matrices z and t). I want to merge or combine these sets by ILMN number. If there is no match, fill in NA.
z <- matrix(c(0,0,1,1,0,0,1,1,0,0,0,0,1,0,1,1,0,1,1,1,1,0,0,0,"RND1","WDR", "PLAC8","TYBSA","GRA","TAF"), nrow=6,
dimnames=list(c("ILMN_1651838","ILMN_1652371","ILMN_1652464","ILMN_1652952","ILMN_1653026","ILMN_1653103"),c("A","B","C","D","symbol")))
t<-matrix(c("GO:0002009", 8, 342, 1, 0.07, 0.679, 0, 0, 1, 0,
"GO:0030334", 6, 343, 1, 0.07, 0.065, 0, 0, 1, 0,
"GO:0015674", 7, 350, 1, 0.07, 0.065, 1, 0, 0, 0), nrow=10, dimnames= list(c("GO.ID","LEVEL","Annotated","Significant","Expected","resultFisher","ILMN_1652464","ILMN_1651838","ILMN_1711311","ILMN_1653026")))
The result would look like this:
[,1] [,2] [,3] [,4]
GO.ID "GO:0002009" "GO:0030334" "GO:0015674" NA
LEVEL "8" "6" "7" NA
Annotated "342" "343" "350" NA
Significant "1" "1" "1" NA
Expected "0.07" "0.07" "0.07" NA
resultFisher "0.679" "0.065" "0.065" NA
ILMN_1652464 "0" "0" "1" PLAC8 …
I use the topGO package in R to analyse gene enrichment, with the following code:
sampleGOdata <- new("topGOdata", description = "Simple session", ontology = "BP",
allGenes = geneList, geneSel = topDiffGenes, nodeSize = 10,
annot = annFUN.db, affyLib = affyLib)
resultFisher <- runTest(sampleGOdata, algorithm = "classic", statistic = "fisher")
allRes <- GenTable(sampleGOdata, classicFisher = resultFisher, orderBy = "fisher",
ranksOf = "classicFisher",topNodes = 10)
I would like to view and modify the runTest and GenTable functions in order to change the result table, but I don't know how to display the functions. With getAnywhere("GenTable") I do not get the actual source code I am after.
getAnywhere("GenTable")
A single object matching 'GenTable' was found
It was found in the following places
  package:topGO
  namespace:topGO
with value

function (object, ...)
standardGeneric("GenTable")
<environment: 0x16a30c10>
attr(,"generic")
[1] "GenTable"
attr(,"generic")attr(,"package")
[1] "topGO"
attr(,"package")
[1] "topGO"
attr(,"group") …
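The `standardGeneric("GenTable")` line in this output suggests that GenTable is an S4 generic, so printing it shows only the dispatcher, not the code that does the work. The sketch below illustrates the general S4 approach with `showMethods()` and `getMethod()` from the methods package. The class `demoData` and the generic `demoTable` are hypothetical stand-ins so that the example runs without topGO; with topGO loaded, the same two calls would presumably be applied to "GenTable", using whichever signature `showMethods()` reports.

```r
library(methods)

# Hypothetical stand-ins for a package's S4 class and generic
setClass("demoData", representation(scores = "numeric"))
setGeneric("demoTable", function(object, ...) standardGeneric("demoTable"))
setMethod("demoTable", "demoData", function(object, ...) {
  data.frame(rank = seq_along(object@scores), score = sort(object@scores))
})

# Printing the generic shows only the dispatcher, as in the output above
demoTable

# List the signatures that have methods defined for the generic
showMethods("demoTable")

# Retrieve the method for one signature; printing it shows the real code
m <- getMethod("demoTable", "demoData")
m

# The method still works as usual
res <- demoTable(new("demoData", scores = c(3, 1, 2)))
```

To change the behaviour, one option is to copy the printed method body into a function of your own and edit that copy, rather than modifying the package in place.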
I have a question about extracting part of a string. For example, I have a string like this:
a <- "DP=26;AN=2;DB=1;AC=1;MQ=56;MZ=0;ST=5:10,7:2;CQ=SYNONYMOUS_CODING;GN=NOC2L;PA=1^1:0.720&2^1:0"
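I need to extract everything between GN= and the following ;. Here that would be NOC2L.
Is that possible?
Note: this is the INFO column of the VCF file format. GN is the gene name, so we want to extract the gene name from the INFO column.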
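One possible approach, sketched here under the assumption that GN= appears at most once per INFO string, is a capture group with `sub()`: match the whole string, capture the characters after GN= up to the next semicolon, and keep only the captured part. The name `gene` is only illustrative.

```r
a <- "DP=26;AN=2;DB=1;AC=1;MQ=56;MZ=0;ST=5:10,7:2;CQ=SYNONYMOUS_CODING;GN=NOC2L;PA=1^1:0.720&2^1:0"

# "([^;]*)" captures everything after "GN=" up to (not including) the next ";"
# "\\1" replaces the whole match with just that captured text
gene <- sub(".*GN=([^;]*).*", "\\1", a)
gene
```

`sub()` is vectorised, so the same call works on a whole INFO column. If a string contains no GN= field, `sub()` returns it unchanged, so such rows may need to be checked separately, for example with `grepl("GN=", a)`.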
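I would like to know whether there is a way to calculate the distance between the abline and a data point in a plot. For example, what is the distance between the point with concentration == 40 and signal == 643 (element 5) and the line?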
concentration <- c(1,10,20,30,40,50)
signal <- c(4, 22, 44, 244, 643, 1102)
plot(concentration, signal)
res <- lm(signal ~ concentration)
abline(res)
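A minimal sketch of two possible readings of "distance", using the same data: the vertical distance from a point to the fitted line is its residual, while the perpendicular distance follows from the point-to-line formula |b1*x - y + b0| / sqrt(b1^2 + 1). The variable names are illustrative, and which of the two is wanted is an assumption left to the reader.

```r
concentration <- c(1, 10, 20, 30, 40, 50)
signal <- c(4, 22, 44, 244, 643, 1102)
res <- lm(signal ~ concentration)

# Vertical distance: observed value minus fitted value, i.e. the residual
vertical <- unname(residuals(res)[5])
same <- unname(signal[5] - fitted(res)[5])

# Perpendicular distance from (x, y) to the line y = b0 + b1 * x
b0 <- unname(coef(res)[1])
b1 <- unname(coef(res)[2])
perpendicular <- abs(b1 * concentration[5] - signal[5] + b0) / sqrt(b1^2 + 1)

c(vertical = vertical, perpendicular = perpendicular)
```

The perpendicular distance mixes the units of the two axes, so for regression diagnostics the residual is usually the more meaningful quantity.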
I have a data frame like this:
V1 V2 V3
1 1 3423086 3423685
2 1 3467184 3467723
3 1 4115236 4115672
4 1 5202437 5203057
5 2 7132558 7133089
6 2 7448688 7449283
I want to change column V1 and add chr in front of the number, like this:
V1 V2 V3
1 chr1 3423086 3423685
2 chr1 3467184 3467723
3 chr1 4115236 4115672
4 chr1 5202437 5203057
5 chr2 7132558 7133089
6 chr2 7448688 7449283
Is there a way to do this in R?
I have a question about counting zeros per row. I have a data frame like this:
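A minimal sketch of one way to do this with `paste0()`, which is vectorised over the column. The data frame name `df` is an assumption, since the question does not name it.

```r
# Rebuild the example data; "df" is an illustrative name
df <- data.frame(V1 = c(1, 1, 1, 1, 2, 2),
                 V2 = c(3423086, 3467184, 4115236, 5202437, 7132558, 7448688),
                 V3 = c(3423685, 3467723, 4115672, 5203057, 7133089, 7449283))

# Prepend "chr" to every value in V1; the column becomes character
df$V1 <- paste0("chr", df$V1)
df
```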
a = c(1,2,3,4,5,6,0,2,5)
b = c(0,0,0,2,6,7,0,0,0)
c = c(0,5,2,7,3,1,0,3,0)
d = c(1,2,6,3,8,4,0,4,0)
e = c(0,4,6,3,8,4,0,6,0)
f = c(0,2,5,5,8,4,2,7,4)
g = c(0,8,5,4,7,4,0,0,0)
h = c(1,3,6,7,4,2,0,4,2)
i = c(1,5,3,6,3,7,0,5,3)
j = c(1,5,2,6,4,6,8,4,2)
DF<- data.frame(a=a,b=b,c=c,d=d,e=e,f=f,g=g,h=h,i=i,j=j)
a b c d e f g h i j
1 1 0 0 1 0 0 0 1 1 1
2 2 0 5 2 4 2 8 3 5 5
3 3 0 2 6 6 5 5 6 3 2
4 4 2 7 …
I have a data frame like X, in which the column genes is a factor. I want to remove all rows where the genes column is empty. So in table X I want to remove row 4. Is there a way to do this for large data frames?
X
names values genes
1 A 0.2876113 EEF1A1
2 B 0.6681894 GAPDH
3 C 0.1375420 SLC35E2
4 D -1.9063386
5 E -0.4949905 RPS28
Final result:
X
names values genes
1 A 0.2876113 EEF1A1
2 B 0.6681894 GAPDH
3 C 0.1375420 SLC35E2
5 E -0.4949905 RPS28
Thank you all!
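A minimal sketch of one possible approach with logical indexing, assuming the empty cells are stored as the empty string "" (not as whitespace or NA only). The names `keep` and `X_clean` are illustrative.

```r
X <- data.frame(names  = c("A", "B", "C", "D", "E"),
                values = c(0.2876113, 0.6681894, 0.1375420, -1.9063386, -0.4949905),
                genes  = factor(c("EEF1A1", "GAPDH", "SLC35E2", "", "RPS28")))

# TRUE for rows whose genes entry is neither NA nor the empty string
keep <- !is.na(X$genes) & X$genes != ""

# Subset the rows, then drop the now-unused "" factor level
X_clean <- droplevels(X[keep, ])
X_clean
```

The comparison is vectorised, so it scales to large data frames. If the blank cells might contain spaces, `trimws(as.character(X$genes)) != ""` is a possible variation.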