I created a table with the following Sweave code in R:
<<fig=FALSE,results=tex ,echo=FALSE>>=
require('xtable')
res.table<-xtable(myfamDF, caption = 'Paired t-test of most common TF families', caption.placement="top", display = c('f','s','e','e','e','d','f','f','f'), table.placement="")
print(res.table, scalebox=0.7)
@
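As you can see, I pass caption.placement = 'top', but it has no effect: the caption always ends up below the table. What is going wrong? I also tried putting the argument before the caption, but that did not work either.
My table data: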
structure(list(X = c("ETS", "FH", "HLH", "HMG", "Homeo", "Homeo ",
"Homeo, POU", "IRF", "unknown", "Zn2Cys6", "ZnF_C2H2", "ZnF_C4"
), MASHvstRap = c(7.57756175712832e-05, 2.16501764489381e-05,
1.28838720843028e-05, 7.61948145586808e-26, 2.60621688448055e-53,
5.65846675050138e-11, 2.8351421540276e-06, 2.16501764489381e-05,
3.2934292268274e-24, 2.82352692734938e-05, 6.64390583188061e-16,
1.0825088224469e-05), MASHvsBEEML = c(0.000205676676264912, 0.00519604234774513,
0.00724381285695056, 0.864846903741683, 5.63594927321681e-06,
0.212004750633662, 0.519032309279987, 0.0114962436943861, 0.0364539615325715,
0.00226912148014415, 0.00150554384087195, 0.165493948775683),
tRapvsBEEML = c(1.0825088224469e-05, 1.0825088224469e-05,
5.2304674730304e-05, 3.24328889627148e-13, 8.6852178695266e-46,
7.60650709869649e-06, 2.8351421540276e-06, 1.0825088224469e-05, …

I have a plot made with ggplot2. When I try to change the size of the text points in the plot, the text size does not change. I am using the following code:
ggplot(data = out, aes(x = V2, y = V1)) +
****geom_text(data = out[!is.na(out$V1),], aes(label = labels, alpha=0.3, size=0.1))**** +
facet_grid(id1 ~ id2,scales="fixed")+
geom_text(data=df.text,aes(pos,pos,label=id1)) + geom_abline( slope=1 ) +
ggtitle("Corralation between measured & calculated affinities") +
ylab("") + xlab("") + theme(panel.grid.minor.x=element_blank(), panel.grid.major.x=element_blank())
}
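I put ** at the start and end of the line of interest. I know that size is the right parameter to change, so why does my text not change when I set, for example, size = 0.01?

I have a very slow for loop that also does not work correctly. It takes a barcode from one data.frame and then searches for that barcode in another data.frame. A barcode can occur multiple times in the second data.frame. Each time the barcode is found, a counter should register the occurrence, and the resulting count should be written to the first data frame.
My attempt: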
for(i in 1:length(tcgadataUniek$Tumor_Sample_Barcode)){
for(j in 1:length(hprdDataSorted$Samples.Int1)){
count<-0
if(i==j){
count<-count+1
} else {
count<-count+0
}
hprdDataSorted$Samples.Int2<-count[j]
}
}
The first data.frame looks like this (CSV):
HUGO.Int1,HUGO.Int2,barcode.Int1
A1CF,APOBEC1,TCGA-B6-A0RS-01A-11D-A099-09
A1CF,TNPO2,TCGA-B6-A0RS-01A-11D-A099-09
A1CF,SYNCRIP,TCGA-B6-A0RS-01A-11D-A099-09
A1CF,KHSRP,TCGA-B6-A0RS-01A-11D-A099-09
A2M,SHBG,TCGA-D8-A1JK-01A-11D-A13L-09
A2M,C11orf58,TCGA-D8-A1JK-01A-11D-A13L-09
A2M,ATF7IP,TCGA-D8-A1JK-01A-11D-A13L-09
AAMP,TH1L,TCGA-A8-A08S-01A-11W-A050-09
AARS,EEF1B2,TCGA-AO-A0JC-01A-11W-A071-09
The second data.frame (CSV), which contains duplicated barcodes:
Sample_Barcode
TCGA-A8-A08G-01A-11W-A019-09
TCGA-AO-A03O-01A-11W-A019-09
TCGA-AO-A03O-01A-11W-A019-09
TCGA-B6-A0RS-01A-11D-A099-09
TCGA-BH-A0HP-01A-12D-A099-09
TCGA-BH-A0HP-01A-12D-A099-09
TCGA-BH-A18H-01A-11D-A12B-09
TCGA-BH-A18H-01A-11D-A12B-09
TCGA-BH-A18J-01A-11D-A12B-09
TCGA-D8-A1JK-01A-11D-A13L-09
TCGA-E2-A1BC-01A-11D-A14G-09
TCGA-E2-A1BC-01A-11D-A14G-09
TCGA-E9-A1NH-01A-11D-A14G-09
TCGA-E9-A22B-01A-11D-A159-09
If a barcode from barcode.Int1 (data frame 1) occurs 3 times in Sample_Barcode, the script should add a 3 next to the barcode it was looking up from barcode.Int1. For example:
HUGO.Int1,HUGO.Int2,barcode.Int1, number_of_times
A1CF,APOBEC1,TCGA-B6-A0RS-01A-11D-A099-09,5
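One possible way to avoid the nested loop is sketched below; it is not taken from the original thread. It assumes the barcodes are plain character columns, uses a shortened copy of the sample data above, and tabulates the second data frame once with `table()` before looking the counts up with `match()`. Barcodes that never occur in the second data frame get a count of 0.

```r
# Shortened toy copies of the two data frames shown above (barcode columns only).
df1 <- data.frame(
  barcode.Int1 = c("TCGA-B6-A0RS-01A-11D-A099-09",
                   "TCGA-D8-A1JK-01A-11D-A13L-09",
                   "TCGA-A8-A08S-01A-11W-A050-09"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Sample_Barcode = c("TCGA-AO-A03O-01A-11W-A019-09",
                     "TCGA-AO-A03O-01A-11W-A019-09",
                     "TCGA-B6-A0RS-01A-11D-A099-09",
                     "TCGA-D8-A1JK-01A-11D-A13L-09"),
  stringsAsFactors = FALSE
)

# Count every distinct barcode of the second data frame exactly once.
counts <- table(as.character(df2$Sample_Barcode))

# Look up the count for each barcode of the first data frame.
idx <- match(as.character(df1$barcode.Int1), names(counts))
n <- as.integer(counts)[idx]
n[is.na(n)] <- 0L  # barcode absent from the second data frame

df1$number_of_times <- n
print(df1)
```

With this toy data the first two barcodes each occur once and the third does not occur at all. Because the tabulation happens once rather than once per row, this should scale better than comparing every pair of rows.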

Suppose I have a very large data.frame containing scores for each column.
For example:
MA0001.1 AGL3  MA0003.1 TFAP2A  MA0004.1 Arnt  MA0005.1 AG   MA0006.1 Arnt::Ahr
7.789524e-09   0.4012127249     3.771518e-03   1.892011e-06  0.002733200
5.032498e-07   0.0001873801     9.947449e-05   3.284222e-05  0.001367041
1.194487e-06   0.0009357406     6.943634e-05   1.589373e-05  0.002551519
4.833494e-06   0.0150703600     1.003488e-04   1.197928e-03  0.001431416
6.865040e-05   0.0000732607     3.857193e-04   5.388744e-03  0.001363706
R data.frame:
testfr<-structure(list(`MA0001.1 AGL3` = c(7.78952366977488e-09, 5.03249791215203e-07,
1.19448739380034e-06, 4.83349413748598e-06, 6.86504034402563e-05
), `MA0003.1 TFAP2A` = c(0.401212724871542, 0.000187380067026448,
0.000935740631438077, 0.0150703600158589, 7.32607018758816e-05
), `MA0004.1 Arnt` = c(0.00377151826447817, 9.94744903768433e-05,
6.94363387424972e-05, 0.000100348764966112, 0.00038571926458373
), `MA0005.1 AG` = c(1.89201084302835e-06, 3.2842217133538e-05,
1.58937284554136e-05, 0.00119792816070882, 0.00538874414923338
), `MA0006.1 Arnt::Ahr` = c(0.00273319966783363, 0.00136704060025893, …

I have a list of lists in which some elements are NA, i.e. empty. I want to extract all the lists that contain data and remove all the empty (NA) ones.
The code I am trying is:
lapply(outputfile,function(x){
if(outputfile != NA){
test<-lapply(outputfile,unlist)
}})
But this does not work.
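A comparison such as `x != NA` always evaluates to NA in R, so it cannot be used as a condition; `is.na()` is the usual test. The sketch below is one possible approach rather than the thread's answer. It assumes the empty slots are single atomic NA values, and the helper name `is_na_placeholder` and the toy list are made up for illustration.

```r
# Toy list: NA placeholders mixed with nested lists that hold data.
outputfile <- list(NA, NA, list(c(5, 5, 5)), NA, list(c(1, 2)))

# TRUE only for a single atomic NA, so nested lists are never tested with is.na().
is_na_placeholder <- function(x) {
  is.atomic(x) && length(x) == 1L && is.na(x)
}

# Keep the elements that are not NA placeholders.
filled <- Filter(Negate(is_na_placeholder), outputfile)

# Flatten each remaining nested list into a plain vector.
flat <- lapply(filled, unlist)
str(flat)
```

Checking `is.atomic()` and the length first keeps `&&` from ever receiving a vector longer than one.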
The list of lists looks like this (a small example with random data):
list(NA, NA, NA, NA, NA, NA, list(c(5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, …

Suppose I have a small list of matrices and want to extract the accession of each matrix. Is there a good way to do this other than looping over the attr() function?
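A minimal sketch of one option, not the thread's answer: `vapply()` still calls `attr()` once per matrix, but without an explicit for loop, and it returns a named character vector. The matrices below are small stand-ins; only the `id` and `accession` attributes of the first one follow the question's data, and the second entry (`toy`, `TOY1`) is entirely hypothetical.

```r
# Stand-in matrices carrying "id" and "accession" attributes.
tfmatrx <- list(
  MA0275.1 = structure(
    matrix(0, nrow = 4, ncol = 2, dimnames = list(c("A", "C", "G", "T"), NULL)),
    id = "MA0275.1", accession = "ASG1"
  ),
  toy = structure(
    matrix(1, nrow = 4, ncol = 2, dimnames = list(c("A", "C", "G", "T"), NULL)),
    id = "toy", accession = "TOY1"  # hypothetical entry
  )
)

# Pull the "accession" attribute from every matrix; character(1) makes
# vapply() stop with an error if a matrix lacks the attribute.
accessions <- vapply(tfmatrx, function(m) attr(m, "accession"), character(1))
print(accessions)
```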
Matrix data:
tfmatrx<-list(
MA0275.1 = structure(c(0, 76, 0, 24, 0, 100, 0, 0, 0, 0,
100, 0, 0, 0, 100, 0, 72, 11, 16, 0, 53, 0, 0, 47), .Dim = c(4L,
6L), .Dimnames = list(c("A", "C", "G", "T"), NULL), id = "MA0275.1", accession = "ASG1"),
MA0276.1 = structure(c(0, 220, 8, 35, 0, 291, 0, 3, 61, 21,
133, 10, 58, 54, 101, 12, 130, 0, 54, 0, 0, 11, 8, 147, 33,
150, 8, 35, 80, 0, …

I have a data frame with 2 columns:
.id vals
1 A 10
2 B 20
3 C 30
4 A 100
5 B 200
6 C 300
dput(tst_df)
structure(list(.id = structure(c(1L, 2L, 3L, 1L, 2L, 3L), .Label = c("A",
"B", "C"), class = "factor"), vals = c(10, 20, 30, 100, 200,
300)), .Names = c(".id", "vals"), row.names = c(NA, -6L), class = "data.frame")
Now I want the .id column to become my column names, with the vals spread over 2 rows.
Like this:
A B C
10 20 30
100 200 300
Basically, .id is my grouping variable, and I want those belonging to 1 …
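One base-R way to get this shape is sketched here, under the assumption that every `.id` group holds the same number of values (two in the example). It is an illustration, not the thread's answer: `split()` groups `vals` by `.id`, and `as.data.frame()` turns the resulting groups into columns.

```r
# The example data frame from the question.
tst_df <- data.frame(
  .id  = factor(c("A", "B", "C", "A", "B", "C")),
  vals = c(10, 20, 30, 100, 200, 300)
)

# One vector of values per level of .id, in original row order.
groups <- split(tst_df$vals, tst_df$.id)

# Equal-length groups become the columns A, B and C.
wide <- as.data.frame(groups)
print(wide)
```

If the groups had unequal lengths, `as.data.frame()` would fail, so padding or a long-to-wide reshaping function would be needed instead.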
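
I am working on a Linux server that is sometimes very slow. As a result, when I submit a few jobs, I have to wait hours for even a simple calculation to run.
I would like to know whether I can start the next analysis but have it wait until the output of the previous analysis is available. (The second analysis needs the output of the first.)
I tried to get expect and other options to work, but so far without success (I found expect, among other options, in earlier questions on Stack Overflow):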
expect {
'output/analysis_file1.txt'
}
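Any ideas or hints would be appreciated and would help me a lot.
The only thing I want is for the second script to wait until the text file produced by the first script exists.
The 4 scripts:
1: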
#!/bin/bash
#$ -cwd
./script1.sh
. ./script2.sh $repla
. ./script3.sh $replac
2:
repla=''
for i in 'abcdefghijklmnopqrst'
do
repla=`echo $i | sed 's/'abc'/'xyz'/g'`
#echo $repla
done
3:
replac=''
for j in $1
do
replac=`echo $j | sed 's/'xyz'/'san'/g'`
#echo $replac
done
4:
replace=''
for h in $1
do
replace=`echo $h | sed 's/'san'/'sander'/g'`
#echo $replace
done

I have a list of lists holding roughly 38 lists. Only a few of them should be selected (the rest have no value, i.e. NULL). I would like to build a tidy data frame from those lists.
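A possible sketch, assuming every non-NULL element is a one-row data frame with the same columns (as the `dput` output of the question suggests). The element names follow the question, but the numeric values are rounded placeholders, and the `family` column is an illustrative addition that is not in the original data.

```r
# Toy version of the list: NULL entries mixed with one-row data frames.
res <- list(
  AFT  = NULL,
  ETS  = data.frame(MASHvstRap = 8.3e-05, MASHvsBEEML = 2.5e-04, frequency = 10),
  bZIP = NULL,
  FH   = data.frame(MASHvstRap = 1.7e-05, MASHvsBEEML = 8.4e-04, frequency = 10)
)

# Drop the NULL entries.
non_empty <- Filter(Negate(is.null), res)

# Stack the remaining one-row data frames into a single data frame.
combined <- do.call(rbind, non_empty)

# Record which list element each row came from (illustrative extra column).
combined$family <- names(non_empty)
print(combined)
```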
My list of lists:
structure(list(NULL, AFT = NULL, `AP-2` = NULL, `AT_hook, ETS` = NULL,
`BASIC, HLH` = NULL, BRIGHT = NULL, BRLZ = NULL, `BRLZ, BZIP_1, BZIP_2` = NULL,
bZIP = NULL, DWA = NULL, E2F_TDP = NULL, ETS = structure(list(
MASHvstRap = 8.34818462488622e-05, MASHvsBEEML = 0.000250015234002341,
tRapvsBEEML = 8.80480124829088e-06, frequency = 10, stringsAsFactors = 0), .Names = c("MASHvstRap",
"MASHvsBEEML", "tRapvsBEEML", "frequency", "stringsAsFactors"
), row.names = c(NA, -1L), class = "data.frame"), FH = structure(list(
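MASHvstRap = 1.72864219357795e-05, MASHvsBEEML …

I have a data frame with 3 columns of numeric values (p-values) and 1 column of frequency values (which should not be multiplied). I want to multiply the first 3 columns and leave the last column unchanged.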
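The question does not say what the columns are multiplied by, so the sketch below assumes a single constant factor `k`. Both `k` and the toy values are hypothetical, and only the column names follow the question. Selecting the p-value columns by name and assigning back into the same selection leaves `frequency` untouched.

```r
# Toy data frame with the same column layout as in the question.
my_df <- data.frame(
  MASHvstRap  = c(1e-05, 2e-05),
  MASHvsBEEML = c(0.5, 0.1),
  tRapvsBEEML = c(1e-06, 3e-06),
  frequency   = c(19, 158),
  row.names   = c("Homeo", "HMG")
)

k <- 12  # hypothetical multiplication factor, not given in the question
p_cols <- c("MASHvstRap", "MASHvsBEEML", "tRapvsBEEML")

# Multiply only the selected columns; all other columns are left as they are.
my_df[p_cols] <- my_df[p_cols] * k
print(my_df)
```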
my.df:
myfamDF<-structure(list(MASHvstRap = c(3.36388469632471e-14, 4.33277656523673e-123,
3.06769943976602e-08, 6.07022175358029e-30, 4.82890837154273e-32,
4.93181852868703e-06, 1.22573775496788e-08, 1.25502843779857e-05,
1.72864219357795e-05, 4.71138538453502e-05, 8.34818462488622e-05,
1.62205005760679e-17), MASHvsBEEML = c(0.763756578209722, 0.442020719677047,
0.423594358667165, 0.0994358268075855, 0.0736357072352032, 0.0467257430288347,
0.00119919900578073, 0.00094114146973297, 0.000840376826415137,
0.000623286035357452, 0.000250015234002341, 1.46483433509648e-08
), tRapvsBEEML = c(3.75944533892572e-07, 8.44025048683083e-74,
7.51004008659922e-09, 5.3728011843321e-09, 7.20783906680568e-26,
6.69189512726035e-07, 3.60117573203279e-07, 1.17030570144044e-06,
2.54589884424594e-07, 3.93333369828925e-07, 8.80480124829088e-06,
2.89656372293867e-25), frequency = c(19, 158, 11, 44, 121, 10,
13, 10, 10, 17, 10, 54)), .Names = c("MASHvstRap", "MASHvsBEEML",
"tRapvsBEEML", "frequency"), row.names = c("Homeo ", "Homeo",
"Homeo, POU", "HMG", "unknown", "ZnF_C4", "HLH", "IRF", "FH",
"Zn2Cys6", "ETS", …

I am trying to load a dataset into R through an API that lets me run a query and returns the data I need (I cannot configure anything on the server side).
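I know this has to do with encoding. When I inspect the string in the data frame in R, it reports the encoding UTF-8 for "Cosm\xc3\x83\xc2\xa9tica".
When I copy the source string "Cosm\xc3\xa9tica", it is reported as latin1.
How can I get a UTF-8 string that is correctly formatted, like the latin1 one? I have already tried the following:
Sys.setlocale("LC_ALL","Spanish")
And directly on the string:
Enconding(Description) <- "latin1"
Unfortunately, I cannot get it to work. Any ideas are welcome! Thank you.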
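The bytes c3 83 c2 a9 are what results when the UTF-8 encoding of "é" (c3 a9) is misread as latin1 and then encoded to UTF-8 a second time. A sketch of one way to undo that double encoding in base R follows. It assumes the whole string was double-encoded in exactly this way, which the question does not confirm. The variable names are illustrative.

```r
# Rebuild the double-encoded string from its bytes and mark it as UTF-8.
mojibake <- rawToChar(as.raw(c(0x43, 0x6f, 0x73, 0x6d,
                               0xc3, 0x83, 0xc2, 0xa9,
                               0x74, 0x69, 0x63, 0x61)))
Encoding(mojibake) <- "UTF-8"

# Converting UTF-8 -> latin1 strips the extra encoding layer and leaves
# the bytes c3 a9, which are the UTF-8 encoding of "é".
fixed <- iconv(mojibake, from = "UTF-8", to = "latin1")

# The bytes are now valid UTF-8, so declare them as such.
Encoding(fixed) <- "UTF-8"
print(charToRaw(fixed))
```

`iconv()` returns NA for strings it cannot convert, so on real data it is worth checking for NA before overwriting the original column.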