How do I find overlapping coordinates and extract the seg.mean value for each overlapping region?
data1
Rl pValue chr start end CNA
2 2.594433 6 129740000 129780000 gain
2 3.941399 6 130080000 130380000 gain
1 1.992114 10 80900000 81100000 gain
1 7.175750 16 44780000 44920000 gain
DATA2
ID chrom loc.start loc.end num.mark seg.mean
8410 6 129750000 129760000 8430 0.0039
8410 10 80907000 81000000 5 -1.7738
8410 16 44790000 44910000 12 0.0110
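One possible way to compute the overlaps, sketched in base R. It assumes both tables are data frames with the column names shown in this question (`data2` stands in for DATA2), that two intervals overlap when they share at least one position on the same chromosome, and that the reported start and end should be the intersection of the two intervals. The sample values are copied from the tables above.

```r
# Sample data copied from the question
data1 <- data.frame(
  Rl     = c(2, 2, 1, 1),
  pValue = c(2.594433, 3.941399, 1.992114, 7.175750),
  chr    = c(6, 6, 10, 16),
  start  = c(129740000, 130080000, 80900000, 44780000),
  end    = c(129780000, 130380000, 81100000, 44920000),
  CNA    = "gain"
)
data2 <- data.frame(
  ID        = 8410,
  chrom     = c(6, 10, 16),
  loc.start = c(129750000, 80907000, 44790000),
  loc.end   = c(129760000, 81000000, 44910000),
  num.mark  = c(8430, 5, 12),
  seg.mean  = c(0.0039, -1.7738, 0.0110)
)

# Pair every data1 row with every data2 row on the same chromosome
pairs <- merge(data1, data2, by.x = "chr", by.y = "chrom")

# Keep only the pairs whose intervals overlap
hits <- pairs[pairs$start <= pairs$loc.end & pairs$loc.start <= pairs$end, ]

# Assumption: the overlapping region is the intersection of both intervals
hits$start <- pmax(hits$start, hits$loc.start)
hits$end   <- pmin(hits$end, hits$loc.end)

data_output <- hits[, c("Rl", "pValue", "chr", "start", "end", "CNA", "seg.mean")]
print(data_output)
```

The chromosome-wise `merge` builds every same-chromosome pair before filtering, which is fine for small tables. For genome-scale data, an interval-aware join such as `data.table::foverlaps` or `GenomicRanges::findOverlaps` is likely a better fit.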
DataOutput
Rl pValue chr start end CNA seg.mean
2 2.594433 6 129750000 129760000 gain 0.0039
1 1.992114 10 80907000 81000000 gain -1.7738
1 …
I am trying to install R on our cluster (the cluster's operating system is Red Hat Enterprise Linux 6), and I do not have root access. I tried:
$ wget http://cran.rstudio.com/src/base/R-3/R-3.1.1.tar.gz
$ tar xvf R-3.1.1.tar.gz
$ cd R-3.1.1
$ ./configure --prefix=/home/Kryo/R-3.1.1
But I get the error:
configure: error: --with-x=yes (default) and X11 headers/libs are not available
How can I split this
Chr3:153922357-153944632(-)
Chr11:70010183-70015411(-)
into
Chr3 153922357 153944632 -
Chr11 70010183 70015411 -
I tried strsplit(df$V1, "[[:punct:]]"), but the minus sign is missing from the final result.
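The strand sign disappears because `[[:punct:]]` matches every punctuation character, including the `-` inside the parentheses, so it is consumed as a separator. A minimal sketch of one alternative, assuming every value follows the `Chr:start-end(strand)` layout shown in this question, is to capture the four fields with a regular expression instead of splitting. The output column names (`chr`, `start`, `end`, `strand`) are illustrative only.

```r
df <- data.frame(
  V1 = c("Chr3:153922357-153944632(-)", "Chr11:70010183-70015411(-)"),
  stringsAsFactors = FALSE
)

# Four capture groups: chromosome, start, end, strand sign
pattern <- "^([^:]+):([0-9]+)-([0-9]+)\\(([+-])\\)$"

# regexec() records the capture groups; regmatches() extracts them
m <- regmatches(df$V1, regexec(pattern, df$V1))

# Drop the full match (first element) and stack the groups row by row
parts <- do.call(rbind, lapply(m, function(x) x[-1]))

out <- data.frame(
  chr    = parts[, 1],
  start  = as.numeric(parts[, 2]),
  end    = as.numeric(parts[, 3]),
  strand = parts[, 4],
  stringsAsFactors = FALSE
)
print(out)
```

A value that does not fit the assumed layout yields an empty match, so the sketch would need an extra check before `rbind` on messier input.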
How can I compare two datasets, df1 and df2, by gene name, extract the corresponding values for each gene name from df2, and insert them into df1?
df1 <-
Genes sample.ID chrom loc.start loc.end num.mark
Klri2 LO.WGS 1 3010000 173490000 8430
Rrs1 LO.WGS 1 3010000 173490000 8430
Serpin LO.WGS 1 3010000 173490000 8430
Myoc LO.WGS 1 3010000 173490000 8430
St18 LO.WGS 1 3010000 173490000 8430
df2 <-
RL pValue. chr start end CNA Genes
2 2.594433 1 129740006 129780779 gain Klri2
2 3.941399 1 130080653 130380997 gain Serpin,St18,Myoc
df3<-
Genes sample.ID chrom loc.start loc.end num.mark RL pValue CNA
Klri2 LO.WGS 1 3010000 173490000 8430 2 2.594433 …
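A possible base R sketch for this gene-name lookup. It assumes the comma-separated `Genes` entries in df2 should be expanded to one row per gene before matching, that the trailing dot in the `pValue.` header is a typo for `pValue`, and that genes without a match in df2 (here `Rrs1`) should stay in the result with `NA` values. The desired df3 is truncated in the question, so that last point is a guess.

```r
df1 <- data.frame(
  Genes     = c("Klri2", "Rrs1", "Serpin", "Myoc", "St18"),
  sample.ID = "LO.WGS",
  chrom     = 1,
  loc.start = 3010000,
  loc.end   = 173490000,
  num.mark  = 8430,
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  RL     = c(2, 2),
  pValue = c(2.594433, 3.941399),
  chr    = c(1, 1),
  start  = c(129740006, 130080653),
  end    = c(129780779, 130380997),
  CNA    = "gain",
  Genes  = c("Klri2", "Serpin,St18,Myoc"),
  stringsAsFactors = FALSE
)

# Split the comma-separated gene names and repeat each df2 row once per gene
gene_list <- strsplit(df2$Genes, ",", fixed = TRUE)
df2_long <- df2[rep(seq_len(nrow(df2)), lengths(gene_list)),
                c("RL", "pValue", "CNA")]
df2_long$Genes <- trimws(unlist(gene_list))

# Left join: keep every df1 row, fill unmatched genes with NA (assumption)
df3 <- merge(df1, df2_long, by = "Genes", all.x = TRUE, sort = FALSE)
print(df3)
```

Dropping `all.x = TRUE` would instead return only the genes that appear in both data frames.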