I have several hundred files named like this:
RG1-t.txt
RG1-n.txt
RG2-t.txt
RG2-n.txt
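and so on...
I want to run a script on them with GNU parallel, but I am struggling to extract the base name of each file (RG1, RG2, etc.) so that I can run something like: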
ls RG*.txt | parallel "command.sh {basename}-t.txt {basename}-n.txt > {basename}.out"
which would produce the files RG1.out, RG2.out, and so on. Any ideas?
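One possible way to get those prefixes, as a minimal sketch: list only the `-t.txt` files and strip the suffix with shell parameter expansion, so each pair yields exactly one prefix. The helper name `list_basenames` is hypothetical, and the sketch assumes every `-t.txt` file has a matching `-n.txt` file in the current directory.

```shell
# Hypothetical helper: print one prefix (RG1, RG2, ...) per -t/-n pair.
# Only the -t.txt files are listed, so each prefix appears once.
list_basenames() {
  for f in RG*-t.txt; do
    # Skip the literal pattern if no file matches the glob
    [ -e "$f" ] || continue
    # ${f%-t.txt} removes the trailing "-t.txt" from the file name
    printf '%s\n' "${f%-t.txt}"
  done
}

# With GNU parallel, {} is replaced by each input line, so the prefixes
# could then be used like this (command.sh is the script from the question):
# list_basenames | parallel 'command.sh {}-t.txt {}-n.txt > {}.out'
```

Feeding prefixes rather than full file names to parallel avoids running the command twice per sample, which is what would happen if both the `-t` and `-n` files were listed.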
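I have files with the same name that live in different folders. Nextflow stages these files into the same work directory, which causes a name collision. My question is how to handle this without renaming the files. Example: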
# Example data
mkdir folder1 folder2
echo 1 > folder1/file.txt
echo 2 > folder2/file.txt
# We read from samplesheet
$ cat samplesheet.csv
sample,file
sample1,/home/atpoint/foo/folder1/file.txt
sample1,/home/atpoint/foo/folder2/file.txt
# Nextflow main.nf
#! /usr/bin/env nextflow
nextflow.enable.dsl=2
// Read samplesheet and group files by sample (first column)
samplesheet = Channel
    .fromPath(params.samplesheet)
    .splitCsv(header:true)
    .map {
        // 'def' keeps these variables local to the closure
        def sample = it['sample']
        def file = it['file']
        tuple(sample, file)
    }
ch_samplesheet = samplesheet.groupTuple(by:0)
// That creates a tuple like:
// [sample1, [/home/atpoint/foo/folder1/file.txt, /home/atpoint/foo/folder2/file.txt]]
// Dummy process that stages …