Tag: nextflow

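Try catch in a Nextflow process

How do I do a try catch in nextflow?

I am currently writing a pipeline in which the bash command I execute can, under certain conditions, exit with exit code 1. This brings my pipeline to a halt. I would now like to use a try catch clause to define some alternative behavior in case this happens.

I have tried doing this the usual way, but it does not seem to work: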

process align_kallisto {

    publishDir "${params.outdir}/kallisto", mode: 'copy', saveAs:{ filename -> "${name}_abundance.tsv" }   

    input:
    tuple val(name), file(fastq) from fq_kallisto.dump(tag: 'kallisto fq')
    file(index) from kallisto_index.collect().dump(tag: 'kallisto index')

    output:
    file("output/abundance.tsv") into kallisto_quant

    // this can throw an exit 1 status
    try {
        """
        kallisto quant -i ${index} --bias --single --fr-stranded -o output --plaintext \
          --fragment-length ${params.frag_length} --sd ${params.frag_deviation} ${fastq}
        """
    } 
    // if this happens catch and do something …
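
One point that may explain why the attempt above does not work: the triple-quoted string is only the text of the script, and the command itself is run later by bash, so a Groovy try block around the string has no exit status to catch. A minimal shell-level sketch of "alternative behavior on exit 1" follows. The function `run_quant` is a hypothetical stand-in for the kallisto call and always fails, and the placeholder file content is an assumption, not something from the question. At the Nextflow level, the `errorStrategy` process directive is the documented way to ignore or retry failed tasks.

```shell
# Hypothetical stand-in for the kallisto command; it always exits with status 1.
run_quant() {
    return 1
}

# Scratch directory standing in for the task's "output" folder.
out_dir=$(mktemp -d)

if run_quant; then
    outcome="ok"
else
    status=$?   # exit status of the failed command
    # Fallback branch: write a placeholder so the declared output file still exists.
    echo "quantification failed with status $status" > "$out_dir/abundance.tsv"
    outcome="fallback"
fi

echo "$outcome (status ${status:-0})"
```

Because the `if` consumes the non-zero status, the script as a whole still exits with 0, which is what lets the task continue.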

groovy nextflow

fal*_*hof

lucky-day

6
votes

1
answer

1258
views

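Can you make the Nextflow DAG visualization look nice?

Nextflow makes it efficient to build complex pipelines. Some people only understand things visually, so a good graphical representation matters. The way to produce one in nextflow is the -with-dag option: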

nextflow run <script-name> -with-dag flowchart.png

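However, the output looks awful and not at all professional:

I would like to know whether there is any way to improve it, for example by taking the source and loading it into a different visualization program. Anything.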

directed-acyclic-graphs nextflow

jus*_*uck

lucky-day

5
votes

1
answer

1043
views

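Failed to create Conda environment, status: 143

I am trying to run a nextflow pipeline on some data from the Linux command line, but when I do, it fails because it cannot create the Conda environment.

Even though the environment is not set up correctly, it still seems to try to run the pipeline, which produces the error message. Any help would be much appreciated. Here is the error message: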

Error executing process > 'my_process (1)'
Caused by:
  Failed to create Conda environment
  command: conda env create --prefix /my_file_path-6bf38a923b48a255f96ea3d66d372e6c --file /my_file_path/environment.yml
  status : 143
  message:

Here is my environment.yml file:

name: pipeline_name
channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - filtlong
  - blast==2.5
  - minimap2 
  - samtools 
  - pysam 
  - pandas 
  - matplotlib 
  - pysamstats
  - seaborn 
  - medaka
  - bedtools
  - bedops
  - seqtk
  - bioawk
  - sniffles
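
As background for the status value in the error above: by shell convention an exit status above 128 means the process was ended by a signal, and 143 is 128 + 15, where 15 is SIGTERM. That is consistent with, though not proof of, the `conda env create` command being terminated from outside (for example by a timeout) rather than failing on its own. The sketch below reproduces the number with a throwaway `sleep` process; nothing in it touches Conda.

```shell
# Start a long-running background process and terminate it with SIGTERM.
sleep 30 &
pid=$!
kill -TERM "$pid"

# wait returns the child's status; for a signal death this is 128 + signal number.
status=0
wait "$pid" || status=$?

echo "exit status: $status"
```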

linux anaconda conda nextflow

use*_*422

2020 12-21

5
votes

1
answer

1545
views

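Nextflow tutorial gives the error "No such variable"

I am trying to learn nextflow but it is not going very well. I started with the tutorial on this site: https://www.nextflow.io/docs/latest/getstarted.html (I am the one who installed nextflow).

I copied this script: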

#!/usr/bin/env nextflow

params.str = 'Hello world!'

process splitLetters {

    output:
    file 'chunk_*' into letters

    """
    printf '${params.str}' | split -b 6 - chunk_
    """
}


process convertToUpper {

    input:
    file x from letters.flatten()

    output:
    stdout result

    """
    cat $x | tr '[a-z]' '[A-Z]'
    """
}

result.view { it.trim() }

But when I run it (nextflow run tutorial.nf), this is what I get in the terminal:

N E X T F L O W  ~  version 22.03.1-edge
Launching `tutorial.nf` [intergalactic_waddington] DSL2 - revision: be42f295f4
No such variable: result

 -- …

nextflow

Lee*_*ouh

lucky-day

5
votes

1
answer

3117
views

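How do I use a variable created inside the script block in Nextflow?

I have a nextflow script that creates a variable from a text file, and I need to pass the value of that variable to a command-line command (a bioconda package). Both steps happen inside the "script" section. I tried to reference the variable with the "$" symbol, but without success, I think because within the script section of a nextflow script that symbol is used to reference variables defined in the input section.

To make myself clearer, here is a code sample of what I am trying to achieve: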

params.gz_file = '/path/to/file.gz'
params.fa_file = '/path/to/file.fa'
params.output_dir = '/path/to/outdir'

input_file = file(params.gz_file)
fasta_file = file(params.fa_file)

process foo {
    //publishDir "${params.output_dir}", mode: 'copy',

    input:
    path file from input_file
    path fasta from fasta_file

    output:
    file ("*.html")

    script:
    """
    echo 123 > number.txt
    parameter=`cat number.txt`
    create_report $file $fasta --flanking $parameter 
    """
}

When I do this, the error I get is:

Error executing process > 'foo'
Caused by:
  Unknown variable 'parameter' -- Make sure it is not misspelt and defined somewhere in the …
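
The error matches how a double-quoted Nextflow script block is processed: `$name` is filled in by Nextflow before bash ever runs, so a variable that only exists inside bash has to be written as `\$name` to be passed through. The same two-stage expansion can be imitated in plain shell with an unquoted here-document, which is what the sketch below does. It is an analogy only, not Nextflow code, and the value `reads.gz` is a hypothetical placeholder for the staged input.

```shell
# Stage 1: "template" expansion, analogous to Nextflow filling in input variables.
file="reads.gz"   # hypothetical value known when the template is rendered

script=$(cat <<EOF
parameter=\$(echo 123)
echo "$file \$parameter"
EOF
)

# The rendered script now contains a literal \$parameter-free line:
#   parameter=$(echo 123)
#   echo "reads.gz $parameter"
# Stage 2: the inner shell resolves the variable it defines itself.
result=$(sh -c "$script")
echo "$result"
```

Without the backslash, the outer stage would try to resolve `parameter` itself, which is the situation the "Unknown variable" message describes.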

bash bioinformatics nextflow

Cri*_*uñí

lucky-day

4
votes

1
answer

5374
views

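Nextflow name collision

I have files with the same name that live in different folders. Nextflow stages these files into the same working directory, which causes a name collision. My question is how to handle this without renaming the files. Example: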

# Example data
mkdir folder1 folder2
echo 1 > folder1/file.txt
echo 2 > folder2/file.txt

# We read from samplesheet
$ cat samplesheet.csv
sample,file
sample1,/home/atpoint/foo/folder1/file.txt
sample1,/home/atpoint/foo/folder2/file.txt

# Nextflow main.nf
#! /usr/bin/env nextflow

nextflow.enable.dsl=2

// Read samplesheet and group files by sample (first column)
samplesheet = Channel
    .fromPath(params.samplesheet)
    .splitCsv(header:true)
    .map {
            sample = it['sample']
            file   = it['file']
            tuple(sample, file)
}
        
ch_samplesheet = samplesheet.groupTuple(by:0)

// That creates a tuple like:
// [sample1, [/home/atpoint/foo/folder1/file.txt, /home/atpoint/foo/folder2/file.txt]]

// Dummy process that stages …

nextflow

ATp*_*int

lucky-day

4
votes

1
answer

1453
views

Override Nextflow parameters with command-line arguments

Given the following nextflow.config:

google {
  project = "cool-project"
  region = "europe-west4"
            
  lifeSciences {
    bootDiskSize = "200 GB"
    debug = true
    preemptible = true
  }
}

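Is it possible to override one or more of these settings with command-line arguments? For example, if I want to specify that preemptible machines should not be used, can I do the following: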

nextflow run main.nf -c nextflow.config --google.lifeSciences.preemptible false

?

nextflow

r0f*_*0f1

2021 02-25

3
votes

1
answer

1596
views

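How to run a for loop in nextflow

This is the problem I ran into when running a for loop in nextflow: my script does not seem to work. Here are my paired files, 3 pairs in total, and I want each of the three pairs to be run once by a process. The paired files are stored under the path "/data/mPCR/3samples_20220525/".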

V350092589_L01_86_1.fq.gz  
V350092589_L01_86_2.fq.gz

V350092589_L01_85_1.fq.gz          
V350092589_L01_85_2.fq.gz

V350092589_L01_84_1.fq.gz            
V350092589_L01_84_2.fq.gz

Here is my script:

params.fq = "/data/mPCR/3samples_20220525/" 
   
process soapnuke{
        tag{"soapnuk"}
    
        input:
            val fq from params.fq
    
        output:
            path '*.clean1.fastq.gz' into trim_primer1
            path '*.clean2.fastq.gz' into trim_primer2
    
    script:
        """
        sample1=\$(basename \$(readlink 1.fq.gz) _1.fq.gz)
        sample2=\$(basename \$(readlink 2.fq.gz) _2.fq.gz)
    
        SOAPnuke filter -1 \$fq*1.fq.gz -2 \$fq*2.fq.gz -o ./ -C \${sample1}.clean.fastq.gz -D \${sample2}.clean.fastq.gz
        """
    }

What should I do to run all the pairs through this process? Any help would be appreciated.
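
For reference, the pairing itself (one `_1.fq.gz` file matched with its `_2.fq.gz` mate per sample) can be sketched in plain shell as below. This is only an illustration of the grouping logic: the scratch directory and empty files are hypothetical stand-ins for the real data, and in Nextflow this grouping is normally delegated to the `Channel.fromFilePairs` factory so that each pair becomes one task, rather than written as a loop.

```shell
# Hypothetical scratch directory standing in for /data/mPCR/3samples_20220525/
fq_dir=$(mktemp -d)
for n in 84 85 86; do
    touch "$fq_dir/V350092589_L01_${n}_1.fq.gz" "$fq_dir/V350092589_L01_${n}_2.fq.gz"
done

pairs=""
for r1 in "$fq_dir"/*_1.fq.gz; do
    sample=$(basename "$r1" _1.fq.gz)   # strip the read-1 suffix to get the sample name
    r2="$fq_dir/${sample}_2.fq.gz"      # expected mate file for the same sample
    if [ -f "$r2" ]; then
        pairs="$pairs$sample "
        echo "$sample: $(basename "$r1") + $(basename "$r2")"
    fi
done
```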

for-loop nextflow

Daf*_*ffy

lucky-day

3
votes

1
answer

2177
views

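Nextflow: how to pass output (multiple files) from publishDir to the next process?

I have a process that generates two files I am interested in, hitsort.cls and contigs.fasta. I output these using publishDir: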

process RUN_RE {
    publishDir "$baseDir/RE_output", mode: 'copy'
  
    input:
    file 'interleaved.fq'

    output:
    file "${params.RE_run}/seqclust/clustering/hitsort.cls"
    file "${params.RE_run}/contigs.fasta"

    script:
    """
    some_code

    """

  }

Now I need these two files as input to another process, but I do not know how to do that.

I have tried calling the process with

NEXT_PROCESS(params.hitsort, params.contigs)

while specifying the input as:

process NEXT_PROCESS {
  
    input:
    path hitsort
    path contigs

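But it does not work, because only the basename is used rather than the full path. Basically, what I want is to wait for RUN_RE to finish and then use the two files it outputs in the next process.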

process nextflow publishdir

Per*_*ika

lucky-day

2
votes

1
answer

5461
views

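nextflow: command not found

I am trying to use nextflow to run a pipeline, but when I run it, it keeps showing nextflow: command not found. I have installed nextflow (I followed this tutorial: https://www.nextflow.io/docs/latest/getstarted.html).

Do I need to add the path to nextflow in a configuration file?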
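
For context, "command not found" generally means the shell did not find an executable of that name in any directory listed in `PATH`. The sketch below shows the mechanism with a stub script named `demo-launcher` in a temporary directory. Both the name and the directory are hypothetical stand-ins for the real launcher and its install location.

```shell
# Hypothetical install directory containing an executable stub (not the real launcher).
install_dir=$(mktemp -d)
cat > "$install_dir/demo-launcher" <<'EOF'
#!/bin/sh
echo "launcher found"
EOF
chmod +x "$install_dir/demo-launcher"

# Before: the directory is not on PATH, so the lookup fails.
if command -v demo-launcher >/dev/null 2>&1; then
    before="found"
else
    before="not found"
fi

# Prepend the directory to PATH for the current shell session.
PATH="$install_dir:$PATH"
export PATH

# After: the same name now resolves.
after=$(demo-launcher)
echo "before: $before, after: $after"
```

To make such a change persist across sessions, the `export PATH=...` line is typically placed in a shell startup file such as `~/.bashrc`; moving the executable into a directory that is already on `PATH` has the same effect.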

linux nextflow

She*_*gne

2021 11-25

1
votes

1
answer

3334
views

Tag statistics

nextflow ×10

linux ×2

anaconda ×1

bash ×1

bioinformatics ×1

conda ×1

directed-acyclic-graphs ×1

for-loop ×1

groovy ×1

process ×1

publishdir ×1
