Tag: pipeline

Optional input files for rule all in Snakemake

In my Snakemake project, I have a config.yaml file that lets users run or skip certain steps of the pipeline, for example:

DEG : 
   exec : True

So, in the Snakefile, I include the rules associated with DEG:

if config["DEG"]["exec"]:
   include: "rules/classic_mapping.smk"
   include: "rules/counts.smk"
   include: "rules/run_DESeq2.smk"

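The problem is that I now want to specify the output files dynamically in the "all" rule, so that Snakemake knows which files to generate based on the parameters entered by the user. For example, I would like to proceed as follows: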

rule all:   
   input:
       if config["DEG"]["exec"]:
          "DEG/DEG.txt"
       if config["DTU"]["exec"]:
          "DTU/DTU.txt"

But it doesn't work: SyntaxError in line 58 of Snakefile: Unexpected keyword if in rule definition (Snakefile, line 58)

I need an outside point of view to find an alternative, because Snakemake is apparently not meant to work this way.

Thanks in advance
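One possible alternative, sketched here as an assumption rather than as the accepted answer: since a Snakefile is Python, the target list can be built with ordinary Python outside the rule and then passed to input. The helper name all_targets and the use of .get() for missing config sections are illustrative choices, not part of the original question.

```python
def all_targets(config):
    """Return the final output files requested by the config (hypothetical helper)."""
    targets = []
    # .get() keeps a missing section (e.g. no "DTU" key) from raising KeyError.
    if config.get("DEG", {}).get("exec", False):
        targets.append("DEG/DEG.txt")
    if config.get("DTU", {}).get("exec", False):
        targets.append("DTU/DTU.txt")
    return targets


# In the Snakefile, the list would then be handed to the rule, e.g.:
# rule all:
#     input: all_targets(config)

# Stand-in for the dict Snakemake builds from config.yaml.
example_config = {"DEG": {"exec": True}, "DTU": {"exec": False}}
print(all_targets(example_config))
```

The if statements live in plain Python, where they are allowed, and the rule only ever sees a finished list of file names.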

python workflow pipeline snakemake

Vin*_*bot

lucky-day

2 votes

1 answer

1552 views

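Here is my question about the pipelines.yml file. First of all, I use Elasticsearch 6.6 and Logstash 6.2.2. Both are installed on a VM in my own Google Cloud account (not the hosted offering provided by ELK, but installed and hosted in my own GCP account). I have 3 folders containing log files from IoT devices and simply want to ingest them into 3 corresponding indices at the same time, so I created a pipelines.yml file in the logstash/config path with the following content: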

-pipeline.id: pipeline1
 path.config: "/config/p1/logstash-learning.conf"
 pipeline.workers: 1
-pipeline.id: pipeline2
 path.config: "/config/p2/logstash-groundtruth.conf"
 pipeline.workers: 1
-pipeline.id: pipeline3
 path.config: "/config/p3/logstash-fieldtest.conf"
 pipeline.workers: 1

So, when I run Logstash with the command ./bin/logstash (with this command we tell Logstash to load the default pipelines.yml file, right?), I get the error message below, and I cannot figure out why this happens. Note that pipelines.yml has full access permissions.

jruby: warning: unknown property jruby.regexp.interruptible
Sending Logstash's logs to /home/evangelos/logstash-6.2.2/logs which is now configured via log4j2.properties
[2019-12-17T16:36:43,877][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"netflow", :directory=>"/home/evangelos/logstash-6.2.2/modules/netflow/configuration"}
[2019-12-17T16:36:43,933][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"/home/evangelos/logstash-6.2.2/modules/fb_apache/configuration"}
ERROR: Failed to …

pipeline logstash

Joh*_*kis

lucky-day

2 votes

1 answer

8085 views

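GitLab manual job triggered by a schedule

I have a small problem with my GitLab pipeline.
I would like to run a manual job with a schedule rule, or find a way to run a scheduled pipeline with my jobs without rewriting the pipeline.

As you can see in the example, I have two first jobs: one is manual and one is scheduled. My problem is that if I run the scheduled workflow, AC-test does not start, and if I try to run FirstJob under the schedule rule, it does not start because of the when: manual part.

Here is my example: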

stages:
    - firstjob
    - test
    - build
    - deploy

FirstJob:
    stage: firstjob
    script:
        - echo "Hello Peoples!"
        - sleep 1
    when: manual
    allow_failure: false

FirstJobSchedule:
    stage: firstjob
    script:
        - echo "Hello Scheduled Peoples!"
        - sleep 1
    only: 
        - schedule
    allow_failure: false

AC-test:
    needs: [FirstJob]
    stage: test
    script:
        - echo "AC Test is running"
        - sleep 10

ProdJobBuild:
    stage: build
    needs: [AC-test]
    script:
        - …

schedule pipeline manual gitlab gitlab-ci

zee*_*erk

lucky-day

2 votes

1 answer

5000 views

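Must not run with sudo

Hello, I am new to GitHub Actions and I am trying to create a CI/CD pipeline with it. I am using a DigitalOcean droplet as my server, and I am trying to create a runner as described under github->settings->actions

When I run the following command: ./config.sh --url https://github.com/basobaasnepal/BasobaasWeb --token DFGFSDF234sf3fg45hd

I get this: Must not run with sudo

I tried switching from the root user to a non-root user, but that did not work. I also tried export {AGENT_ALLOW_RUNASROOT="1"}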

pipeline digital-ocean devops github-actions

Riw*_*ise

lucky-day

2 votes

1 answer

5320 views

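"Parameter set cannot be resolved" when passing a parameter with a specific combination of other parameters

I wrote a function that uses four parameters and four parameter sets. The first parameter, $Path, is not assigned to a set and therefore belongs to all sets. It is also mandatory and is the only parameter that can be passed from the pipeline. However, when I call the function at the end of a pipeline using certain combinations of the other three parameters (all of which belong to some combination of the four sets), I get an error indicating that the set is ambiguous.

Here is my function: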

function Foo-Bar {
    [CmdletBinding(DefaultParameterSetName = 'A')]

    param (
        [Parameter(Mandatory = $true,
            ValueFromPipeline = $true)]
        [ValidateNotNullOrEmpty()]
        [string[]] $Path,

        [Parameter(ParameterSetName = 'A')]
        [Parameter(ParameterSetName = 'A-Secure')]
        [Switch] $OutputToConsole,

        [Parameter(Mandatory = $true,
            ParameterSetName = 'B')]
        [Parameter(Mandatory = $true,
            ParameterSetName = 'B-Secure')]
        [int] $OutputMode,

        [Parameter(Mandatory = $true,
            ParameterSetName = 'A-Secure')]
        [Parameter(Mandatory = $true,
            ParameterSetName = 'B-Secure')]
        [Switch] $Login
    )

    $PSCmdlet.ParameterSetName
}

All possible parameter combinations are as follows:

PS C:\> Foo-Bar -Path "C:\Test.jpg"
A
PS C:\> Foo-Bar -Path "C:\Test.jpg" -OutputToConsole
A
PS C:\> Foo-Bar -Path "C:\Test.jpg" -OutputToConsole …

powershell pipeline parameter-sets

Ale*_*lex

lucky-day

2 votes

1 answer

2339 views

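How to pipe directly to Copy-Item instead of within a ForEach-Object

The -Exclude parameter of Get-ChildItem does not filter subfolders when the -Recurse flag is used; see, among others, unable-to-exclude-directory-using-get-childitem-exclude-parameter-in-powershell

But the -Exclude parameter can be used to filter out folders at the root level

So I wrote my own recursive function: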

function Get-ChildItem-Recurse() {
    [cmdletbinding()]
    Param(
      [parameter(ValueFromPipelineByPropertyName = $true)]
      [alias('FullName')]
      [string[]] $Path,
      [string] $Filter,
      [string[]] $Exclude,
      [string[]] $Include,
      [switch] $Recurse = $true,
      [switch] $File = $false
    )

    Process {
      ForEach ( $P in $Path ) {
        Get-ChildItem -Path $P -Filter $Filter -Include $Include -Exclude $Exclude | ForEach-Object {
        if ( -not ( $File -and $_.PSIsContainer ) ) {
          $_
        }
        if ( $Recurse -and $_.PSIsContainer ) {
          $_ …

powershell pipeline copy-item foreach-object

Bar*_*tel

2021-12-08

2 votes

1 answer

915 views

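How to use StandardScaler inside a pipeline only on certain values?

I have a question. I want to use StandardScaler(), but my dataset contains certain OneHotEncoding values and other values that should not be scaled. If I run StandardScaler(), all values get scaled. So is there an option to run this method only on certain values inside a pipeline?

I found this question: One-Hot-Encode categorical variables and scale continuous ones simultaneously, with the following code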

columns = ['rank']
columns_to_scale = ['gre', 'gpa']

scaler = StandardScaler()
ohe = OneHotEncoder(sparse=False)

# Concatenate (Column-Bind) Processed Columns Back Together
processed_data = np.concatenate([scaled_columns, encoded_columns], axis=1)

So is there an option to run StandardScaler() inside a pipeline only on certain values, with the other values merged back in with the scaled ones? The pipeline should therefore only use StandardScaler on the values 'xy', 'xyz'.

StandardScaler class

from sklearn.base import BaseEstimator, TransformerMixin

class StandardScaler_with_certain_features(BaseEstimator, TransformerMixin):
    def __init__(self, columns_to_scale):
        scaler = StandardScaler()

    def fit(self, X, y = None):
        scaler.fit(X_train) # only std.fit on train set
        X_train_nor = scaler.transform(X_train.values)

    def transform(self, X, …
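To make the idea of "scale only some columns, pass the rest through" concrete, here is a minimal plain-Python sketch that needs no libraries. It is not scikit-learn code: scale_selected_columns is a hypothetical helper, and the rank_1 / rank_2 columns are assumed stand-ins for already one-hot-encoded values. It uses the population standard deviation, which is also what StandardScaler computes. In scikit-learn itself, sklearn.compose.ColumnTransformer is the usual tool for applying a transformer to a subset of columns inside a Pipeline.

```python
from statistics import mean, pstdev


def scale_selected_columns(rows, columns_to_scale):
    """Standardize only the named columns; copy every other column unchanged."""
    stats = {}
    for col in columns_to_scale:
        values = [row[col] for row in rows]
        # Population standard deviation; fall back to 1.0 for a constant column.
        stats[col] = (mean(values), pstdev(values) or 1.0)

    scaled_rows = []
    for row in rows:
        new_row = dict(row)  # untouched columns (e.g. one-hot values) are kept as-is
        for col, (mu, sigma) in stats.items():
            new_row[col] = (row[col] - mu) / sigma
        scaled_rows.append(new_row)
    return scaled_rows


# Toy data: 'gre' and 'gpa' are continuous, 'rank_1'/'rank_2' are assumed one-hot columns.
rows = [
    {"gre": 300.0, "gpa": 3.0, "rank_1": 1, "rank_2": 0},
    {"gre": 320.0, "gpa": 4.0, "rank_1": 0, "rank_2": 1},
]
for row in scale_selected_columns(rows, ["gre", "gpa"]):
    print(row)
```

As in the question's fit method, the mean and standard deviation should be computed on the training set only and then reused for the test set; this sketch computes and applies them in one step for brevity.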

python pipeline machine-learning normalization scikit-learn

Tes*_*est

2022-07-02

2 votes

1 answer

2002 views

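How to pipe two ReadableStreams into one WritableStream in JavaScript?

I have two ReadableStreams and I want to pipe them into one WritableStream, so that any data coming through either ReadableStream goes directly into the WritableStream right away.

I can do the opposite, splitting one ReadableStream into two with ReadableStream.prototype.tee(), but I don't know how to merge two into one.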

const textarea = document.querySelector("textarea");

// This is a ReadableStream which says "Mom! " every 1 second.
const momReadableStream = new ReadableStream({ start: controller => {
  const sayMom = () => controller.enqueue("Mom! ");
  setInterval(sayMom, 1000);
}});

// This is a ReadableStream which says "Lois! " every 0.7 seconds.
const loisReadableStream = new ReadableStream({ start: controller => {
  const sayLois = () => controller.enqueue("Lois! ");
  setInterval(sayLois, 700);
}});

// …

javascript pipeline tee whatwg-streams-api

asd*_*159

2023-04-15

2 votes

1 answer

560 views

Function parameter positions suddenly behaving strangely

Code:

This example demonstrates what I mean:

Function test{
    param(
        [parameter(Position = 0, ValueFromPipeline)]
        [string]$Param0,

        [parameter(Position = 1)]
        [ArgumentCompletions("Param1_Opt1", "Param1_Opt2")]
        [Array]$Param1 = ('Param1_Opt3', 'Param1_Opt4'),

        [switch]$NoHeader
    )
    "This is $Param0"
    "Header :$Param1"
}

My question:

For a long time I have relied on parameter positions in all my functions. Today, while writing a function, they suddenly stopped working the way I use them. The test function above demonstrates the problem.

If a parameter has the Position = 0 attribute and also the ValueFromPipeline attribute, then when it is used in a pipeline, the next parameter with a Position attribute takes its place. This also means that the next parameter's ArgumentCompletions, such as "Param1_Opt1" / "Param1_Opt2", get suggested.

But I am not getting this behavior at all.

Test "This is For Parameter Zero" "This is For Parameter One"
---- Output -----
This is For Parameter Zero
This is For Parameter One

The above works as expected: the first string is correctly assigned to Param0 and the second to Param1, and the argument suggestions for Param1 work. But the following fails with an error, and the piped string gets assigned to Param1 …

powershell pipeline parameter-passing positional-parameter

Gar*_*n82

2023-08-08

2 votes

1 answer

50 views

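Nested cross-validation using cross_val_score with a pipeline and GridSearch

I am working in scikit and trying to tune XGBoost. I tried to use nested cross-validation, using a pipeline to rescale the training folds (to avoid data leakage and overfitting), together with GridSearchCV for parameter tuning and cross_val_score to get the roc_auc score.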

from imblearn.pipeline import Pipeline
from sklearn.model_selection import RepeatedKFold
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

std_scaling = StandardScaler()
algo = XGBClassifier()

steps = [('std_scaling', StandardScaler()), ('algo', XGBClassifier())]

pipeline = Pipeline(steps)

parameters = {'algo__min_child_weight': [1, 2],
              'algo__subsample': [0.6, 0.9],
              'algo__max_depth': [4, 6],
              'algo__gamma': [0.1, 0.2],
              'algo__learning_rate': [0.05, 0.5, 0.3]}

cv1 = RepeatedKFold(n_splits=2, n_repeats = 5, random_state = 15)

clf_auc = GridSearchCV(pipeline, cv = cv1, param_grid = parameters, scoring = 'roc_auc', n_jobs=-1, return_train_score=False)

cv1 = RepeatedKFold(n_splits=2, …

pipeline nested scikit-learn cross-validation grid-search

ina*_*tos

2018-09-03

1 vote

1 answer

1105 views

Tag statistics

pipeline ×10

powershell ×3

python ×2

scikit-learn ×2

copy-item ×1

cross-validation ×1

devops ×1

digital-ocean ×1

foreach-object ×1

github-actions ×1

gitlab ×1

gitlab-ci ×1

grid-search ×1

javascript ×1

logstash ×1

machine-learning ×1

manual ×1

nested ×1

normalization ×1

parameter-passing ×1

parameter-sets ×1

positional-parameter ×1

schedule ×1

snakemake ×1

tee ×1

whatwg-streams-api ×1

workflow ×1
