How to retrieve a value from an Airflow XCom pushed by SSHExecuteOperator

I have the following DAG with two SSHExecuteOperator tasks. The first task executes a stored procedure that returns a parameter. The second task needs that parameter as an input.

How can I pull the value that task1 pushed to XCom so that I can use it in task2?

from airflow import DAG
from datetime import datetime, timedelta
from airflow.contrib.hooks.ssh_hook import SSHHook
from airflow.contrib.operators.ssh_execute_operator import SSHExecuteOperator
from airflow.models import Variable

default_args = {
  'owner': 'airflow',
  'depends_on_past': False,
  'start_date': datetime.now(),
  'email': ['my@email.com'],
  'email_on_failure': True,
  'retries': 0
}

#server must be changed to point to the correct environment, to do so update DataQualitySSHHook variable in Airflow admin
DataQualitySSHHook = Variable.get('DataQualitySSHHook')
print('Connecting to: ' + DataQualitySSHHook)
sshHookEtl = SSHHook(conn_id=DataQualitySSHHook)
sshHookEtl.no_host_key_check = True 

#create dag
dag = DAG(
  'ed_data_quality_test-v0.0.3', #update version whenever you change something
  default_args=default_args,
  schedule_interval="0 0 * * *",
  dagrun_timeout=timedelta(hours=24),
  max_active_runs=1)

#create tasks
task1 = SSHExecuteOperator(
  task_id='run_remote_sp_audit_batch_register',
  bash_command="bash /opt/scripts/data_quality/EXEC_SP_AUDIT_BATCH.sh 'ED_DATA_QUALITY_MANUAL' 'REGISTER' '1900-01-01 00:00:00.000000' '2999-12-31 00:00:00.000000' ", #keep the space at the end
  ssh_hook=sshHookEtl,
  xcom_push=True,
  retries=0,
  dag=dag)

task2 = SSHExecuteOperator(
  task_id='run_remote_sp_audit_module_session_start',
  bash_command="echo {{ ti.xcom_pull(task_ids='run_remote_sp_audit_batch_register') }}",
  ssh_hook=sshHookEtl,
  retries=0,
  dag=dag)

#create dependencies
task1.set_downstream(task2)

Answer:

So the solution I found is that when task1 runs the shell script, you have to make sure the parameter you want captured in the XCom variable is the last thing the script prints (using echo).
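
For illustration, here is a minimal sketch of how the remote script can be structured. The real EXEC_SP_AUDIT_BATCH.sh is not shown in the question, so the run_sp stub and the sample batch id below are placeholders; only the output ordering matters.

#!/bin/bash
# Hypothetical sketch of the remote script (the actual EXEC_SP_AUDIT_BATCH.sh is not
# shown in the question). Only the ordering of the output matters for the XCom.

run_sp() {
    # Placeholder for the real stored-procedure call; it returns the parameter on stdout.
    printf '%s\n' "20170101-0001"
}

BATCH_ID=$(run_sp "$@")

# Send any logging to stderr so it does not become part of the captured output.
echo "stored procedure finished, batch id = ${BATCH_ID}" >&2

# The value that should end up in the XCom must be the LAST line written to stdout.
echo "${BATCH_ID}"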

I was then able to retrieve the XCom variable's value with the following snippet:

{{ task_instance.xcom_pull(task_ids='run_remote_sp_audit_batch_register') }}
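
Applied to the DAG above, task2 can then feed that value straight into its own remote call through the templated bash_command. This is only a sketch: the EXEC_SP_AUDIT_MODULE_SESSION_START.sh script and its argument are assumptions, everything else mirrors the code in the question.

# task2 passes the value pushed by task1 (the last line its script echoed) as an
# argument to its own remote script. Airflow renders the Jinja template before the
# command is sent over SSH. The script name below is an assumption.
task2 = SSHExecuteOperator(
  task_id='run_remote_sp_audit_module_session_start',
  bash_command=(
    "bash /opt/scripts/data_quality/EXEC_SP_AUDIT_MODULE_SESSION_START.sh "
    "'{{ task_instance.xcom_pull(task_ids='run_remote_sp_audit_batch_register') }}' "
  ),
  ssh_hook=sshHookEtl,
  retries=0,
  dag=dag)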