AWS Glue job dependency in Step Functions

Ash*_*Ash 2 amazon-web-services aws-step-functions aws-glue

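I created two Glue jobs (gluejob1, gluejob2).

I want to create a dependency such that gluejob2 runs only after gluejob1 has completed.

To orchestrate this, I created a step function with the following definition: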

 {
  "gluejob1": {
    "Type": "Task",
    "Resource": "gluejob1.Arn",
    "Comment": "Glue job1.",
    "Next": "gluejob2"
  },

  "gluejob2": {
    "Type": "Task",
    "Resource": "gluejob2.Arn",
    "Comment": "Glue job2.",
    "Next": "Gluejob2 Finished Loading"
  },
  "Gluejob2 Finished Loading": {
    "Type": "Pass",
    "Result": "",
    "End": true
  }
}

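When I execute this step function, it reports success the moment it triggers gluejob1 and moves straight on to triggering gluejob2.

I would like to know whether it is possible to run gluejob2 only after gluejob1 has completed.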

Yur*_*ruk 8

You can invoke a Glue job synchronously from Step Functions, so that the state waits for the job to complete:

{
  "StartAt": "gluejob1",
  "States": {
    "gluejob1": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": {
        "JobName": "ETLJobName1"
      },
      "Next": "gluejob2"
    },
    "gluejob2": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": {
        "JobName": "ETLJobName2"
      },
      "Next": "Gluejob2 Finished Loading"
    },
    "Gluejob2 Finished Loading": {
      "Type": "Pass",
      "Result": "",
      "End": true
    }
  }
}
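
A note on the `Parameters` block: in Amazon States Language, a key that ends in `.$` is read as a JSONPath into the state input, while a plain key takes a literal value. If the job name should come from the execution input instead of being hard-coded, a single state might look like the sketch below. The input field name `jobName` is hypothetical, chosen only for illustration:

```json
{
  "Type": "Task",
  "Resource": "arn:aws:states:::glue:startJobRun.sync",
  "Parameters": {
    "JobName.$": "$.jobName"
  },
  "End": true
}
```

With this state, starting the execution with an input such as `{"jobName": "ETLJobName1"}` would run that job and wait for it to finish.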

  • You may need to make sure the step function's role has all the required IAM permissions for the Glue jobs, i.e. glue:StartJobRun, glue:GetJobRun, glue:GetJobRuns and glue:BatchStopJobRun: https://docs.aws.amazon.com/step-functions/latest/dg/glue-iam.html
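
As a rough sketch, a policy statement granting the four permissions listed above to the state machine's execution role could look like this. The wildcard `Resource` is an assumption made to keep the example short; in practice you would probably scope it down to the specific job ARNs:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:StartJobRun",
        "glue:GetJobRun",
        "glue:GetJobRuns",
        "glue:BatchStopJobRun"
      ],
      "Resource": "*"
    }
  ]
}
```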