Vie*_*nto 1 python post google-cloud-dataflow
I want to launch a Google Dataflow template from Python. So far I have been launching Dataflow templates through the Dataflow REST API or through a Cloud Functions integration. This is the template launch request I run in Postman:
URL: https://dataflow.googleapis.com/v1b3/projects/{{my-project-id}}/templates:launch?gcsPath=gs://{{my-cloud-storage-bucket}}/temp/cloud-dataprep-template
{
    "jobName": "test-datfalow-job",
    "parameters": {
        "inputLocations" : "{\"location1\":\"gs://{{my-cloud-storage-bucket}}/my-folder/**/*\"}",
        "outputLocations": "{\"location1\":\"gs://{{my-cloud-storage-bucket}}/my-output/output.csv\"}"
    },
    "environment": {
        "tempLocation": "gs://{{my-cloud-storage-bucket}}/tmp",
        "zone": "us-central1-f"
    }
}
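I don't know whether this can be done with google-api-python-client, or whether I have to send this HTTP POST myself with Python's requests.post and Google Cloud authentication.
You can do this with the templates launch method of the Dataflow API client library for Python, as follows: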
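import googleapiclient.discovery
# Note: oauth2client is deprecated; it is kept here as in the original answer.
from oauth2client.client import GoogleCredentials

# Placeholders: replace PROJECT_ID, LOCATION, SOME_NAME and PATH_TO_TEMPLATE
# with your own values. `location` is not passed to the call below.
project = PROJECT_ID
location = LOCATION

# Authenticate with Application Default Credentials and build the v1b3 client.
credentials = GoogleCredentials.get_application_default()
dataflow = googleapiclient.discovery.build('dataflow', 'v1b3', credentials=credentials)

# Launch the template stored at gcsPath; the body mirrors the Postman request.
result = dataflow.projects().templates().launch(
    projectId=project,
    body={
        "environment": {
            "zone": "us-central1-f",
            "tempLocation": "gs://{{my-cloud-storage-bucket}}/tmp"
        },
        "parameters": {
            "inputLocations" : "{\"location1\":\"gs://{{my-cloud-storage-bucket}}/my-folder/**/*\"}",
            "outputLocations": "{\"location1\":\"gs://{{my-cloud-storage-bucket}}/my-output/output.csv\"}"
        },
        "jobName": SOME_NAME
    },
    gcsPath=PATH_TO_TEMPLATE
).execute()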
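For the other option raised in the question, sending the POST yourself, the following is a minimal sketch using only the standard library. It builds the same templates:launch request but does not send it. The helper name build_launch_request, the bucket and project names, and the placeholder token are all hypothetical; obtaining a real OAuth access token is not shown and would have to be done separately. The sketch also uses json.dumps for the inputLocations and outputLocations values instead of hand-escaped quotes, since the request in the question passes them as JSON strings.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the Postman request in the question.
LAUNCH_URL = "https://dataflow.googleapis.com/v1b3/projects/{project}/templates:launch"


def build_launch_request(project, gcs_path, job_name, inputs, outputs,
                         temp_location, zone, access_token):
    """Build (but do not send) the templates:launch POST request.

    Hypothetical helper: the argument names are illustrative only.
    """
    url = LAUNCH_URL.format(project=urllib.parse.quote(project, safe=""))
    url += "?" + urllib.parse.urlencode({"gcsPath": gcs_path})
    body = {
        "jobName": job_name,
        "parameters": {
            # Each value is itself a JSON string, as in the question,
            # so the inner mapping is serialized on its own.
            "inputLocations": json.dumps(inputs),
            "outputLocations": json.dumps(outputs),
        },
        "environment": {"tempLocation": temp_location, "zone": zone},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; none of these refer to real resources.
request = build_launch_request(
    project="my-project-id",
    gcs_path="gs://my-bucket/temp/cloud-dataprep-template",
    job_name="test-dataflow-job",
    inputs={"location1": "gs://my-bucket/my-folder/**/*"},
    outputs={"location1": "gs://my-bucket/my-output/output.csv"},
    temp_location="gs://my-bucket/tmp",
    zone="us-central1-f",
    access_token="PLACEHOLDER_TOKEN",
)
# urllib.request.urlopen(request) would send it once a real token is supplied.
```

Building the request separately from sending it makes it easy to inspect the URL and body before any call is made, which helps when comparing against the request that already works in Postman.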