I'm trying to connect to Google Sheets from Airflow using a PythonOperator, as follows:
import pandas as pd
import pygsheets
from google.oauth2 import service_account
from airflow.operators.python import PythonOperator

def establish_conn_to_gs():
    # Build service-account credentials and authorize a pygsheets client
    creds = service_account.Credentials.from_service_account_file(
        'service_account_json_file',
        scopes=('google_api_spreadsheets_auth_link', 'google_api_gdrive_auth_link'),
        subject='client_mail'
    )
    pg = pygsheets.authorize(custom_credentials=creds)
    return pg

def get_data_from_spreadsheet(spreadsheet_link, worksheet_title):
    pg = establish_conn_to_gs()
    # Use the function arguments rather than string literals
    doc = pg.open_by_url(spreadsheet_link)
    data = doc.worksheet_by_title(worksheet_title).get_all_values(include_tailing_empty_rows=False)
    return data

get_data_from_gs = PythonOperator(
    task_id='get_data_from_gs',
    # Pass the callable itself; Airflow calls it with op_args when the task runs
    python_callable=get_data_from_spreadsheet,
    op_args=[link, title],
)
This works fine, but are there other ways to do the same thing? I found the Google Sheets operator, but its current documentation isn't very good :(
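For reference, here is a rough sketch of what the provider-based route might look like. It assumes the apache-airflow-providers-google package is installed and that an Airflow connection (here the default id google_cloud_default) holds the service account credentials. The helper names spreadsheet_id_from_url and get_data_via_hook are hypothetical, and passing the bare worksheet title as the range is an assumption that may need single quotes around titles containing spaces. It has not been checked against a live sheet.

```python
import re


def spreadsheet_id_from_url(url):
    """Pull the spreadsheet ID out of a Google Sheets URL (hypothetical helper)."""
    match = re.search(r"/spreadsheets/d/([a-zA-Z0-9_-]+)", url)
    if match is None:
        raise ValueError(f"Not a Google Sheets URL: {url!r}")
    return match.group(1)


def get_data_via_hook(spreadsheet_link, worksheet_title,
                      gcp_conn_id="google_cloud_default"):
    """Read all values of one worksheet through GSheetsHook (hypothetical helper)."""
    # Imported lazily so the module still loads where the Google provider is absent
    from airflow.providers.google.suite.hooks.sheets import GSheetsHook

    hook = GSheetsHook(gcp_conn_id=gcp_conn_id)
    # Assumption: a bare sheet title as the A1 range selects the whole worksheet
    return hook.get_values(
        spreadsheet_id=spreadsheet_id_from_url(spreadsheet_link),
        range_=worksheet_title,
    )
```

get_data_via_hook could then be handed to a PythonOperator as python_callable, with the link and title supplied through op_args, so the credentials would come from the Airflow connection instead of a JSON file path in the DAG code.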
Thanks for the help!