Airflow Kubernetes Pods exception error - (404) Reason: Not Found

Mad*_*lid 7 kubernetes airflow airflow-scheduler kubernetes-pod

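I am looking for help debugging this Airflow KubernetesPodOperator issue. We get this error at random when Airflow tasks run. The job is close to finishing, and at the very end of its execution a "pods not found" exception is thrown. The Airflow task is a Python job that has in fact already completed its work, but because of this exception Airflow marks the job as failed.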


ERROR - (404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'd4df122xx-bxcb-42f2-8c9e-768e9bbb00x9', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': 'xxxx-xxx-xxx-xxxxxxxx', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'xxxx-xxx-xxx-xxxxxxxx', 'Date': 'Sat, 17 Jul 2021 02:10:07 GMT', 'Content-Length': '258'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"xxxx.6cb9f2cc66d0455c882cb5bae007ae84\" not found","reason":"NotFound","details":{"name":"xxx.6cb9f2cc66d0455c882cb5bae007ae84","kind":"pods"},"code":404}

We do store detailed logs in an Elasticsearch index, but there are no log entries from that particular time that explain why Airflow could not find the pod for this running job.
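
To line this error up with other log sources, it can help to pull the pod name, reason, and status code out of the HTTP response body. The following is only a minimal sketch using the Python standard library; the function names summarize_status and is_pod_not_found are hypothetical helpers made up for illustration, not part of Airflow or the Kubernetes client, and the body string is copied from the (redacted) error above.

```python
import json

# Response body copied from the error above; the pod name is redacted in the original.
BODY = r'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"xxxx.6cb9f2cc66d0455c882cb5bae007ae84\" not found","reason":"NotFound","details":{"name":"xxx.6cb9f2cc66d0455c882cb5bae007ae84","kind":"pods"},"code":404}'


def summarize_status(body: str) -> dict:
    """Hypothetical helper: extract the fields of a Kubernetes Status
    object that are useful for searching other logs."""
    status = json.loads(body)
    details = status.get("details") or {}
    return {
        "code": status.get("code"),
        "reason": status.get("reason"),
        "kind": details.get("kind"),
        "name": details.get("name"),
        "message": status.get("message"),
    }


def is_pod_not_found(summary: dict) -> bool:
    """Hypothetical helper: True when the API server reported that a pod
    object no longer exists (404 NotFound on kind 'pods')."""
    return (
        summary["code"] == 404
        and summary["reason"] == "NotFound"
        and summary["kind"] == "pods"
    )


summary = summarize_status(BODY)
print(summary["name"], is_pod_not_found(summary))
```

The extracted pod name, together with the Date header in the response (Sat, 17 Jul 2021 02:10:07 GMT), gives a concrete key and time window to search for in Elasticsearch or in cluster-side logs.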

Could someone with Airflow and Kubernetes expertise point us in the right direction for investigating and resolving this issue?