
Error using DeltaTable.forPath in a PySpark job on Cloud Dataproc

I am running some PySpark jobs on a Dataproc cluster. Everything worked fine until yesterday. Today, however, I started getting the error below when reading a Delta table with DeltaTable.forPath(sparkSession, path) in order to update it.

Traceback (most recent call last):
  File "/tmp/job-0eb2543e/cohort_ka.py", line 146, in <module>
    main()
  File "/tmp/job-0eb2543e/cohort_ka.py", line 128, in main
    persisted = DeltaTable.forPath(spark, destination)
  File "/opt/conda/default/lib/python3.8/site-packages/delta/tables.py", line 387, in forPath
    jdt = jvm.io.delta.tables.DeltaTable.forPath(jsparkSession, path, hadoopConf)
  File "/usr/lib/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1304, in __call__
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 111, in deco
  File "/usr/lib/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 330, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling z:io.delta.tables.DeltaTable.forPath. Trace:
py4j.Py4JException: Method forPath([class org.apache.spark.sql.SparkSession, class java.lang.String, class java.util.HashMap]) does not exist
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318) …
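For context, here is a minimal sketch of the kind of job that hits this error. The bucket path, column names, and session settings are placeholders standing in for my actual job, not the exact production code:

    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession, functions as F

    # Delta Lake needs these two settings on the session; the delta-spark
    # Python package must match the Delta JAR version available on the cluster.
    spark = (
        SparkSession.builder
        .appName("cohort_ka")
        .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
        .getOrCreate()
    )

    destination = "gs://my-bucket/tables/cohort_ka"  # placeholder path

    # This is the call that raises the Py4JError shown above.
    persisted = DeltaTable.forPath(spark, destination)

    # Placeholder update; the real job updates different columns.
    persisted.update(
        condition=F.col("processed") == F.lit(False),
        set={"processed": F.lit(True)},
    )

From the trace, the installed delta/tables.py passes a third hadoopConf argument to the JVM-side forPath, and the JVM reports that no forPath(SparkSession, String, HashMap) method exists, which makes me suspect a version mismatch between the delta-spark Python package and the Delta Lake JAR on the cluster.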

google-cloud-dataproc delta-lake

