Spark py4j.protocol.Py4JJavaError: An error occurred while calling o718.showString

sen*_*ali 6 python pyspark spark-dataframe

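I am new to Spark. I am running the Python API on Spark (PySpark) to build a model on a Cloudera cluster.

I created a batch file to submit the job. The job runs successfully except for the last step, which displays the DataFrame result: 'step3_final.show()' raises an error.

Please find below the error message I received in the log: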

step3_final.show(6)
  File "/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p1876.1944/lib/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 257, in show
  File "/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p1876.1944/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
  File "/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p1876.1944/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 45, in deco
  File "/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p1876.1944/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o718.showString.

Can anyone help me understand the error message? Thanks in advance.

Ank*_*deo 0

This may be caused by an encoding issue. Just add this:

# -*- coding: utf-8 -*- 

at the top of your Python script if you are running the solution with spark-submit; otherwise use:

import sys
# Python 2 only: sys.setdefaultencoding() is removed at startup, so it does not exist here yet.
reload(sys)  # Reloading sys makes setdefaultencoding available again.
sys.setdefaultencoding('UTF8')
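Note that the reload(sys) workaround only exists on Python 2; on Python 3 the built-in reload is gone and the default encoding is already UTF-8. A minimal sketch of a version-guarded variant follows. The helper name ensure_utf8_default_encoding is hypothetical, not part of PySpark or the standard library.

```python
import sys


def ensure_utf8_default_encoding():
    """Apply the Python 2 default-encoding workaround; do nothing on Python 3.

    Hypothetical helper for illustration; returns the resulting default encoding.
    """
    if sys.version_info[0] < 3:
        # Python 2 only: site.py removes sys.setdefaultencoding at startup,
        # and reloading the sys module makes it available again.
        reload(sys)  # noqa: F821 (reload is a builtin in Python 2)
        sys.setdefaultencoding("utf-8")
    return sys.getdefaultencoding()


print(ensure_utf8_default_encoding())
```

Calling it once at the top of the driver script, before any DataFrame is shown, keeps the same script usable on either interpreter.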

Please let me know if this helps.
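If the encoding change does not help, it may be worth looking at the Java-side cause, since the traceback in the question stops at the Python wrapper and only says that the call to showString failed. The sketch below assumes the raised exception is a py4j Py4JJavaError, which carries the underlying Java throwable in its java_exception attribute. The helper name describe_java_error is hypothetical.

```python
def describe_java_error(exc):
    """Return the Java-side description of an exception when one is attached.

    Assumes py4j's Py4JJavaError, which stores the Java throwable in
    `java_exception`; any other exception falls back to str(exc).
    """
    java_exc = getattr(exc, "java_exception", None)
    if java_exc is None:
        return str(exc)
    # toString() is forwarded to the Java object and typically gives the
    # exception class and message, e.g. the real reason show() failed.
    return java_exc.toString()


# Possible usage around the failing step (step3_final comes from the question):
# try:
#     step3_final.show(6)
# except Exception as exc:
#     print(describe_java_error(exc))
#     raise
```

The full Java stack trace usually also appears further down in the driver or YARN logs, below the Python traceback.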