PySpark raises TypeError: can't pickle _abc_data objects

Fra*_*vas 3 python pyspark

I am trying to generate predictions from a pickled model with pyspark. I load the model with the following command:

model = deserialize_python_object(filename)

deserialize_python_object(filename) is defined as:

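import pickle

def deserialize_python_object(filename):
    """Load a pickled object from filename; return None if loading fails."""
    try:
        with open(filename, 'rb') as f:
            obj = pickle.load(f)
    except Exception:
        # Any failure (missing file, corrupt pickle, ...) yields None.
        obj = None
    return obj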

The error log looks like this:

  File "/Users/gmg/anaconda3/envs/env/lib/python3.7/site-packages/pyspark/sql/udf.py", line 189, in wrapper
    return self(*args)
  File "/Users/gmg/anaconda3/envs/env/lib/python3.7/site-packages/pyspark/sql/udf.py", line 167, in __call__
    judf = self._judf
  File "/Users/gmg/anaconda3/envs/env/lib/python3.7/site-packages/pyspark/sql/udf.py", line 151, in _judf
    self._judf_placeholder = self._create_judf()
  File "/Users/gmg/anaconda3/envs/env/lib/python3.7/site-packages/pyspark/sql/udf.py", line 160, in _create_judf
    wrapped_func = _wrap_function(sc, self.func, self.returnType)
  File "/Users/gmg/anaconda3/envs/env/lib/python3.7/site-packages/pyspark/sql/udf.py", line 35, in _wrap_function
    pickled_command, broadcast_vars, env, includes = _prepare_for_python_RDD(sc, command)
  File "/Users/gmg/anaconda3/envs/env/lib/python3.7/site-packages/pyspark/rdd.py", line 2420, in _prepare_for_python_RDD
    pickled_command = ser.dumps(command)
  File "/Users/gmg/anaconda3/envs/env/lib/python3.7/site-packages/pyspark/serializers.py", line 600, in dumps
    raise pickle.PicklingError(msg)
_pickle.PicklingError: Could not serialize object: TypeError: can't pickle _abc_data objects

Gon*_*cia 7

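It looks like you are hitting the same problem as this issue: https://github.com/cloudpipe/cloudpickle/issues/180

What is happening is that the cloudpickle library bundled with pyspark is outdated for Python 3.7. Until pyspark updates that module, you can work around the problem with the monkey patch below.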
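To see where the `_abc_data` in the error message comes from, here is a minimal sketch using only the standard library. It assumes CPython 3.7 or later, where (as an implementation detail, not a documented API) every class built with `abc.ABC` carries an internal `_abc_impl` attribute of type `_abc_data`. The class `Shape` and the helper `try_pickle` are made-up names for illustration. A serializer that tries to copy every class attribute by value, as the outdated bundled cloudpickle apparently does, would trip over this object.

```python
import abc
import pickle


class Shape(abc.ABC):
    """Hypothetical abstract class, used only for this illustration."""

    @abc.abstractmethod
    def area(self):
        ...


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the error as a string."""
    try:
        pickle.dumps(obj)
        return None
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"


# `_abc_impl` is a CPython implementation detail (assumption: CPython >= 3.7).
# On interpreters without it, getattr returns None, which pickles fine.
internal_state = getattr(Shape, "_abc_impl", None)
message = try_pickle(internal_state)
print(message)
```

On CPython the printed message should be a TypeError naming `_abc_data`, the same object type that appears at the end of the traceback in the question. The exact wording differs between Python versions.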

Try this workaround:

  1. Install cloudpickle: pip install cloudpickle

  2. Add this to your code:

import cloudpickle
import pyspark.serializers
pyspark.serializers.cloudpickle = cloudpickle
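If you want the same script to run on machines where one of the two packages may be missing, the patch above could be wrapped in a small guard. This is only a sketch: the function name `patch_pyspark_cloudpickle` is my own, and it assumes that the patch is applied before any UDF is created, since the traceback shows the serializer being invoked when the UDF is wrapped.

```python
def patch_pyspark_cloudpickle():
    """Replace pyspark's bundled cloudpickle with the pip-installed one.

    Returns True if the patch was applied, False if pyspark or
    cloudpickle could not be imported (in which case nothing changes).
    """
    try:
        import cloudpickle
        import pyspark.serializers
    except ImportError:
        return False

    # Same assignment as the workaround above: code in pyspark.serializers
    # that looks up the module-level name `cloudpickle` now gets the
    # pip-installed module instead of the bundled copy.
    pyspark.serializers.cloudpickle = cloudpickle
    return True


# Call this once, near the top of the driver script, before defining UDFs.
patched = patch_pyspark_cloudpickle()
print("cloudpickle patch applied:", patched)
```

Once pyspark ships an updated cloudpickle, the call can simply be removed.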

Credit for the monkey patch: https://github.com/cloudpipe/cloudpickle/issues/305