Post by pel*_*tor

NumPy exception when using MLlib even though NumPy is installed

This is the code I am trying to run:

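from pyspark.mllib.recommendation import ALS

# ALS hyperparameters
iterations = 5
lambdaALS = 0.1  # regularization parameter
seed = 5L        # Python 2 long literal; write 5 on Python 3
rank = 8         # number of latent factors

# trainingRDD (not shown) must be an RDD of Rating or (user, product, rating) tuples
model = ALS.train(trainingRDD, rank, iterations, lambda_=lambdaALS, seed=seed)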

When I run the final line (the ALS.train call), which depends on NumPy, the Py4J library that Spark uses throws the following error:

Py4JJavaError: An error occurred while calling o587.trainALSModel.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 67.0 failed 4 times, most recent failure: Lost task 0.3 in stage 67.0 (TID 195, 192.168.161.55): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/home/platform/spark/python/lib/pyspark.zip/pyspark/worker.py", line 98, in main
    command = pickleSer._read_with_length(infile)
  File "/home/platform/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 164, in _read_with_length
    return self.loads(obj)
  File "/home/platform/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 421, in …
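The traceback above is cut off before the actual exception, so the cause is not confirmed here. One common cause of this kind of failure is that NumPy is installed for the driver's Python but not for the interpreter the workers use. A minimal sketch for checking that, assuming an existing SparkContext named sc (the function name probe_numpy is hypothetical, not part of PySpark):

```python
import socket
import sys


def probe_numpy(_=None):
    """Report the host, the Python interpreter, and the NumPy version (or None)."""
    try:
        import numpy
        version = numpy.__version__
    except ImportError:
        version = None  # NumPy is not importable from this interpreter
    return (socket.gethostname(), sys.executable, version)


# Driver-side check: runs in the local Python process.
print(probe_numpy())

# Worker-side check (assumption: `sc` is a live SparkContext).
# Each distinct tuple describes one host/interpreter combination that ran a task.
# print(sc.parallelize(range(100), 10).map(probe_numpy).distinct().collect())
```

If any worker tuple shows None in the last position, NumPy needs to be installed for that interpreter, or PYSPARK_PYTHON needs to point at an interpreter that has it.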

python numpy apache-spark pyspark apache-spark-mllib
