Posts by Nas*_*ood

Pyspark ML - How to save a pipeline and a RandomForestClassificationModel

I am unable to save a random forest model generated with the ml package of python/spark.

>>> rf = RandomForestClassifier(labelCol="label", featuresCol="features")
>>> pipeline = Pipeline(stages=early_stages + [rf])
>>> model = pipeline.fit(trainingData)
>>> model.save("fittedpipeline")

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'PipelineModel' object has no attribute 'save'

>>> rfModel = model.stages[8]
>>> print(rfModel)

RandomForestClassificationModel (uid=rfc_46c07f6d7ac8) with 20 trees

>>> rfModel.save("rfmodel")

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'RandomForestClassificationModel' object has no attribute 'save'

I also tried passing 'sc' as the first argument to the save method.

apache-spark pyspark apache-spark-mllib

Nas*_*ood

2017 07-08

5 votes

2 answers

20k views
