
Combining Spark Streaming + MLlib

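I am trying to use a random forest model to predict a stream of examples, but it seems I cannot use the model to classify them. This is the code used in pyspark: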

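from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.mllib.tree import RandomForest

sc = SparkContext(appName="App")

# trainingData (an RDD of LabeledPoint), parse, hostname and port are defined elsewhere (not shown)
model = RandomForest.trainClassifier(trainingData, numClasses=2, categoricalFeaturesInfo={}, impurity='gini', numTrees=150)

ssc = StreamingContext(sc, 1)  # 1-second batch interval
lines = ssc.socketTextStream(hostname, int(port))

parsedLines = lines.map(parse)
parsedLines.pprint()

# This is the line that leads to the error below: model.predict is called inside a transformation
predictions = parsedLines.map(lambda event: model.predict(event.features))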


And this is the error returned when it is run on the cluster:

  Error : "It appears that you are attempting to reference SparkContext from a broadcast "
    Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, …
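
The message points at the usual cause: in PySpark, an MLlib tree model wraps a JVM object and holds a reference to the SparkContext, so it cannot be called from inside `map`, which runs on the workers. Below is a minimal sketch of a commonly used workaround, assuming the parsed events expose a `features` attribute as in the code above. The helper name `predict_on_stream` is hypothetical. It uses `DStream.transform` to pass each batch RDD to `model.predict`, which accepts an RDD of feature vectors.

```python
def predict_on_stream(model, parsed_stream):
    """Return a DStream of predictions, one per parsed event.

    Hypothetical helper: `model` is the trained MLlib model and
    `parsed_stream` is the DStream called parsedLines in the question.
    """
    # Extracting the features is a plain transformation; it never touches the model.
    features = parsed_stream.map(lambda event: event.features)
    # transform() invokes this function on the driver once per batch, so the
    # model is given a whole RDD instead of being shipped to the workers.
    return features.transform(lambda rdd: model.predict(rdd))
```

With the names from the question, this would presumably be used as `predictions = predict_on_stream(model, parsedLines)` followed by `predictions.pprint()`, set up before the streaming context is started.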


python apache-spark spark-streaming pyspark apache-spark-mllib

tes*_*ing

2016-04-25
