
Tuning the parameters of an implicit pyspark.ml ALS matrix factorization model with pyspark.ml CrossValidator

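I'm trying to tune the parameters of an ALS matrix factorization model that uses implicit data. To do this, I'm using pyspark.ml.tuning.CrossValidator to run through a parameter grid and select the best model. I believe my problem lies in the evaluator, but I can't figure it out.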

I can get this to work for an explicit data model with the regression RMSE evaluator, as follows:

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext
from pyspark.ml.recommendation import ALS
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.evaluation import RegressionEvaluator

from pyspark.sql.functions import rand


conf = SparkConf() \
  .setAppName("MovieLensALS") \
  .set("spark.executor.memory", "2g")
sc = SparkContext(conf=conf)

sqlContext = SQLContext(sc)

dfRatings = sqlContext.createDataFrame([(0, 0, 4.0), (0, 1, 2.0), (1, 1, 3.0), (1, 2, 4.0), (2, 1, 1.0), (2, 2, 5.0)],
                                 ["user", "item", "rating"])
dfRatingsTest = sqlContext.createDataFrame([(0, 0), (0, 1), (1, 1), (1, 2), (2, …
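To make the two moving parts mentioned above concrete (a parameter grid and an RMSE evaluator), here is a minimal pure-Python sketch of what each one amounts to. It does not call Spark: `param_grid` and `rmse` are hypothetical helper names, not pyspark APIs. The labels are the six ratings from `dfRatings`, while the `rank` and `regParam` values and the constant predictions are made up purely for illustration.

```python
import itertools
import math


def param_grid(**choices):
    """Expand lists of candidate values into every parameter combination."""
    names = sorted(choices)
    return [
        dict(zip(names, combo))
        for combo in itertools.product(*(choices[name] for name in names))
    ]


def rmse(labels, predictions):
    """Root-mean-square error between observed labels and predictions."""
    if not labels or len(labels) != len(predictions):
        raise ValueError("labels and predictions must be non-empty and equal in length")
    squared_errors = [(y - p) ** 2 for y, p in zip(labels, predictions)]
    return math.sqrt(sum(squared_errors) / len(labels))


# Hypothetical candidate values; the question's actual grid is not shown.
grid = param_grid(rank=[5, 10], regParam=[0.01, 0.1])

# The six ratings from dfRatings, in row order.
labels = [4.0, 2.0, 3.0, 4.0, 1.0, 5.0]
# Made-up predictions (a constant guess), only to exercise the metric.
predictions = [3.0, 3.0, 3.0, 3.0, 3.0, 3.0]

score = rmse(labels, predictions)
```

A cross-validator computes a score like this for each entry of `grid` on held-out folds and keeps the combination with the lowest average RMSE. With `implicitPrefs=True`, Spark's ALS treats the input values as confidence in a preference rather than as ratings to reproduce, so an RMSE against the raw values may not be a meaningful score. That is one possible reason the evaluator is the sticking point, not something the question confirms.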

python apache-spark pyspark apache-spark-ml

Score: 11, Answers: 1, Views: 6401
