Non*_*one 11 apache-spark pyspark
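How can we get the top 10 recommended products for every user in PySpark? I know there are methods such as recommendProducts, which recommends products for a single user, and predictAll, which predicts ratings for {user, item} pairs. But is there an efficient way to output the top 10 items per user for all users?
I wrote the function below. For each partition it multiplies the user feature matrix by the broadcast product feature matrix, reads off the predicted rating of every product for each user, sorts the products by rating, and outputs a list of 8 recommended products per user.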
import numpy as np

# Collect the product feature matrix on the driver
productFeatures = bestModel.productFeatures().collect()
productArray = []
productFeaturesArray = []
for x in productFeatures:
    productArray.append(x[0])
    productFeaturesArray.append(x[1])
matrix = np.matrix(productFeaturesArray)
# Broadcast the product ids and the transposed (rank x products) feature matrix
productArrayBroadCast = sc.broadcast(productArray)
productFeaturesArraybroadcast = sc.broadcast(matrix.T)
def func(iterator):
    # Gather the user ids and feature vectors of this partition
    userFeaturesArray = []
    userArray = []
    for x in iterator:
        userArray.append(x[0])
        userFeaturesArray.append(x[1])
    if not userArray:
        return []  # empty partition: nothing to multiply
    userFeatureMatrix = np.matrix(userFeaturesArray)
    # (users x rank) * (rank x products) -> predicted rating for every pair
    userRecommendationArray = userFeatureMatrix * productFeaturesArraybroadcast.value
    mappedUserRecommendationArray = []
    # Extract ratings from the matrix
    for i in range(len(userArray)):
        ratingdict = {}
        for j in range(len(productArrayBroadCast.value)):
            ratingdict[str(productArrayBroadCast.value[j])] = userRecommendationArray.item((i, j))
        # Take the top 8 recommendations for the user, highest rating first
        sort_apps = sorted(ratingdict, key=ratingdict.get, reverse=True)[:8]
        sort_apps = '|'.join(sort_apps)
        mappedUserRecommendationArray.append((userArray[i], sort_apps))
    return mappedUserRecommendationArray
recommendations = bestModel.userFeatures().repartition(2000).mapPartitions(func)
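
The per-user dictionary and sort inside func can also be done directly on the score matrix. Below is a minimal NumPy-only sketch of that step, not the code from the post: `top_n_products` and its parameters are hypothetical names, and it assumes the user and product feature vectors of one partition fit in memory as dense arrays.

```python
import numpy as np

def top_n_products(user_features, product_features, product_ids, n=8):
    """Return, for each user row, the ids of the n highest-scoring products.

    Hypothetical helper for illustration only.
    """
    user_features = np.asarray(user_features, dtype=float)
    product_features = np.asarray(product_features, dtype=float)
    product_ids = np.asarray(product_ids)

    # (users x rank) @ (rank x products) -> predicted rating for every pair
    scores = user_features @ product_features.T

    # Never ask for more products than exist
    n = min(n, scores.shape[1])

    # argpartition selects the n largest columns per row without a full sort
    top = np.argpartition(-scores, n - 1, axis=1)[:, :n]

    # Order those n columns by descending score
    rows = np.arange(scores.shape[0])[:, None]
    order = np.argsort(-scores[rows, top], axis=1)
    top = top[rows, order]

    return product_ids[top]

# Toy data: 2 users, 3 products, rank 2
users = [[1.0, 0.0], [0.0, 1.0]]
products = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
ids = [101, 102, 103]
result = top_n_products(users, products, ids, n=2)
```

Inside func, a helper like this could stand in for the nested loops, with the rows of the returned array joined with '|' per user. Selecting the n best columns first avoids sorting the full product list for every user, which may matter when the product catalogue is large.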