Given my pyspark Row object:
>>> row
Row(clicked=0, features=SparseVector(7, {0: 1.0, 3: 1.0, 6: 0.752}))
>>> row.clicked
0
>>> row.features
SparseVector(7, {0: 1.0, 3: 1.0, 6: 0.752})
>>> type(row.features)
<class 'pyspark.ml.linalg.SparseVector'>
However, row.features fails the isinstance(row.features, Vector) test.
>>> isinstance(SparseVector(7, {0: 1.0, 3: 1.0, 6: 0.752}), Vector)
True
>>> isinstance(row.features, Vector)
False
>>> isinstance(deepcopy(row.features), Vector)
False
This strange behavior is causing me a lot of trouble. Unless isinstance(row.features, Vector) passes, I cannot use map to generate LabeledPoint objects. I would be very grateful if anyone could help solve this problem.
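One thing worth checking, stated as an assumption because the import lines are not shown above: pyspark ships two separate linear algebra packages, pyspark.ml.linalg and pyspark.mllib.linalg, and each defines its own Vector and SparseVector. If Vector and SparseVector in the session were imported from pyspark.mllib.linalg while row.features is a pyspark.ml.linalg.SparseVector (as the type() output indicates), the isinstance test would return False even though the class names look identical. The sketch below is a small diagnostic helper that compares fully qualified class names. The helper names (qualified_name, explain_isinstance, make_stand_ins) and the stand-in classes are hypothetical, used only so the snippet runs without a Spark installation.

```python
def qualified_name(cls):
    """Return 'module.ClassName' for a class."""
    return f"{cls.__module__}.{cls.__qualname__}"


def explain_isinstance(obj, expected_cls):
    """Report the isinstance result plus the fully qualified names involved."""
    return {
        "is_instance": isinstance(obj, expected_cls),
        "expected": qualified_name(expected_cls),
        # Every class in the object's method resolution order, fully qualified.
        "actual_mro": [qualified_name(c) for c in type(obj).__mro__],
    }


def make_stand_ins(module_name):
    """Build hypothetical Vector/SparseVector stand-ins tagged with a module name.

    These are NOT the real pyspark classes; they only mimic the situation of
    two same-named class hierarchies living in different modules.
    """
    vector = type("Vector", (object,), {"__module__": module_name})
    sparse = type("SparseVector", (vector,), {"__module__": module_name})
    return vector, sparse


MlVector, MlSparseVector = make_stand_ins("pyspark.ml.linalg")
MllibVector, MllibSparseVector = make_stand_ins("pyspark.mllib.linalg")

# An "ml" sparse vector tested against the "mllib" Vector base class.
report = explain_isinstance(MlSparseVector(), MllibVector)
print(report)
```

With the real objects, the same check would be explain_isinstance(row.features, Vector). If the module names differ, two options seem plausible: import Vector from pyspark.ml.linalg for the test, or convert the value with pyspark.mllib.linalg.Vectors.fromML(row.features) (available since Spark 2.0) before building a LabeledPoint, since pyspark.mllib.regression.LabeledPoint works with mllib vectors.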
apache-spark apache-spark-sql pyspark apache-spark-ml apache-spark-mllib