Aym*_*hal 6 scala apache-spark apache-spark-ml
I am using KMeans as my clustering algorithm. When I run my code, it fails with this error:
org.apache.spark.SparkException: Failed to execute user defined function(VectorAssembler$$Lambda$1525/671078904: (struct<latitude:double,longitude:double>) => struct<type:tinyint,size:int,indices:array<int>,values:array<double>>)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
The DataFrame code is not preserved in this copy of the question; the stack trace above appears in its place.
Printing the schema works, but as soon as I call show, I get the error.
tri*_*ta2 12
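This question is old, but I just ran into the same problem in pyspark.
I think the error is caused by null values in the data. Calling fillna() on my columns before using VectorAssembler resolved it.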