sch*_*oon 7 scala neural-network apache-spark
I am using Spark's MultilayerPerceptronClassifier. This generates a "prediction" column in "predictions". When I try to show it, I get the error:
SparkException: Failed to execute user defined function($anonfun$1: (vector) => double) ...
Caused by: java.lang.IllegalArgumentException: requirement failed: A & B Dimension mismatch!
Other columns, for example "vector", display fine. Part of the predictions schema:
|-- vector: vector (nullable = true)
|-- prediction: double (nullable = true)
My code is:
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.MultilayerPerceptronClassifier
import org.apache.spark.ml.feature.{StringIndexer, Word2Vec}

// racist is boolean, needs to be a string for StringIndexer:
val train2 = train.withColumn("racist", 'racist.cast("String"))
val test2 = test.withColumn("racist", 'racist.cast("String"))

val indexer = new StringIndexer().setInputCol("racist").setOutputCol("indexracist")
val word2Vec = new Word2Vec().setInputCol("lemma").setOutputCol("vector") //.setVectorSize(3).setMinCount(0)
val layers = Array[Int](4, 5, 2)
val mpc = new MultilayerPerceptronClassifier().setLayers(layers).setBlockSize(128).setSeed(1234L).setMaxIter(100).setFeaturesCol("vector").setLabelCol("indexracist")

val pipeline = new Pipeline().setStages(Array(indexer, word2Vec, mpc))
val model = pipeline.fit(train2)
val predictions = model.transform(test2)
predictions.select("prediction").show()
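Note for readers: the first element of layers has to match the width of the features column that the classifier receives, and Word2Vec produces 100-dimensional vectors by default when no vector size is set. A minimal diagnostic sketch, reusing the names from the code above (this is an illustration of the likely cause, not part of the original question):

import org.apache.spark.ml.linalg.Vector

// The "vector" column displays fine, so its width can be inspected directly:
val featureDim = predictions.select("vector").head().getAs[Vector](0).size
println(s"feature dimension = $featureDim, layers(0) = ${layers(0)}")
// With Word2Vec's default vectorSize (100) this will not equal layers(0) = 4,
// which is what the "A & B Dimension mismatch" check inside the prediction UDF reports.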
EDIT: the problem in the similar question that was suggested was
val layers = Array[Int](0, 0, 0, 0)
which is not the case here, and it is not the same error either.
EDIT again: part0 of the train and test data, saved in parquet format, can be found here.
Adding .setVectorSize(3).setMinCount(0) and changing to val layers = Array[Int](3, 5, 2) made it work:
val word2Vec = new Word2Vec().setInputCol("lemma").setOutputCol("vector").setVectorSize(3).setMinCount(0)

// specify layers for the neural network:
// input layer of size 3 (the Word2Vec vector size), one intermediate layer of size 5,
// and output of size 2 (classes)
val layers = Array[Int](3, 5, 2)
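In other words, the first layer size must equal the feature vector dimension set on Word2Vec, and the last layer size must equal the number of label classes. A small sketch of deriving both values from the data instead of hard-coding them (same DataFrames and column names as above; an illustration rather than part of the original answer):

// Sketch: derive the endpoint layer sizes instead of hard-coding them.
val vectorSize = 3                                                  // must match Word2Vec's setVectorSize(3)
val numClasses = train2.select("racist").distinct().count().toInt   // 2 here: "true"/"false"
val layers = Array[Int](vectorSize, 5, numClasses)                  // hidden layer of 5 kept from above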