I'm running Word2Vec in Spark. When it reaches fit(), I only see a single task in the Spark UI, as shown in the attached screenshot.
The job is configured with num-executors = 1000 and executor-cores = 2, and the RDD is coalesced to 2000 partitions. The mapPartitionsWithIndex stage still takes quite a long time. Can this step be distributed across multiple executors or tasks?
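For reference, a minimal sketch of the setup described above, assuming the MLlib `Word2Vec` API and a `spark-shell` session where `sc` is available; the corpus path and hyperparameters are placeholders, not the actual job:

```scala
import org.apache.spark.mllib.feature.Word2Vec
import org.apache.spark.rdd.RDD

// Hypothetical reconstruction of the described setup (names and values are assumptions).
val corpus: RDD[Seq[String]] = sc
  .textFile("hdfs:///path/to/corpus")   // placeholder input path
  .map(_.split(" ").toSeq)              // tokenize each line into a word sequence
  .coalesce(2000)                        // matches the 2000 partitions mentioned above

val model = new Word2Vec()
  .setVectorSize(100)                    // assumed hyperparameter
  .fit(corpus)                           // the fit() call that shows up as a single task
```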
Tags: scala, apache-spark, word2vec, apache-spark-mllib