Arn*_*nav 9 scala apache-spark rdd
How can I print only the elements of one specific partition, say the 5th?
val distData = sc.parallelize(1 to 50, 10)
Using Spark/Scala:
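val data = 1 to 50
val distData = sc.parallelize(data, 10)
// Partition indices are 0-based: keep only the partition with index 5 and drop the rest.
val selected = distData.mapPartitionsWithIndex(
  (index: Int, it: Iterator[Int]) => if (index == 5) it else Iterator.empty[Int])
// Collect to the driver before printing, so the output does not end up in executor logs.
selected.collect().foreach(println)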
Output:
26
27
28
29
30
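The values 26 to 30 follow from how the 50 elements are split: the partition index is 0-based, so index 5 is the sixth slice. Here is a minimal sketch of that arithmetic in plain Scala, assuming `parallelize` cuts a local sequence into contiguous, near-equal slices (which is consistent with the output above). `partitionSlice` is a hypothetical helper for illustration, not a Spark API.

```scala
// Hypothetical helper (not part of Spark): returns the elements that would land
// in partition `index` if `data` is cut into `numSlices` contiguous slices.
def partitionSlice[T](data: Seq[T], numSlices: Int, index: Int): Seq[T] = {
  require(numSlices > 0 && index >= 0 && index < numSlices, "index out of range")
  val n = data.length.toLong
  val start = ((index * n) / numSlices).toInt       // inclusive start offset
  val end = (((index + 1) * n) / numSlices).toInt   // exclusive end offset
  data.slice(start, end)
}

// 50 elements in 10 slices: index 5 covers offsets 25 until 30.
val slice5 = partitionSlice(1 to 50, 10, 5)
println(slice5.mkString(", ")) // prints "26, 27, 28, 29, 30"
```

If "the 5th partition" is meant in 1-based terms, index 4 would be the one to use, which under the same assumption holds 21 to 25.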