Asked by Har*_*oed (score 5) · tags: apache-spark, apache-spark-sql, pyspark, apache-spark-2.0
I just built Spark 2 with Hive support and deployed it to a cluster running Hortonworks HDP 2.3.4. However, I found this Spark 2.0.3 to be slower than the standard Spark 1.5.3 that ships with HDP 2.3.
When I check explain, it seems my Spark 2.0.3 is not using Tungsten. Do I need to build a special version to enable Tungsten?
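For reference, here is a minimal PySpark 2.x sketch of how a plan like the ones below can be printed. The table name testing is taken from the HiveTableScan lines in the plans; the DISTINCT-on-id query is an assumption that matches the key-only aggregates shown.

from pyspark.sql import SparkSession

# Hive support is needed so the query resolves against the metastore table.
spark = (SparkSession.builder
         .appName("tungsten-check")
         .enableHiveSupport()
         .getOrCreate())

# Print the physical plan. In Spark 2.x the leading '*' marks operators
# covered by whole-stage code generation (the Tungsten execution engine),
# so its presence is what to look for when comparing the two plans.
spark.sql("SELECT DISTINCT id FROM testing").explain()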
Spark 1.5.3 explain
== Physical Plan ==
TungstenAggregate(key=[id#2], functions=[], output=[id#2])
 TungstenExchange hashpartitioning(id#2)
  TungstenAggregate(key=[id#2], functions=[], output=[id#2])
   HiveTableScan [id#2], (MetastoreRelation default, testing, None)
Spark 2.0.3 explain
== Physical Plan ==
*HashAggregate(keys=[id#2481], functions=[])
+- Exchange hashpartitioning(id#2481, 72)
   +- *HashAggregate(keys=[id#2481], functions=[])
      +- HiveTableScan [id#2481], MetastoreRelation default, testing