How do I set spark.sql.pivotMaxValues in Scala?

Leo*_*god 1 scala apache-spark

This may be a silly question, but how do I set spark.sql.pivotMaxValues when I try to pivot in Databricks? I'm getting this huge error: org.apache.spark.sql.AnalysisException: The pivot column census_block_group has more than 10000 distinct values, this could indicate an error. If this was intended, set spark.sql.pivotMaxValues to at least the number of distinct values of the pivot column. Does anyone know how I can fix this?

import org.apache.spark.sql.SQLContext

val df = censusBlocks.toDF
val pivoted = df.groupBy("B08007e1").pivot("census_block_group").sum("B08008e4")
pivoted.show()

D3V*_*D3V 7

You can set it with spark.conf.set. Note that the value must be at least the number of distinct values in your pivot column; the default is already 10000, so the error means you need to raise it above that:

spark.conf.set("spark.sql.pivotMaxValues", 20000) // must be >= the distinct count of census_block_group
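A safer approach is to size the limit from the data itself rather than guessing a number. This is a sketch, assuming a SparkSession named `spark` and the `censusBlocks` dataset from the question; it counts the distinct pivot values, raises the limit to match, and then runs the pivot:

```scala
import org.apache.spark.sql.DataFrame

val df: DataFrame = censusBlocks.toDF

// Count the distinct values of the pivot column and raise the limit to match.
// This costs one extra pass over the data but avoids guessing.
val distinctCount = df.select("census_block_group").distinct().count()
spark.conf.set("spark.sql.pivotMaxValues", distinctCount)

// The config must be set before the pivot is executed.
val pivoted = df.groupBy("B08007e1").pivot("census_block_group").sum("B08008e4")
pivoted.show()
```

Keep in mind that a pivot producing more than 10000 columns is often a sign the pivot column was chosen by mistake, which is why Spark enforces this limit by default.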