Spark 2.1 - Error while instantiating HiveSessionState

Kri*_*ing 8 apache-spark

With a fresh install of Spark 2.1, I get an error when executing the pyspark command.

Traceback (most recent call last):
  File "/usr/local/spark/python/pyspark/shell.py", line 43, in <module>
    spark = SparkSession.builder\
  File "/usr/local/spark/python/pyspark/sql/session.py", line 179, in getOrCreate
    session._jsparkSession.sessionState().conf().setConfString(key, value)
  File "/usr/local/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
  File "/usr/local/spark/python/pyspark/sql/utils.py", line 79, in deco
    raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':"

I have Hadoop and Hive installed on the same machine. Hive is configured to use MySQL for the metastore. I did not get this error with Spark 2.0.2.
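For context, my hive-site.xml points the metastore at MySQL along these lines (host, database name, and credentials below are placeholders, not my actual values):

```xml
<configuration>
  <!-- Illustrative MySQL metastore settings; all values are placeholders -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hivepass</value>
  </property>
</configuration>
```

Spark only sees this file if it is on Spark's conf path (e.g. copied or symlinked into $SPARK_HOME/conf).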

Can someone point me in the right direction?

Nim*_*m J 17

I ran into the same error in a Windows environment, and the trick below worked for me.

In shell.py, the Spark session is defined with .enableHiveSupport():

 spark = SparkSession.builder\
            .enableHiveSupport()\
            .getOrCreate()

Remove the Hive support and redefine the Spark session as follows:

spark = SparkSession.builder\
        .getOrCreate()

You can find shell.py in your Spark installation folder. For me it is "C:\spark-2.1.1-bin-hadoop2.7\python\pyspark".

Hope this helps.


mar*_*ita 10

I had the same problem. Some answers suggest sudo chmod -R 777 /tmp/hive/, or downgrading to the Spark build for Hadoop 2.6; neither worked for me. I realized that what caused this problem for me was doing SQL queries through the sqlContext instead of the sparkSession.

sparkSession = SparkSession.builder.master("local[*]").appName("appName").config("spark.sql.warehouse.dir", "./spark-warehouse").getOrCreate()
sqlCtx.registerDataFrameAsTable(..)
df = sparkSession.sql("SELECT ...")

This works perfectly for me.