sch*_*oon 12 hive hortonworks-data-platform apache-spark apache-zeppelin
I'm using HDP-2.6.0.3 but need Zeppelin 0.8, so I installed it as a standalone service. When I run:
```
%sql
show tables
```
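I get nothing back, and when I run Spark2 SQL commands I get 'table not found'. The tables are visible in the Zeppelin 0.7 that ships with HDP.
Can anyone tell me what I'm missing to make Zeppelin/Spark see Hive?
The steps I took to set up Zeppelin 0.8 were: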
```
mvn clean package -DskipTests -Pspark-2.1 -Phadoop-2.7 -Dhadoop.version=2.7.3 -Pyarn -Ppyspark -Psparkr -Pr -Pscala-2.11
```
Copied zeppelin-site.xml and shiro.ini from /usr/hdp/2.6.0.3-8/zeppelin/conf to /home/ed/zeppelin/conf.
Created /home/ed/zeppelin/conf/zeppelin-env.sh, in which I put the following:
```
export JAVA_HOME=/usr/jdk64/jdk1.8.0_112
export HADOOP_CONF_DIR=/etc/hadoop/conf
export ZEPPELIN_JAVA_OPTS="-Dhdp.version=2.6.0.3-8"
```
Copied /etc/hive/conf/hive-site.xml to /home/ed/zeppelin/conf.
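For anyone trying to reproduce this, here is a minimal sketch of a check that could be run in a `%spark` paragraph to see which catalog the session is actually using and whether `hive-site.xml` is visible to the interpreter JVM. It assumes the Zeppelin Spark interpreter has already created a session (so `getOrCreate()` just returns it); the name `session` and the fallback string `"in-memory"` are my own choices for illustration, not something taken from the setup above.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: run inside Zeppelin, where this returns the interpreter's existing session.
val session = SparkSession.builder().getOrCreate()

// "hive" suggests the session is backed by a Hive metastore;
// "in-memory" (also the fallback printed when the key is unset) suggests it is not.
val catalogImpl = session.conf.get("spark.sql.catalogImplementation", "in-memory")
println(s"catalog implementation: $catalogImpl")

// A null result suggests hive-site.xml is not on this JVM's classpath.
val hiveSite = Thread.currentThread().getContextClassLoader.getResource("hive-site.xml")
println(s"hive-site.xml resolved to: $hiveSite")

// Show what the session's catalog can see.
session.catalog.listDatabases().show(false)
session.catalog.listTables("default").show(false)
```

If the first line prints `in-memory`, or the resource lookup prints `null`, the copied `hive-site.xml` is probably not being picked up by the interpreter, though that is only one possible reading of the output.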
Edit: I have also tried:
```scala
import org.apache.spark.sql.SparkSession
val spark = SparkSession
  .builder()
  .appName("interfacing spark sql to hive metastore without configuration file")
  .config("hive.metastore.uris", "thrift://s2.royble.co.uk:9083") // replace with your hivemetastore service's thrift url
  .config("url", "jdbc:hive2://s2.royble.co.uk:10000/default")
  .config("UID", "admin")
  .config("PWD", "admin")
  .config("driver", "org.apache.hive.jdbc.HiveDriver")
  .enableHiveSupport() // don't forget to enable hive support
  .getOrCreate()
```
Same result. I also tried:
```scala
import java.sql.{DriverManager, Connection, Statement, ResultSet}
val url = "jdbc:hive2://"
val driver = "org.apache.hive.jdbc.HiveDriver"
val user = "admin"
val password = "admin"
Class.forName(driver).newInstance
val conn: Connection = DriverManager.getConnection(url, user, password)
```
This gives:
```
java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
ERROR XSDB6: Another instance of Derby may have already booted the database /home/ed/metastore_db
```
I fixed that error with:
```scala
val url = "jdbc:hive2://s2.royble.co.uk:10000"
```
But still no tables :(
This works:
```scala
import java.sql.{DriverManager, Connection, Statement, ResultSet}
val url = "jdbc:hive2://s2.royble.co.uk:10000"
val driver = "org.apache.hive.jdbc.HiveDriver"
val user = "admin"
val password = "admin"
Class.forName(driver).newInstance
val conn: Connection = DriverManager.getConnection(url, user, password)
val r: ResultSet = conn.createStatement.executeQuery("SELECT * FROM tweetsorc0")
```
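But then I struggled to convert the ResultSet into a DataFrame. I'd rather have SparkSession working and get a DataFrame directly, so I'll add a bounty later today.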
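To make the ResultSet-to-DataFrame step concrete, here is a rough sketch of one way it could be done by hand. It is a workaround under stated assumptions, not the SparkSession-based solution I'm actually after: the helper name `resultSetToDataFrame` is hypothetical, every column is read as a string for simplicity, and all rows are buffered on the driver, so it would only suit small result sets.

```scala
import java.sql.ResultSet
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper: copies a JDBC ResultSet into a DataFrame of string columns.
def resultSetToDataFrame(spark: SparkSession, rs: ResultSet): DataFrame = {
  val meta = rs.getMetaData
  val columnCount = meta.getColumnCount

  // Build a schema from the JDBC column labels; every column is typed as a nullable string.
  val schema = StructType(
    (1 to columnCount).map(i => StructField(meta.getColumnLabel(i), StringType, nullable = true))
  )

  // Read every remaining row into driver memory (JDBC columns are 1-indexed).
  val rows = ArrayBuffer.empty[Row]
  while (rs.next()) {
    rows += Row.fromSeq((1 to columnCount).map(i => rs.getString(i)))
  }

  spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
}
```

With the `spark` session and the `r` ResultSet from above, this would be called as `val df = resultSetToDataFrame(spark, r)`. Columns could be cast to proper types afterwards, but that part is not shown here.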