I use the following code to connect to Hive2 from Python:
import pyhs2

with pyhs2.connect(host='localhost',
                   port=10000,
                   authMechanism="PLAIN",
                   user='root',
                   password='test',
                   database='default') as conn:
    with conn.cursor() as cur:
        # Show databases
        print(cur.getDatabases())
        # Execute query
        cur.execute("select * from table")
        # Return column info from query
        print(cur.getSchema())
        # Fetch table results
        for i in cur.fetch():
            print(i)
I get the following error:
File "C:\Users\vinbhask\AppData\Roaming\Python\Python36\site-packages\pyhs2-0.6.0-py3.6.egg\pyhs2\connections.py", line 7, in <module>
    from cloudera.thrift_sasl import TSaslClientTransport
ModuleNotFoundError: No module named 'cloudera'
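The missing `cloudera` module means pyhs2 is trying to import a Cloudera-specific SASL transport that isn't installed. A stdlib-only sketch to check which of the relevant packages actually resolve in the current interpreter (the module list below is an assumption, not exhaustive):

```python
import importlib.util

def available(name):
    """Return True if `name` can be located as an importable module."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. 'cloudera') is missing entirely.
        return False

# Modules the pyhs2 stack may try to import (assumed list).
for mod in ("pyhs2", "sasl", "thrift", "thrift_sasl", "cloudera.thrift_sasl"):
    print(mod, "->", "OK" if available(mod) else "MISSING")
```

Running this on the affected machine shows at a glance whether only `cloudera` is missing or whether the whole SASL stack is broken.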
These are the packages installed so far:
bitarray 0.8.1, certifi 2017.7.27.1, chardet 3.0.4, cm-api 16.0.0, cx-Oracle 6.0.1, future 0.16.0, idna 2.6, impyla 0.14.0, JayDeBeApi 1.1.1, JPype1 0.6.2, ply 3.10, pure-sasl 0.4.0, PyHive 0.4.0, pyhs2 0.6.0, pyodbc 4.0.17, requests 2.18.4, sasl 0.2.1, six 1.10.0, teradata 15.10.0.21, thrift 0.10.0, thrift-sasl 0.2.1, thriftpy 0.3.9, urllib3 1.22
Error when using Impyla:
Traceback (most recent call last):
File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36-32\Scripts\HiveConnTester4.py", line 1, in <module>
    from impala.dbapi import connect …

I need to connect to Hive from a Java program via JDBC. I searched Google and found many guides and examples, such as the HiveServer2 Clients documentation.
However, I cannot find the JDBC driver itself (the jar file) anywhere. There seems to be a jar available for download from Cloudera, but it requires registration.
Does anyone know where to get the plain Apache Hive JDBC driver itself?
I am using CentOS 6.5 and want to access a remote Hive server, but I cannot install pyhs2 with pip install pyhs2.
I have already installed all the required dependencies:
But I still get the same error:
Failed building wheel for sasl
Failed to build sasl
Installing collected packages: sasl, pyhs2
Running setup.py install for sasl
Complete output from command /usr/local/bin/python3 -c "import setuptools, tokenize;__file__='/tmp/pip-build-vq9qfls4/sasl/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-bjam3ra0-record/install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
running egg_info
writing top-level names to sasl.egg-info/top_level.txt
writing dependency_links to sasl.egg-info/dependency_links.txt
writing sasl.egg-info/PKG-INFO
warning: manifest_maker: standard file '-c' not found
reading manifest file 'sasl.egg-info/SOURCES.txt'
reading …

I tried connecting to a remote Hive using this link. The code used is given below, along with the error message received.
Code:

    from pyhive import hive
    conn = hive.Connection(host="10.111.22.11", port=10000, username="user1", database="default")
Error message:
Could not connect to any of [('10.111.22.11', 10000)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/anaconda3/lib/python3.6/site-packages/pyhive/hive.py", line 131, in __init__
    self._transport.open()
  File "/opt/anaconda3/lib/python3.6/site-packages/thrift_sasl/__init__.py", line 61, in open
    self._trans.open()
  File "/opt/anaconda3/lib/python3.6/site-packages/thrift/transport/TSocket.py", line 113, in open
    raise TTransportException(TTransportException.NOT_OPEN, msg)
thrift.transport.TTransport.TTransportException: Could not connect to any of [('10.111.22.11', 10000)]
What else is required for a successful connection? I can connect to the server directly (using PuTTY) and run Hive there, but when I try from another server X, I get this error. I can also ping the Hive server from server X.
Could the port number be the problem? How do I check the correct port number?
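One way to rule the port in or out is to attempt a raw TCP connection from server X before involving Thrift at all. A minimal stdlib sketch (the host and port are the values from the question; HiveServer2's Thrift port defaults to 10000, while 8088 is typically the YARN ResourceManager web UI, not HiveServer2):

```python
import socket

def port_open(host, port, timeout=5):
    """Return True if a plain TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example with the values from the question (replace with your own):
# port_open("10.111.22.11", 10000)
```

If this returns False, the problem is network-level (firewall, wrong port, HiveServer2 not listening) rather than anything in PyHive itself.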
As discussed in the answers below, I tried starting hiveserver2, but the command does not seem to work. Any help is much appreciated.
When I run queries from the hive shell, the port I see in the logs is 8088 …
I am using pyhs2 to connect to Hive, but the Hive server requires Kerberos authentication. Does anyone know how to translate a JDBC connection string into pyhs2 parameters? For example:
jdbc:hive2://biclient2.server.163.org:10000/default;principal=hive/app-20.photo.163.org@HADOOP.HZ.NETEASE.COM?mapred.job.queue.name=default
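The JDBC URL packs the host, port, database, Kerberos principal, and session settings into one string. A small stdlib sketch that pulls those pieces apart (how each piece maps onto a given client's keyword arguments is an assumption to verify against that client's docs; for instance, PyHive takes auth='KERBEROS' plus a kerberos_service_name, which is the part of the principal before the '/'):

```python
def parse_hive_jdbc_url(url):
    """Split a jdbc:hive2:// URL into host, port, database, principal, and confs."""
    assert url.startswith("jdbc:hive2://")
    rest = url[len("jdbc:hive2://"):]
    # Strip trailing ?key=value job confs (e.g. mapred.job.queue.name), if any.
    rest, _, conf = rest.partition("?")
    hostport, _, path = rest.partition("/")
    host, _, port = hostport.partition(":")
    # The path holds the database plus ;key=value session parameters.
    parts = path.split(";")
    params = dict(p.split("=", 1) for p in parts[1:] if "=" in p)
    return {
        "host": host,
        "port": int(port) if port else 10000,
        "database": parts[0] or "default",
        "principal": params.get("principal"),
        "conf": dict(c.split("=", 1) for c in conf.split(";") if "=" in c) if conf else {},
    }

url = ("jdbc:hive2://biclient2.server.163.org:10000/default;"
       "principal=hive/app-20.photo.163.org@HADOOP.HZ.NETEASE.COM"
       "?mapred.job.queue.name=default")
info = parse_hive_jdbc_url(url)
print(info)
# The Kerberos service name is the principal's first component ("hive" here):
service = info["principal"].split("/")[0]
```

With those pieces, a Kerberos connection in PyHive would look roughly like hive.Connection(host=info["host"], port=info["port"], database=info["database"], auth="KERBEROS", kerberos_service_name=service); pyhs2 takes authMechanism="KERBEROS", but its exact Kerberos parameters should be checked against its own source, since the project is unmaintained.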