PyHive Thrift transport exception: read 0 bytes

y2k*_*ham 6 hive thrift docker pyhive

I am trying to connect to HiveServer2 running inside a Docker container (from outside the container) via Python (PyHive 0.5, Python 2.7), using the DB-API (asynchronous) example:

from pyhive import hive
conn = hive.connect(host='172.17.0.2', port='10001', auth='NOSASL')

However, I get the following error:

Traceback (most recent call last):
  File "py_2.py", line 4, in <module>
    conn = hive.connect(host='172.17.0.2', port='10001', auth='NOSASL')
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/pyhive/hive.py", line 64, in connect
    return Connection(*args, **kwargs)
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/pyhive/hive.py", line 164, in __init__
    response = self._client.OpenSession(open_session_req)
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/TCLIService/TCLIService.py", line 187, in OpenSession
    return self.recv_OpenSession()
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/TCLIService/TCLIService.py", line 199, in recv_OpenSession
    (fname, mtype, rseqid) = iprot.readMessageBegin()
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 148, in readMessageBegin
    name = self.trans.readAll(sz)
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 60, in readAll
    chunk = self.read(sz - have)
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 161, in read
    self.__rbuf = BufferIO(self.__trans.read(max(sz, self.__rbuf_size)))
  File "/home/foodie/anaconda2/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 132, in read
    message='TSocket read 0 bytes')
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

The Docker image I am using is this one (tag: mysql_corrected). It runs the following services (as reported by the jps command):

992 Master
1810 RunJar
259 DataNode
2611 Jps
584 ResourceManager
1576 RunJar
681 NodeManager
137 NameNode
426 SecondaryNameNode
1690 RunJar
732 HistoryServer

I start the container with:

docker run -it -p 8088:8088 -p 8042:8042 -p 4040:4040 -p 18080:18080 -p 10002:10002 -p 10000:10000 -e 3306 -e 9084 -h sandbox -v /home/foodie/docker/w1:/usr/tmp/test rohitbarnwal7/spark:mysql_corrected bash

In addition, I perform the following steps to start the Hive server inside the Docker container:

  1. Start the MySQL service: service mysqld start
  2. Change to the /usr/local/hive directory: cd $HIVE_HOME
  3. Start the Hive metastore server: nohup bin/hive --service metastore &
  4. Start HiveServer2: hive --service hive-server2 (note that the thrift-server port has been changed to 10001 in /usr/local/hive/conf/hive-site.xml)
  5. Start the beeline shell: beeline
  6. Connect the beeline shell to HiveServer2: !connect jdbc:hive2://localhost:10001/default;transportMode=http;httpPath=cliservice
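
The JDBC URL in step 6 carries session settings after the database name. As a rough illustration only, the sketch below splits such a URL into its parts so the settings are easy to compare with the Python call. The helper name `parse_hive_jdbc_url` is hypothetical and not part of PyHive or Hive, and it handles only the `;key=value` session part of the URL, not the `?` or `#` sections. For the URL above it shows that beeline connects with `transportMode=http`, whereas the traceback shows the Python client going through `TSocket` and `TBinaryProtocol`.

```python
def parse_hive_jdbc_url(url):
    """Split a jdbc:hive2:// URL into host, port, database and session settings.

    Hypothetical helper for illustration; only the ';key=value' part is parsed.
    """
    prefix = "jdbc:hive2://"
    if not url.startswith(prefix):
        raise ValueError("not a jdbc:hive2 URL: %r" % url)
    rest = url[len(prefix):]
    # "localhost:10001" / "default;transportMode=http;httpPath=cliservice"
    authority, _, path = rest.partition("/")
    host, _, port = authority.partition(":")
    parts = path.split(";")
    database = parts[0]
    settings = dict(p.split("=", 1) for p in parts[1:] if "=" in p)
    return {
        "host": host,
        "port": int(port) if port else None,
        "database": database,
        "settings": settings,
    }


info = parse_hive_jdbc_url(
    "jdbc:hive2://localhost:10001/default;transportMode=http;httpPath=cliservice"
)
print(info["port"], info["settings"])
```

If the two sides do not use the same transport mode, that mismatch may be worth ruling out before looking elsewhere.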

I have already tried the following, without any luck:

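  1. Making Python 2.7.3 the default Python version inside the Docker container (the original default was Python 2.6.6; Python 2.7.3 was installed in the container but was not the default)
  2. Changing the Hive server port back to its default value: 10000
  3. Connecting to the Hive server by running the same Python script inside the container (it still gives the same error)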
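
One further check that could help narrow things down is whether the port accepts a plain TCP connection at all. The sketch below uses only the standard `socket` module; the function name `can_connect` is hypothetical. Since the traceback fails while reading the OpenSession reply rather than while opening the socket, this check would be expected to succeed here, so it mainly serves to rule out a network-level problem.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        sock = socket.create_connection((host, int(port)), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, or timed out
        return False
    sock.close()
    return True


# Example usage with the address and ports from the question:
# for port in (10000, 10001):
#     print(port, can_connect("172.17.0.2", port))
```

A successful connection only shows that something is listening on the port; it says nothing about which Thrift transport or authentication mode the server expects.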