How do I access Hive via Python?

https://cwiki.apache.org/confluence/display/Hive/HiveClient#HiveClient-Python appears to be outdated.

When I add this to /etc/profile:

export PYTHONPATH=$PYTHONPATH:/usr/lib/hive/lib/py

Then I can run the imports listed at that link, except for from hive import ThriftHive, which actually needs to be:

from hive_service import ThriftHive
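
To confirm which package name the installed bindings actually expose, a small check like the one below can help. It is only a sketch: it assumes Python 3, reuses the /usr/lib/hive/lib/py directory from the export line above, and the helper name try_import is made up for illustration.

```python
import importlib
import sys

# Directory taken from the PYTHONPATH export above; adjust for your install.
HIVE_PY_DIR = "/usr/lib/hive/lib/py"


def try_import(module_name, extra_dir=HIVE_PY_DIR):
    """Return the imported module, or None if it cannot be imported.

    try_import is a hypothetical helper, not part of Hive or Thrift.
    """
    # Mirror the effect of the PYTHONPATH export inside this process.
    if extra_dir not in sys.path:
        sys.path.append(extra_dir)
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


if __name__ == "__main__":
    # Report which of the two candidate package names is importable.
    for name in ("hive_service", "hive"):
        status = "found" if try_import(name) is not None else "not importable"
        print(name, "->", status)
```

If hive_service is reported as found and hive is not, that matches the import correction described above.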

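Next, the port in the example is 10000, which caused the program to hang when I tried it. The default Hive Thrift port is 9083, and using that stopped the hanging.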
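
One way to narrow down a hang is to check whether each port accepts a TCP connection at all, with a timeout so the check itself cannot hang. The sketch below assumes Python 3; the function name port_accepts_connections and the HOST value are placeholders for illustration. A successful connect only shows that something is listening; it does not show which Hive service is behind the port. In many installations 9083 is typically the metastore's Thrift port and 10000 is the HiveServer port, so it is worth checking both.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; it says nothing about which
    service (if any) speaks Thrift on that port.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    HOST = "localhost"  # placeholder: replace with the node running Hive
    for port in (9083, 10000):
        print(port, "->", port_accepts_connections(HOST, port))
```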

So I set it up like this:

from hive_service import ThriftHive  # generated Hive Thrift client (corrected import from above)
from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

try:
    # Open a buffered, binary-protocol connection to the Thrift service
    transport = TSocket.TSocket('<node-with-metastore>', 9083)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = ThriftHive.Client(protocol)
    transport.open()
    client.execute("CREATE TABLE test(c1 int)")
    transport.close()
except Thrift.TException as tx:
    print('%s' % tx.message)

I get the following error:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/hive/lib/py/hive_service/ThriftHive.py", line 68, in execute …

python hadoop hive
