Hive: INSERT OVERWRITE DIRECTORY with JSON format

Jih*_* No 9 json hadoop hive hiveql

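How can I run INSERT OVERWRITE DIRECTORY so that the output is written as JSON with the proper schema?

I have a raw Hive Avro table (it actually has many more fields):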

tb_test--------
name string
nickname string
-----------------

I want to save the result of a query to a directory in HDFS using the JsonSerDe.

I tried this:

insert overwrite directory '/json/'
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
WITH SERDEPROPERTIES (
 "stat_name"="$._col0",
 "stat_interval"="$._col1"
)
STORED AS TEXTFILE 
select name, nickname
from tb_test limit 100

But the JSON written to /json/ has _colXX field names instead of the original column names:

{"_col0":"basic_qv","_col1":"h"}
{"_col0":"basic_qv","_col1":"h"}
{"_col0":"basic_qv","_col1":"h"}
{"_col0":"basic_qv","_col1":"h"}
{"_col0":"basic_qv","_col1":"h"}

I expected:

{"name":"basic_qv","nickname":"h"}
{"name":"basic_qv","nickname":"h"}
{"name":"basic_qv","nickname":"h"}
{"name":"basic_qv","nickname":"h"}
{"name":"basic_qv","nickname":"h"}

How can I get this output?

Thanks!!

lef*_*oin 2

A workaround for your problem (using JsonUDF with named_struct) seems to be described here: https://github.com/rcongiu/Hive-JSON-Serde/issues/151

extract.hql:
add jar /home/myuser/lib/json-udf-1.3.8-SNAPSHOT-jar-with-dependencies.jar;
create temporary function tjson as 'org.openx.data.udf.JsonUDF';

insert overwrite local directory '/json/'
select
tjson(named_struct("name", t.name,"nickname", t.nickname))
from tb_test t
;

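You can also create a JsonSerDe-based table with the columns defined, then INSERT OVERWRITE into that table and use the table's location instead of a directory.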
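A minimal sketch of that table-based approach, under some assumptions: the table name tb_test_json is hypothetical, the location reuses the '/json/' path from the question, and the hive-hcatalog-core jar that provides org.apache.hive.hcatalog.data.JsonSerDe is assumed to be on the classpath already (otherwise it has to be added with ADD JAR first). Because the SerDe takes its field names from the table definition, the written JSON should carry name and nickname rather than _col0 and _col1.

```sql
-- Hypothetical target table; its column names become the JSON keys.
-- EXTERNAL so that dropping the table later leaves the files in place.
CREATE EXTERNAL TABLE tb_test_json (
  name     STRING,
  nickname STRING
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/json/';

-- Write the query result through the JsonSerDe into the table location.
INSERT OVERWRITE TABLE tb_test_json
SELECT name, nickname
FROM tb_test
LIMIT 100;
```

The files under the table location can then be read directly from HDFS, which is what the directory-based insert was meant to produce.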