嗨,Hive支持多列上的不同.比如从表中选择distinct(a,b,c,d).如果没有,有没有办法实现这一目标?
作为开发人员,我通过使用现有MySQL表导入数据为我们的项目创建了HBase表sqoop job.问题是我们的数据分析师团队熟悉MySQL语法,暗示他们可以HIVE轻松查询表.对他们来说,我需要在HIVE中公开HBase表.我不想通过在HIVE中再次填充数据来复制数据.此外,复制数据将来可能会出现一致性问题.
我可以在没有重复数据的情况下暴露HIVE中的HBase表吗?如果是,我该怎么办?另外,如果insert/update/delete我的HBase表中的数据将更新数据出现在HIVE中而没有任何问题?
有时,我们的数据分析团队会在HIVE中创建表格并填充数据.我可以将它们暴露给HBase吗?如果有,怎么样?
我是Apache Hive的新手.在处理外部表分区时,如果我将新分区直接添加到HDFS,则在运行MSCK REPAIR表后不会添加新分区.以下是我试过的代码,
- 创建外部表
hive> create external table factory(name string, empid int, age int) partitioned by(region string)
> row format delimited fields terminated by ',';
Run Code Online (Sandbox Code Playgroud)
- 详细表信息
Location: hdfs://localhost.localdomain:8020/user/hive/warehouse/factory
Table Type: EXTERNAL_TABLE
Table Parameters:
EXTERNAL TRUE
transient_lastDdlTime 1438579844
Run Code Online (Sandbox Code Playgroud)
- 在HDFS中创建目录以加载表工厂的数据
[cloudera@localhost ~]$ hadoop fs -mkdir 'hdfs://localhost.localdomain:8020/user/hive/testing/testing1/factory1'
[cloudera@localhost ~]$ hadoop fs -mkdir 'hdfs://localhost.localdomain:8020/user/hive/testing/testing1/factory2'
Run Code Online (Sandbox Code Playgroud)
- 表数据
cat factory1.txt
emp1,500,40
emp2,501,45
emp3,502,50
cat factory2.txt
EMP10,200,25
EMP11,201,27
EMP12,202,30
Run Code Online (Sandbox Code Playgroud)
- 从本地复制到HDFS
[cloudera@localhost ~]$ hadoop fs -copyFromLocal '/home/cloudera/factory1.txt' 'hdfs://localhost.localdomain:8020/user/hive/testing/testing1/factory1'
[cloudera@localhost ~]$ hadoop fs -copyFromLocal '/home/cloudera/factory2.txt' …Run Code Online (Sandbox Code Playgroud) 我需要从hive结构中的所有列中选择*.
Hive创建表脚本如下所示
从表中选择*将每个结构显示为从表中选择*列
我的要求是将结构集合的所有字段显示为配置单元中的列.
用户不必单独编写列名.有没有人有UDF这样做?
我正在寻找像蜂巢一样的东西
Select * from table 1 where dt > (Select max(dt) from table2)
Run Code Online (Sandbox Code Playgroud)
显然,hive不支持where子句中的子查询,即使我使用连接或半连接,它只比较=而不是>(据我所知).
有人可以建议我在hive中编写相同查询的替代解决方案吗?
嗨,我是Hive的新手,我想将当前时间戳和一行数据一起插入到我的表中.
这是我的团队表的一个例子:
team_id int
fname string
lname string
time timestamp
Run Code Online (Sandbox Code Playgroud)
我看了一些其他的例子,如何将时间戳插入Hive表?,我如何在配置单元中添加时间戳列,似乎无法使其工作.这就是我想要的:
insert into team values('101','jim','joe',from_unixtime(unix_timestamp()));
Run Code Online (Sandbox Code Playgroud)
我得到的错误是:
FAILED: SemanticException [Error 10293]: Unable to create temp file for insert values Expression of type TOK_FUNCTION not supported in insert/values
Run Code Online (Sandbox Code Playgroud)
如果有人能提供帮助,那就太棒了,非常感谢霜冻
我正在尝试使用 Hive 将日期插入到日期列中。到目前为止,这是我尝试过的
INSERT INTO table1 (EmpNo, DOB)
VALUES ('Clerk#0008000', cast(substring(from_unixtime(unix_timestamp(cast('2016-01-01' as string), 'yyyy-MM-dd')),1,10) as date));
Run Code Online (Sandbox Code Playgroud)
和
INSERT INTO table table1 values('Clerk#0008000', cast(substring(from_unixtime(unix_timestamp(cast('2016-01-01' as string), 'yyyy-MM-dd')),1,10) as date));
Run Code Online (Sandbox Code Playgroud)
和
INSERT INTO table1 SELECT
'Clerk#0008000', cast(substring(from_unixtime(unix_timestamp(cast('2016-01-01' as string), 'yyyy-MM-dd')),1,10) as date);
Run Code Online (Sandbox Code Playgroud)
但我仍然得到
FAILED: SemanticException [Error 10293]: Unable to create temp file for insert values Expression of type TOK_FUNCTION not supported in insert/values
Run Code Online (Sandbox Code Playgroud)
或者
FAILED: ParseException line 2:186 Failed to recognize predicate '<EOF>'. Failed rule: 'regularBody' in statement
Run Code Online (Sandbox Code Playgroud)
Hive ACID 已在基于 ORC …
我有一个包含ORC文件的目录.我正在使用以下代码创建一个DataFrame
var data = sqlContext.sql("SELECT * FROM orc.`/directory/containing/orc/files`");
Run Code Online (Sandbox Code Playgroud)
它返回此架构的数据框
[_col0: int, _col1: bigint]
Run Code Online (Sandbox Code Playgroud)
预期架构在哪里
[scan_nbr: int, visit_nbr: bigint]
Run Code Online (Sandbox Code Playgroud)
当我查询镶木地板格式的文件时,我得到了正确的架构.
我错过了任何配置吗?
添加更多细节
这是Hortonworks Distribution HDP 2.4.2(Spark 1.6.1,Hadoop 2.7.1,Hive 1.2.1)
我们没有更改HDP的默认配置,但这绝对不同于Hadoop的普通版本.
数据由上游Hive作业写入,一个简单的CTAS(CREATE TABLE样本存储为ORC作为SELECT ...).
我在CTAS使用最新的2.0.0配置单元生成的文件上对此进行了测试,并保留了orc文件中的列名称.
我有一个下面给出的json文件,我想通过使用jsonserde创建外部表来使用hive访问它.
{"ResponseCode":"1","响应":"找到的数据","数据":[{"季节":"RABI","扇区":"HORTICULTURE","类别":"水果","裁剪" ":"Mango","QueryType":"文化习俗","QueryText":"如何控制芒果花落?","KCCAns":"推荐用于喷洒到planofix 5-7 mili/pump","StateName" ":"ANDHRA PRADESH","DistrictName":"NELLORE","BlockName":"BALAYAPALLE","SubmitDate":"11/1/2016 9:05:27 AM"},{"Season":"RABI" ,"部门":"农业","类别":"其他","作物":"其他","查询类型":"杂草管理","查询文本":"桉树中的杂草管理","KCCAns":"建议喷洒GLYPOSATE @ 2 LITERS PER ACRE","StateName":"ANDHRA PRADESH","DistrictName":"NELLORE","BlockName":"MARRIPADU","SubmitDate":"2016/1/11 9:07: 04 AM"},{"Season":"RABI","Sector":"HORTICULTURE","Category":"Vegetables","Crop":"Bhindi(Okra/Ladysfinger)","QueryType":"\ tPlant保护\ t","QueryText":"OKRA MITE DAMAGE","KCCAns":"建议喷洒DICOFOL 1 LITER/200 LATERERS OF WATER/ACRE","StateName":"ANDHRA PRADESH","DistrictNam e":"NELLORE","BlockName":"KOVUR","SubmitDate":"11/1/2016 3:11:59 PM"},{"季节":"RABI","部门":"农业" ,"类别":"谷物","作物":"稻田(Dhan)","QueryType":"杂草管理","QueryText":"15-20天作物的稻田杂草管理","KCCAns":"建议喷洒CYHALOPOP-P-BUTYL 250 ML PER ACRE","StateName":"ANDHRA PRADESH","DistrictName":"NELLORE","BlockName":"BALAYAPALLE","SubmitDate":"2016年11月3日2 :11:17 PM"},{"季节":"KHARIF","扇区":"农业","类别":"其他","作物":"其他","查询类型":"天气"," QueryText":"天气报告","KCCAns":"天气报告联系18004253141,08912543031","StateName":"ANDHRA PRADESH","DistrictName":"NELLORE","BlockName":"KALIGIRI","SubmitDate": "11/3/2016 5:22:22 PM"},{"季节":"RABI","扇区":"农业","类别":"豆类","作物":"木豆(红豆)/arhar/tur)","QueryType":"营养素管理","查询文字":"红色食物营养管理","KCCAns":"建议每次喷洒13-045一公斤","州名":"ANDHRA邦 "" DistrictNa 我":"NELLORE","BlockName":"ANUMASAMUDRAMPETA","SubmitDate":"2016年11月3日下午7:25:10"},{"季节":"RABI","部门":"农业" ,"类别":"豆类","作物":"木豆(红克/ arhar/tur)","QueryType":"营养素管理","QueryText":"红色食物营养管理","KCCAns": "推荐喷涂13-0-45 ONE KG/ACRE","StateName":"ANDHRA PRADESH","DistrictName":"NELLORE","BlockName":"ANUMASAMUDRAMPETA","SubmitDate":"2016/11/3 7:30:02 PM"},{"Season":"RABI","Sector":"HORTICULTURE","Category":"Fruits","Crop":"Citrus","QueryType":"\ tPlant Protection\t","QueryText":"CITRUS SUCKING PEST","KCCAns":"推荐用于DIMETHOATE 400 ML/ACRE 200 LITER WATER","StateName":"ANDHRA PRADESH","DistrictName":"NELLORE","BlockName" ":"KODAVALUR","SubmitDate":"11/4/2016 8:48:03 AM"},{"Season":"RABI","Sector":"HORTICULTURE","Category":"Fruits", "Crop":"Citrus","QueryType":"\ tPlant Protection\t","QueryText":"CITRUS SUCKING PEST","KCCAns":"推荐用于DIMETHOATE 400 ML/ACRE 200 LITER WATER","StateName" :"ANDHR …