嗨,Hive支持多列上的不同.比如从表中选择distinct(a,b,c,d).如果没有,有没有办法实现这一目标?
作为开发人员,我通过使用现有MySQL表导入数据为我们的项目创建了HBase表sqoop job.问题是我们的数据分析师团队熟悉MySQL语法,暗示他们可以HIVE轻松查询表.对他们来说,我需要在HIVE中公开HBase表.我不想通过在HIVE中再次填充数据来复制数据.此外,复制数据将来可能会出现一致性问题.
我可以在没有重复数据的情况下暴露HIVE中的HBase表吗?如果是,我该怎么办?另外,如果insert/update/delete我的HBase表中的数据将更新数据出现在HIVE中而没有任何问题?
有时,我们的数据分析团队会在HIVE中创建表格并填充数据.我可以将它们暴露给HBase吗?如果有,怎么样?
我需要从hive结构中的所有列中选择*.
Hive创建表脚本如下所示
从表中选择*将每个结构显示为从表中选择*列
我的要求是将结构集合的所有字段显示为配置单元中的列.
用户不必单独编写列名.有没有人有UDF这样做?
我是Apache Hive的新手.在处理外部表分区时,如果我将新分区直接添加到HDFS,则在运行MSCK REPAIR表后不会添加新分区.以下是我试过的代码,
- 创建外部表
hive> create external table factory(name string, empid int, age int) partitioned by(region string)
> row format delimited fields terminated by ',';
Run Code Online (Sandbox Code Playgroud)
- 详细表信息
Location: hdfs://localhost.localdomain:8020/user/hive/warehouse/factory
Table Type: EXTERNAL_TABLE
Table Parameters:
EXTERNAL TRUE
transient_lastDdlTime 1438579844
Run Code Online (Sandbox Code Playgroud)
- 在HDFS中创建目录以加载表工厂的数据
[cloudera@localhost ~]$ hadoop fs -mkdir 'hdfs://localhost.localdomain:8020/user/hive/testing/testing1/factory1'
[cloudera@localhost ~]$ hadoop fs -mkdir 'hdfs://localhost.localdomain:8020/user/hive/testing/testing1/factory2'
Run Code Online (Sandbox Code Playgroud)
- 表数据
cat factory1.txt
emp1,500,40
emp2,501,45
emp3,502,50
cat factory2.txt
EMP10,200,25
EMP11,201,27
EMP12,202,30
Run Code Online (Sandbox Code Playgroud)
- 从本地复制到HDFS
[cloudera@localhost ~]$ hadoop fs -copyFromLocal '/home/cloudera/factory1.txt' 'hdfs://localhost.localdomain:8020/user/hive/testing/testing1/factory1'
[cloudera@localhost ~]$ hadoop fs -copyFromLocal '/home/cloudera/factory2.txt' …Run Code Online (Sandbox Code Playgroud) 嗨,我是Hive的新手,我想将当前时间戳和一行数据一起插入到我的表中.
这是我的团队表的一个例子:
team_id int
fname string
lname string
time timestamp
Run Code Online (Sandbox Code Playgroud)
我看了一些其他的例子,如何将时间戳插入Hive表?,我如何在配置单元中添加时间戳列,似乎无法使其工作.这就是我想要的:
insert into team values('101','jim','joe',from_unixtime(unix_timestamp()));
Run Code Online (Sandbox Code Playgroud)
我得到的错误是:
FAILED: SemanticException [Error 10293]: Unable to create temp file for insert values Expression of type TOK_FUNCTION not supported in insert/values
Run Code Online (Sandbox Code Playgroud)
如果有人能提供帮助,那就太棒了,非常感谢霜冻
我正在寻找像蜂巢一样的东西
Select * from table 1 where dt > (Select max(dt) from table2)
Run Code Online (Sandbox Code Playgroud)
显然,hive不支持where子句中的子查询,即使我使用连接或半连接,它只比较=而不是>(据我所知).
有人可以建议我在hive中编写相同查询的替代解决方案吗?
我有一个包含ORC文件的目录.我正在使用以下代码创建一个DataFrame
var data = sqlContext.sql("SELECT * FROM orc.`/directory/containing/orc/files`");
Run Code Online (Sandbox Code Playgroud)
它返回此架构的数据框
[_col0: int, _col1: bigint]
Run Code Online (Sandbox Code Playgroud)
预期架构在哪里
[scan_nbr: int, visit_nbr: bigint]
Run Code Online (Sandbox Code Playgroud)
当我查询镶木地板格式的文件时,我得到了正确的架构.
我错过了任何配置吗?
添加更多细节
这是Hortonworks Distribution HDP 2.4.2(Spark 1.6.1,Hadoop 2.7.1,Hive 1.2.1)
我们没有更改HDP的默认配置,但这绝对不同于Hadoop的普通版本.
数据由上游Hive作业写入,一个简单的CTAS(CREATE TABLE样本存储为ORC作为SELECT ...).
我在CTAS使用最新的2.0.0配置单元生成的文件上对此进行了测试,并保留了orc文件中的列名称.
我正在尝试使用 Hive 将日期插入到日期列中。到目前为止,这是我尝试过的
INSERT INTO table1 (EmpNo, DOB)
VALUES ('Clerk#0008000', cast(substring(from_unixtime(unix_timestamp(cast('2016-01-01' as string), 'yyyy-MM-dd')),1,10) as date));
Run Code Online (Sandbox Code Playgroud)
和
INSERT INTO table table1 values('Clerk#0008000', cast(substring(from_unixtime(unix_timestamp(cast('2016-01-01' as string), 'yyyy-MM-dd')),1,10) as date));
Run Code Online (Sandbox Code Playgroud)
和
INSERT INTO table1 SELECT
'Clerk#0008000', cast(substring(from_unixtime(unix_timestamp(cast('2016-01-01' as string), 'yyyy-MM-dd')),1,10) as date);
Run Code Online (Sandbox Code Playgroud)
但我仍然得到
FAILED: SemanticException [Error 10293]: Unable to create temp file for insert values Expression of type TOK_FUNCTION not supported in insert/values
Run Code Online (Sandbox Code Playgroud)
或者
FAILED: ParseException line 2:186 Failed to recognize predicate '<EOF>'. Failed rule: 'regularBody' in statement
Run Code Online (Sandbox Code Playgroud)
Hive ACID 已在基于 ORC …
我有一个下面给出的json文件,我想通过使用jsonserde创建外部表来使用hive访问它.
{"ResponseCode":"1","响应":"找到的数据","数据":[{"季节":"RABI","扇区":"HORTICULTURE","类别":"水果","裁剪" ":"Mango","QueryType":"文化习俗","QueryText":"如何控制芒果花落?","KCCAns":"推荐用于喷洒到planofix 5-7 mili/pump","StateName" ":"ANDHRA PRADESH","DistrictName":"NELLORE","BlockName":"BALAYAPALLE","SubmitDate":"11/1/2016 9:05:27 AM"},{"Season":"RABI" ,"部门":"农业","类别":"其他","作物":"其他","查询类型":"杂草管理","查询文本":"桉树中的杂草管理","KCCAns":"建议喷洒GLYPOSATE @ 2 LITERS PER ACRE","StateName":"ANDHRA PRADESH","DistrictName":"NELLORE","BlockName":"MARRIPADU","SubmitDate":"2016/1/11 9:07: 04 AM"},{"Season":"RABI","Sector":"HORTICULTURE","Category":"Vegetables","Crop":"Bhindi(Okra/Ladysfinger)","QueryType":"\ tPlant保护\ t","QueryText":"OKRA MITE DAMAGE","KCCAns":"建议喷洒DICOFOL 1 LITER/200 LATERERS OF WATER/ACRE","StateName":"ANDHRA PRADESH","DistrictNam e":"NELLORE","BlockName":"KOVUR","SubmitDate":"11/1/2016 3:11:59 PM"},{"季节":"RABI","部门":"农业" ,"类别":"谷物","作物":"稻田(Dhan)","QueryType":"杂草管理","QueryText":"15-20天作物的稻田杂草管理","KCCAns":"建议喷洒CYHALOPOP-P-BUTYL 250 ML PER ACRE","StateName":"ANDHRA PRADESH","DistrictName":"NELLORE","BlockName":"BALAYAPALLE","SubmitDate":"2016年11月3日2 :11:17 PM"},{"季节":"KHARIF","扇区":"农业","类别":"其他","作物":"其他","查询类型":"天气"," QueryText":"天气报告","KCCAns":"天气报告联系18004253141,08912543031","StateName":"ANDHRA PRADESH","DistrictName":"NELLORE","BlockName":"KALIGIRI","SubmitDate": "11/3/2016 5:22:22 PM"},{"季节":"RABI","扇区":"农业","类别":"豆类","作物":"木豆(红豆)/arhar/tur)","QueryType":"营养素管理","查询文字":"红色食物营养管理","KCCAns":"建议每次喷洒13-045一公斤","州名":"ANDHRA邦 "" DistrictNa 我":"NELLORE","BlockName":"ANUMASAMUDRAMPETA","SubmitDate":"2016年11月3日下午7:25:10"},{"季节":"RABI","部门":"农业" ,"类别":"豆类","作物":"木豆(红克/ arhar/tur)","QueryType":"营养素管理","QueryText":"红色食物营养管理","KCCAns": "推荐喷涂13-0-45 ONE KG/ACRE","StateName":"ANDHRA PRADESH","DistrictName":"NELLORE","BlockName":"ANUMASAMUDRAMPETA","SubmitDate":"2016/11/3 7:30:02 PM"},{"Season":"RABI","Sector":"HORTICULTURE","Category":"Fruits","Crop":"Citrus","QueryType":"\ tPlant Protection\t","QueryText":"CITRUS SUCKING PEST","KCCAns":"推荐用于DIMETHOATE 400 ML/ACRE 200 LITER WATER","StateName":"ANDHRA PRADESH","DistrictName":"NELLORE","BlockName" ":"KODAVALUR","SubmitDate":"11/4/2016 8:48:03 AM"},{"Season":"RABI","Sector":"HORTICULTURE","Category":"Fruits", "Crop":"Citrus","QueryType":"\ tPlant Protection\t","QueryText":"CITRUS SUCKING PEST","KCCAns":"推荐用于DIMETHOATE 400 ML/ACRE 200 LITER WATER","StateName" :"ANDHR …