I have external tables over tab-delimited data. A simple table looks like this:
create external table if not exists categories
(id string, tag string, legid string, image string, parent string, created_date string, time_stamp int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3n://somewhere/';
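Now I'm adding another field at the end, which will be a comma-separated list of values.
Is there a way to specify this in the same way I specify the field terminator, or do I have to rely on one of the SerDes?
For example: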
...list_of_names ARRAY<String>)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' ARRAY ELEMENTS SEPARATED BY ','
...
(I assume I'll need to use a SerDe, but I figured it couldn't hurt to ask.)
I don't know how to update an existing table to do this, but for creating a table you can find what you're looking for at https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL. A snippet from there:
row_format
: DELIMITED [FIELDS TERMINATED BY char] [COLLECTION ITEMS TERMINATED BY char]
[MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
An example of how we create a table:
CREATE TABLE IF NOT EXISTS visits
(
... Columns Removed...
)
PARTITIONED BY (userdate STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'
COLLECTION ITEMS TERMINATED BY '\002'
MAP KEYS TERMINATED BY '\003'
STORED AS TEXTFILE
;
The line you're looking for is COLLECTION ITEMS TERMINATED BY char, which sets the delimiter between the elements of an array.
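Applied to the table from the question, a minimal sketch might look like the following. The column name list_of_names comes from the question's own example. Dropping and recreating the table is only one possible way to handle an existing definition, and it is an assumption rather than something the Hive manual prescribes for this case; it relies on the fact that dropping an EXTERNAL table removes only the metastore entry and leaves the files under LOCATION in place.

```sql
-- Sketch only: recreate the question's external table with a trailing
-- comma-separated list column. Dropping an EXTERNAL table does not delete
-- the underlying files, only the table definition.
DROP TABLE IF EXISTS categories;

CREATE EXTERNAL TABLE IF NOT EXISTS categories
(id STRING, tag STRING, legid STRING, image STRING, parent STRING,
 created_date STRING, time_stamp INT,
 list_of_names ARRAY<STRING>)          -- new last field, e.g. "alice,bob,carol"
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'            -- columns are still tab-separated
  COLLECTION ITEMS TERMINATED BY ','   -- array elements are comma-separated
LOCATION 's3n://somewhere/';

-- Array elements can then be read by index, and size() gives the element count.
SELECT id, list_of_names[0], size(list_of_names)
FROM categories;
```

Since the list delimiter (a comma) differs from the field delimiter (a tab), the built-in delimited row format should be enough here and no custom SerDe is required.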
HTH