I want to create a Hive table that uses a multi-character string as its delimiter, along the lines of:
CREATE EXTERNAL TABLE tableex(id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION '/user/myusername';
I would like the delimiter to be a multi-character string such as "~*".
Har*_*non 10
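FIELDS TERMINATED BY does not support multi-character delimiters. The simplest approach is to use RegexSerDe:
-- The regex captures the id (digits), skips the literal "~*", and captures the rest of the line as name.
-- Note: the contrib RegexSerDe may reject non-STRING columns such as id INT; if it does,
-- declare id as STRING or try org.apache.hadoop.hive.serde2.RegexSerDe instead.
CREATE EXTERNAL TABLE tableex(id INT, name STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = "^(\\d+)~\\*(.*)$"
)
STORED AS TEXTFILE
LOCATION '/user/myusername';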
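Before creating the table, the pattern can be sanity-checked against a literal sample row with Hive's built-in regexp_extract function. This is only a minimal sketch: the sample value '1~*Alice' is made up for illustration, and running SELECT without a FROM clause assumes Hive 0.13 or later.

```sql
-- '1~*Alice' is a hypothetical sample line, not data from the question.
-- Group 1 should come back as '1' and group 2 as 'Alice'.
SELECT
  regexp_extract('1~*Alice', '^(\\d+)~\\*(.*)$', 1) AS id,
  regexp_extract('1~*Alice', '^(\\d+)~\\*(.*)$', 2) AS name;
```

If either column comes back empty, the pattern did not match the sample line, and the table would most likely return NULL for such rows.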
Use MultiDelimitSerDe:
CREATE EXTERNAL TABLE tableex(id INT, name STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="~*")
STORED AS TEXTFILE
LOCATION '/user/myusername';
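Because the class name above sits in the contrib package, a session may need the hive-contrib JAR on its classpath before the table can be created or queried. The following is a rough sketch under that assumption; the JAR path is a placeholder, not a real location, and whether this step is needed at all depends on the installation.

```sql
-- Hypothetical path: point this at the hive-contrib JAR shipped with your Hive install.
ADD JAR /path/to/hive-contrib.jar;

-- Read back a few rows to confirm that "~*" splits each line into id and name.
SELECT id, name FROM tableex LIMIT 10;
```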