In the new API (apache.hadoop.mapreduce.KeyValueTextInputFormat), how do I specify a separator (delimiter) other than tab, which is the default, to split the key from the value?
Sample input:
one,first line
two,second line
Required output:
Key : one
Value : first line
Key : two
Value : second line
I set KeyValueTextInputFormat as the input format like this:
Job job = new Job(conf, "Sample");
job.setInputFormatClass(KeyValueTextInputFormat.class);
KeyValueTextInputFormat.addInputPath(job, new Path("/home/input.txt"));
This works when tab is the separator.
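
To make the desired behavior concrete, here is a minimal sketch in plain Java (no Hadoop dependency) of the split I am after: each line is cut at the first occurrence of the separator, and a line with no separator becomes a key with an empty value. The class name KeyValueSplitSketch and the helper split are hypothetical, made up for illustration only. As far as I can tell, in Hadoop 2.x the record reader takes its separator from the configuration property mapreduce.input.keyvaluelinerecordreader.key.value.separator (older releases appear to use key.value.separator.in.input.line) and uses only the first character of the value. Which name applies depends on the Hadoop version, so treat that as an assumption to verify. The property would need to be set on conf before the Job is created, since Job copies the Configuration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical illustration class; not part of Hadoop.
public class KeyValueSplitSketch {

    // Split a line at the FIRST occurrence of the separator.
    // If the separator is absent, the whole line is the key and the value is empty.
    static Map.Entry<String, String> split(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, pos), line.substring(pos + 1));
    }

    public static void main(String[] args) {
        String[] lines = {"one,first line", "two,second line"};
        for (String line : lines) {
            Map.Entry<String, String> kv = split(line, ',');
            System.out.println("Key : " + kv.getKey());
            System.out.println("Value : " + kv.getValue());
        }
    }
}
```

Run against the sample input above, this prints the key and value lines in the format shown under the required output.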