How do I unit test custom RecordReader and InputFormat classes?

She*_*har 3 java unit-testing hadoop mapreduce

I have developed a map-reduce program, for which I wrote custom RecordReader and InputFormat classes.

I am using MRUnit and Mockito to unit test the mapper and reducer.

I would like to know how to unit test the custom RecordReader and InputFormat classes. What is the preferred way to test these classes?

小智 5

Thanks to user7610.

Here is a compiled and somewhat tested version of the example code from that answer:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.TaskAttemptID;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl;
import org.apache.hadoop.util.ReflectionUtils;
import java.io.File;

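// Start from an empty configuration and point it at the local file system,
// so the test needs no cluster. "fs.defaultFS" is the current name of the
// deprecated "fs.default.name" key.
Configuration conf = new Configuration(false);
conf.set("fs.defaultFS", "file:///");

// Describe the whole test file as a single split.
File testFile = new File("path/to/file");
Path path = new Path(testFile.getAbsoluteFile().toURI());
FileSplit split = new FileSplit(path, 0, testFile.length(), null);

// Instantiate the custom InputFormat and ask it for a RecordReader.
InputFormat inputFormat = ReflectionUtils.newInstance(MyInputFormat.class, conf);
TaskAttemptContext context = new TaskAttemptContextImpl(conf, new TaskAttemptID());
RecordReader reader = inputFormat.createRecordReader(split, context);

// The framework normally calls initialize(); in a test it must be called by hand.
reader.initialize(split, context);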
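Once the reader is initialized, a test would typically iterate over the records and compare them with the expected values. The following is a minimal sketch of that step, not part of the original answer. It assumes Hadoop 2 or later on the classpath and uses Hadoop's own TextInputFormat as a stand-in for MyInputFormat. The class name RecordReaderSketch and the helper readAllValues are hypothetical names chosen for illustration.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.TaskAttemptID;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl;
import org.apache.hadoop.util.ReflectionUtils;

import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, named for illustration only.
public class RecordReaderSketch {

    // Reads the whole file as one split through the given InputFormat
    // and returns every record value as a string.
    static List<String> readAllValues(
            Class<? extends InputFormat<LongWritable, Text>> formatClass,
            File file) throws Exception {
        Configuration conf = new Configuration(false);
        conf.set("fs.defaultFS", "file:///");

        Path path = new Path(file.getAbsoluteFile().toURI());
        FileSplit split = new FileSplit(path, 0, file.length(), null);

        InputFormat<LongWritable, Text> inputFormat =
                ReflectionUtils.newInstance(formatClass, conf);
        TaskAttemptContext context =
                new TaskAttemptContextImpl(conf, new TaskAttemptID());

        List<String> values = new ArrayList<>();
        RecordReader<LongWritable, Text> reader =
                inputFormat.createRecordReader(split, context);
        try {
            reader.initialize(split, context);
            // nextKeyValue() returns false once the split is exhausted.
            while (reader.nextKeyValue()) {
                // Copy the value: readers may reuse the same Text object.
                values.add(reader.getCurrentValue().toString());
            }
        } finally {
            reader.close();
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Write a small fixture file to read back.
        File file = File.createTempFile("records", ".txt");
        file.deleteOnExit();
        Files.write(file.toPath(),
                "first\nsecond\n".getBytes(StandardCharsets.UTF_8));

        // TextInputFormat stands in for the custom MyInputFormat here.
        List<String> values = readAllValues(TextInputFormat.class, file);
        if (!values.equals(Arrays.asList("first", "second"))) {
            throw new AssertionError("unexpected records: " + values);
        }
    }
}
```

For a custom format, the same loop would be pointed at MyInputFormat.class with the key and value types adjusted, and the check in main would normally become an assertion inside a JUnit test.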