Using RecordReader in Hadoop

Question

Can anyone explain how a RecordReader actually works? How do the methods nextKeyValue(), getCurrentKey(), and getProgress() work once the program starts executing?

Answer 1

(New API): The default Mapper class has a run method that looks like this:

public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKeyValue()) {
        map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
    cleanup(context);
}

The Context.nextKeyValue(), Context.getCurrentKey(), and Context.getCurrentValue() methods are wrappers around the corresponding RecordReader methods. See the source file src/mapred/org/apache/hadoop/mapreduce/MapContext.java.

So this loop runs and calls your Mapper implementation's map(K, V, Context) method once for each record the reader returns.
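To make the delegation concrete, here is a minimal, dependency-free sketch of the same pattern. It does not use the Hadoop classes: SimpleRecordReader, ListLineReader, and SimpleContext are hypothetical stand-ins invented for illustration, and the real MapContext does more than forward calls. It only shows how a run-style loop ends up pulling records from a reader through the context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins for illustration; these are NOT the Hadoop classes.
class RecordReaderSketch {

    /** Minimal reader contract, modeled loosely on the methods named in the question. */
    interface SimpleRecordReader<K, V> {
        boolean nextKeyValue(); // advance to the next record; false once input is exhausted
        K getCurrentKey();      // key of the record loaded by the last nextKeyValue()
        V getCurrentValue();    // value of the record loaded by the last nextKeyValue()
    }

    /** Reads lines from an in-memory list: key = line index, value = line text. */
    static class ListLineReader implements SimpleRecordReader<Long, String> {
        private final List<String> lines;
        private long index = -1; // no record loaded yet

        ListLineReader(List<String> lines) { this.lines = lines; }

        public boolean nextKeyValue() {
            if (index + 1 >= lines.size()) {
                return false;
            }
            index++;
            return true;
        }

        public Long getCurrentKey() { return index; }

        public String getCurrentValue() { return lines.get((int) index); }
    }

    /** Context stand-in: every read call simply forwards to the reader. */
    static class SimpleContext<K, V> {
        private final SimpleRecordReader<K, V> reader;
        final List<String> output = new ArrayList<>();

        SimpleContext(SimpleRecordReader<K, V> reader) { this.reader = reader; }

        boolean nextKeyValue() { return reader.nextKeyValue(); }
        K getCurrentKey() { return reader.getCurrentKey(); }
        V getCurrentValue() { return reader.getCurrentValue(); }
        void write(String record) { output.add(record); }
    }

    /** Example map function: emits "key:VALUE" for each record. */
    static void map(Long key, String value, SimpleContext<Long, String> context) {
        context.write(key + ":" + value.toUpperCase());
    }

    /** Same shape as the run loop quoted in the answer, minus setup and cleanup. */
    static List<String> run(SimpleContext<Long, String> context) {
        while (context.nextKeyValue()) {
            map(context.getCurrentKey(), context.getCurrentValue(), context);
        }
        return context.output;
    }

    public static void main(String[] args) {
        SimpleContext<Long, String> context =
                new SimpleContext<>(new ListLineReader(Arrays.asList("a", "b")));
        System.out.println(run(context)); // prints [0:A, 1:B]
    }
}
```

The point of the sketch is that the mapper never touches the reader directly: it only sees the context, and the loop stops as soon as nextKeyValue() returns false.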

What else, specifically, would you like to know?

What about getProgress()? (2 upvotes)
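On getProgress(): it is not part of the run loop shown in the answer. As far as I know, the framework queries it separately to report how far a task has gotten through its input. A rough sketch of one way a reader over a byte range could compute it is below. RangeReader is a hypothetical class written for this example, not Hadoop source, and the formula (bytes consumed divided by range length, capped at 1.0) is an assumption about a typical line-oriented reader rather than a statement about any specific implementation.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Hadoop source: progress tracking over a byte range [start, end).
class ProgressSketch {

    static class RangeReader {
        private final byte[] data;
        private final int start;
        private final int end;
        private int pos;        // next unread byte
        private String current; // line loaded by the last nextKeyValue()

        RangeReader(byte[] data, int start, int end) {
            this.data = data;
            this.start = start;
            this.end = end;
            this.pos = start;
        }

        /** Loads the next newline-terminated line inside the range, if any. */
        boolean nextKeyValue() {
            if (pos >= end) {
                return false;
            }
            int lineEnd = pos;
            while (lineEnd < end && data[lineEnd] != '\n') {
                lineEnd++;
            }
            current = new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8);
            pos = Math.min(end, lineEnd + 1); // step past the newline
            return true;
        }

        String getCurrentValue() { return current; }

        /** Fraction of the range consumed so far, between 0.0 and 1.0. */
        float getProgress() {
            if (start == end) {
                return 0.0f; // empty range: avoid dividing by zero
            }
            return Math.min(1.0f, (pos - start) / (float) (end - start));
        }
    }

    public static void main(String[] args) {
        byte[] data = "aa\nbb\n".getBytes(StandardCharsets.UTF_8);
        RangeReader reader = new RangeReader(data, 0, data.length);
        System.out.println(reader.getProgress()); // 0.0 before any record is read
        while (reader.nextKeyValue()) {
            System.out.println(reader.getCurrentValue() + " -> " + reader.getProgress());
        }
    }
}
```

Because progress here is derived purely from the read position, it only changes when nextKeyValue() advances the reader; calling getProgress() has no effect on which record is current.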
