DNA*_*DNA 35 java hadoop mapreduce
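My map tasks need some configuration data, which I would like to distribute via the Distributed Cache.
The Hadoop MapReduce tutorial shows the usage of the DistributedCache class, roughly as follows: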
// In the driver
JobConf conf = new JobConf(getConf(), WordCount.class);
...
DistributedCache.addCacheFile(new Path(filename).toUri(), conf);
// In the mapper
Path[] myCacheFiles = DistributedCache.getLocalCacheFiles(job);
...
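However, DistributedCache is marked as deprecated in Hadoop 2.2.0.
What is the new preferred way to achieve this? Is there an up-to-date example or tutorial covering this API?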
小智 50
The Distributed Cache API can be found in the Job class itself. See the documentation here: http://hadoop.apache.org/docs/stable2/api/org/apache/hadoop/mapreduce/Job.html
The code should look something like this:
Job job = Job.getInstance(); // the new Job() constructor is deprecated in Hadoop 2.x
...
job.addCacheFile(new Path(filename).toUri());
In your mapper code:
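// Note: getLocalCacheFiles() is itself marked deprecated in Hadoop 2.x;
// context.getCacheFiles() returns the cache URIs instead.
Path[] localPaths = context.getLocalCacheFiles();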
...
tol*_*gap 23
To expand on @jtravaglini's answer, the preferred way to use the DistributedCache with YARN/MapReduce 2 is as follows:
In your driver, use Job.addCacheFile():
public int run(String[] args) throws Exception {
    Configuration conf = getConf();
    Job job = Job.getInstance(conf, "MyJob");
    job.setMapperClass(MyMapper.class);
    // ...
    // Mind the # sign after the absolute file location.
    // You will be using the name after the # sign as your
    // file name in your Mapper/Reducer
    job.addCacheFile(new URI("/user/yourname/cache/some_file.json#some"));
    job.addCacheFile(new URI("/user/yourname/cache/other_file.json#other"));
    return job.waitForCompletion(true) ? 0 : 1;
}
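To make the `#` convention in the driver comments concrete, here is a minimal sketch using only `java.net.URI` from the JDK. `CacheLinkNames` and `linkName` are hypothetical names, not part of Hadoop. The fallback to the last path segment when no fragment is given is my assumption about how the local name is chosen, so verify it against your Hadoop version.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper (not a Hadoop class): mirrors the naming rule from the
// driver comments, where the text after '#' becomes the local file name.
final class CacheLinkNames {
    private CacheLinkNames() {}

    // Returns the URI fragment if present; otherwise falls back to the last
    // path segment (assumed behavior, not confirmed by the answer above).
    static String linkName(URI cacheUri) {
        String fragment = cacheUri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = cacheUri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) throws URISyntaxException {
        URI withFragment = new URI("/user/yourname/cache/some_file.json#some");
        URI withoutFragment = new URI("/user/yourname/cache/other_file.json");
        System.out.println(linkName(withFragment));    // prints "some"
        System.out.println(linkName(withoutFragment)); // prints "other_file.json"
    }
}
```

In other words, the name after `#` is what you would open as `./some` in the task's working directory.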
In your Mapper/Reducer, override the setup(Context context) method:
@Override
protected void setup(
        Mapper<LongWritable, Text, Text, Text>.Context context)
        throws IOException, InterruptedException {
    if (context.getCacheFiles() != null
            && context.getCacheFiles().length > 0) {
        File some_file = new File("./some");
        File other_file = new File("./other");
        // Do things to these two files, like read them
        // or parse as JSON or whatever.
    }
    super.setup(context);
}
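The "do things to these two files" step in the setup() override above could be sketched as below, using only JDK classes. `LocalCacheFiles` and `readAll` are hypothetical names introduced for illustration. The helper assumes the cached file has already been linked into the given directory under its `#` name, and it leaves JSON parsing to whatever library you prefer.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper (not a Hadoop class) for reading a localized cache file.
final class LocalCacheFiles {
    private LocalCacheFiles() {}

    // Reads the whole file named linkName inside workingDir as UTF-8 text.
    // Inside a task you would likely pass new File(".") as workingDir.
    static String readAll(File workingDir, String linkName) throws IOException {
        File local = new File(workingDir, linkName);
        if (!local.isFile()) {
            // Fail loudly so a missing cache entry is noticed at setup time.
            throw new FileNotFoundException("Cache file not found: " + local);
        }
        byte[] bytes = Files.readAllBytes(local.toPath());
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Inside setup(), a call such as `LocalCacheFiles.readAll(new File("."), "some")` would then give you the file contents once, so the map() calls do not have to re-read the file.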
The new DistributedCache API for YARN/MR2 can be found in the org.apache.hadoop.mapreduce.Job class:
Job.addCacheFile()
Unfortunately, there are not yet many comprehensive tutorial-style examples of this.