Jus*_*irl 5 java hadoop mapreduce
                 Mapper/Reducer 1 --> (key,value)
                    /    |    \
                   /     |     \
  Mapper/Reducer 2       |       Mapper/Reducer 4
  -> (oKey,oValue)       |       -> (xKey, xValue)
                         |
                         |
                 Mapper/Reducer 3
                 -> (aKey, aValue)
I have a log file that I aggregate with MR1. Mapper2, Mapper3, and Mapper4 take the output of MR1 as their input. The jobs are chained.
MR1 output:
User {infos of user:[{data here},{more data},{etc}]}
..
MR2 output:
timestamp idCount
..
MR3 output:
timestamp loginCount
..
MR4 output:
timestamp someCount
..
I want to combine the outputs of MR2-4. The final output should be:
timestamp idCount loginCount someCount
..
..
..
Is there a way to do this without Pig or Hive? I'm using Java.
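
To make the desired result concrete: the merge described above amounts to a join on the timestamp across the three outputs. Below is a minimal sketch of that logic in plain Java, with no Hadoop dependency, so it only illustrates the pattern and is not a finished job. It assumes each output line is `timestamp<TAB>count`, that a timestamp missing from one output should show up as `0`, and that sorting timestamps as strings is acceptable. The class name `TimestampJoin`, the `MISSING` placeholder, and the sample values in `main` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TimestampJoin {

    // Assumption: value written when a timestamp is absent from one input.
    static final String MISSING = "0";

    /**
     * Joins several lists of "timestamp<TAB>count" lines on the timestamp.
     * sources.get(i) stands in for the output of one job (MR2, MR3, MR4).
     */
    public static List<String> join(List<List<String>> sources) {
        // "Map" step: tag every count with the index of the source it came from.
        // "Shuffle" step: the TreeMap groups by timestamp and keeps keys sorted
        // (lexicographically, because the keys are strings).
        Map<String, String[]> grouped = new TreeMap<>();
        for (int tag = 0; tag < sources.size(); tag++) {
            for (String line : sources.get(tag)) {
                String[] parts = line.split("\t", 2);
                if (parts.length < 2) {
                    continue; // skip malformed lines
                }
                String[] counts = grouped.get(parts[0]);
                if (counts == null) {
                    counts = new String[sources.size()];
                    Arrays.fill(counts, MISSING);
                    grouped.put(parts[0], counts);
                }
                counts[tag] = parts[1];
            }
        }
        // "Reduce" step: one line per timestamp, counts in source order.
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String[]> e : grouped.entrySet()) {
            out.add(e.getKey() + "\t" + String.join("\t", e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from the real logs.
        List<String> mr2 = Arrays.asList("1000\t3", "2000\t5"); // timestamp idCount
        List<String> mr3 = Arrays.asList("1000\t7");            // timestamp loginCount
        List<String> mr4 = Arrays.asList("2000\t1", "1000\t2"); // timestamp someCount
        for (String line : join(Arrays.asList(mr2, mr3, mr4))) {
            System.out.println(line);
        }
    }
}
```

In an actual Hadoop job this would presumably become a reduce-side join: one tagging mapper per input directory (for example registered through `MultipleInputs.addInputPath`) emitting `(timestamp, tag:count)`, and a single reducer that fills the slots the way the last loop does. That mapping is a suggestion under the assumptions above, not something tested against this pipeline.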