Vel*_*ozo 3 sorting hadoop mapreduce
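Hi,
I would like to know how to change the sort order of a simple WordCount program after the reduce task. I have already written a second map step that sorts by value instead of by key, but the output is still in ascending order. Is there a simple way to change the sort order?
Thanks, Vellozo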
If you are using the older API (mapred.*), set the OutputKeyComparatorClass in the job conf:
jobConf.setOutputKeyComparatorClass(ReverseComparator.class);
ReverseComparator can look something like this:
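import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

// Sorts Text keys in descending order by negating Text's own comparison.
static class ReverseComparator extends WritableComparator {
    private static final Text.Comparator TEXT_COMPARATOR = new Text.Comparator();

    public ReverseComparator() {
        super(Text.class);
    }

    // Raw-byte comparison of the serialized keys.
    // The overridden compare declares no checked exceptions, so no try/catch is needed.
    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return -TEXT_COMPARATOR.compare(b1, s1, l1, b2, s2, l2);
    }

    // Object comparison for keys that are already deserialized.
    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        if (a instanceof Text && b instanceof Text) {
            return -((Text) a).compareTo((Text) b);
        }
        return super.compare(a, b);
    }
}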
In the new API (mapreduce.*), I think you need to use the Job.setSortComparatorClass() method.
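To check what a reversed comparator does to the output without running a cluster, here is a plain-JDK sketch. It uses no Hadoop classes, and the class name ReverseSortSketch and the method sortByCountDescending are made up for illustration. It assumes the goal from the question, ordering words by their count with the highest first, and it breaks ties by word only to keep the result deterministic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReverseSortSketch {

    // Hypothetical helper: returns "word<TAB>count" lines, highest count first.
    public static List<String> sortByCountDescending(Map<String, Integer> counts) {
        // Ascending order on the count, comparable to the default sort order.
        Comparator<Map.Entry<String, Integer>> ascending =
                Map.Entry.<String, Integer>comparingByValue();

        // Reversing the comparator flips the order, like negating compare().
        // Equal counts fall back to alphabetical order of the word.
        Comparator<Map.Entry<String, Integer>> descending =
                ascending.reversed()
                         .thenComparing(Map.Entry.<String, Integer>comparingByKey());

        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(descending);

        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            lines.add(e.getKey() + "\t" + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample counts, invented for this example.
        Map<String, Integer> counts = new HashMap<>();
        counts.put("sort", 1);
        counts.put("hadoop", 3);
        counts.put("reduce", 2);
        counts.put("map", 2);

        for (String line : sortByCountDescending(counts)) {
            System.out.println(line);
        }
    }
}
```

In a real job the same reversal would live in the comparator class registered on the job, and the key type it compares has to match the key the sorting job actually emits (the count, if the second job swaps word and count).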