hot*_*oup 3 java sorting collections junit
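Please note: I mention JUnit here and provide an SSCCE code example that uses it, but this is really a Java collections question that anyone with Java experience can answer, regardless of their experience with JUnit.
I'm on Java 8 and trying to sort a list of strings, but I'm getting some unexpected behavior from Collections.sort(myList), and I'd like to understand what is going on.
Here is my full unit test: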
@RunWith(MockitoJUnitRunner.class)
public class SorterTest {
    @Test
    public void should_sort_correctly_including_capitalization_rules() {
        // given
        String[] actualNames = new String[] {
            "DCME",
            "CCME",
            "ACME",
            "BCME",
            "AGME",
            "AACME",
            "aCME",
            "Acme",
            "AaCME",
            "aACME",
        };
        List<String> actual = Arrays.asList(actualNames);
        // the order I would *expect* them to sort into...
        String[] expectedNames = new String[] {
            "aACME",
            "aCME",
            "AaCME",
            "AACME",
            "Acme",
            "ACME",
            "AGME",
            "BCME",
            "CCME",
            "DCME"
        };
        List<String> expected = Arrays.asList(expectedNames);
        // when
        Collections.sort(actual);
        // then
        assertTrue(actual.equals(expected));
    }
}
The JUnit assertTrue here fails at runtime because the actual list is sorted as:
0 = "AACME"
1 = "ACME"
2 = "AGME"
3 = "AaCME"
4 = "Acme"
5 = "BCME"
6 = "CCME"
7 = "DCME"
8 = "aACME"
9 = "aCME"
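That ^^^ is debugger output; the numbers are each element's list index.
So, for some reason, Collections.sort considers the string "BCME" lexicographically "lower" than "aCME" (it appears earlier in the sorted list), which seems plain crazy to me. :-)
I should mention that I will only be dealing with ASCII characters in UTF-8 here, and my application performs pre-validation to ensure that every character in each string/name is in [a-z][A-Z].
Either way, the sort order I am looking for Java code to apply is: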
aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ
Given those ordering rules, the list in my unit test should sort as:
Sort Order    Reason why it comes after the last one in the list
================================================================
aACME
aCME          1st letter is 'a' but 2nd letter is 'C' and A < C
AaCME         1st letter is 'A' and a < A
AACME         1st letter is 'A' and 2nd letter is 'A' and a < A
Acme          1st letter is 'A' but 2nd letter is 'c' and A < c
ACME          1st letter is 'A' but 2nd letter is 'C' and c < C
AGME          1st letter is 'A' but 2nd letter is 'G' and C < G
BCME          1st letter is 'B' and aA < bB
CCME          1st letter is 'C' and bB < cC
DCME          1st letter is 'D' and cC < dD
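How can I change the code above so that the unit test passes and the list sorts the way I need?
Java has the RuleBasedCollator class, which allows custom character ordering.
In this case lowercase letters should come before uppercase letters, so the rules might look like this: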
static RuleBasedCollator lowerFirst() {
    try {
        // Each "<" introduces the next character in the order: a, A, b, B, ..., z, Z
        return new RuleBasedCollator(
            "< a < A < b < B < c < C < d < D < e < E < f < F < g < G < h < H < i < I < j < J < "
            + "k < K < l < L < m < M < n < N < o < O < p < P < q < Q < r < R < s < S < t < T < "
            + "u < U < v < V < w < W < x < X < y < Y < z < Z"
        );
    } catch (ParseException parsex) {
        throw new IllegalArgumentException("Failed to create lowerFirst collator", parsex);
    }
}
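To see what such rules change, it can help to compare a single pair under both orderings. String's natural ordering (the one Collections.sort(myList) uses) compares UTF-16 code units, so every uppercase ASCII letter sorts before every lowercase one; that is why "BCME" lands ahead of "aCME". A minimal sketch of the contrast follows. The names OrderComparison, naturalCompare and lowerFirstCompare are illustrative and not part of the original answer, and the rule string is built in a loop as an assumption of convenience rather than written out by hand:

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

public class OrderComparison {
    // Builds "< a < A < b < B ... < z < Z" for all 26 letters (illustrative helper).
    static RuleBasedCollator lowerFirst() {
        StringBuilder rules = new StringBuilder();
        for (char c = 'a'; c <= 'z'; c++) {
            rules.append("< ").append(c).append(" < ").append(Character.toUpperCase(c)).append(' ');
        }
        try {
            return new RuleBasedCollator(rules.toString().trim());
        } catch (ParseException e) {
            throw new IllegalArgumentException("Failed to create lowerFirst collator", e);
        }
    }

    // Natural String order: compares UTF-16 code units ('B' is 66, 'a' is 97).
    static int naturalCompare(String left, String right) {
        return Integer.signum(left.compareTo(right));
    }

    // Rule-based order: 'a' is declared before 'A', which is before 'b', and so on.
    static int lowerFirstCompare(String left, String right) {
        return Integer.signum(lowerFirst().compare(left, right));
    }

    public static void main(String[] args) {
        // Natural order puts "aCME" after "BCME" (positive sign).
        System.out.println("natural:    " + naturalCompare("aCME", "BCME"));
        // The collator puts "aCME" before "BCME" (negative sign).
        System.out.println("lowerFirst: " + lowerFirstCompare("aCME", "BCME"));
    }
}
```

Building the collator on every call keeps the sketch short; in real code it would probably be created once and reused.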
Test:
String[] names = new String[] {
    "DCME", "CCME", "ACME", "BCME", "AGME",
    "AACME", "aCME", "Acme", "AaCME", "aACME",
};
String[] expected = new String[] {
    "aACME", "aCME", "AaCME", "AACME", "Acme",
    "ACME", "AGME", "BCME", "CCME", "DCME"
};
Arrays.sort(names, lowerFirst());
System.out.println("sorted: " + Arrays.toString(names));
System.out.println("expected: " + Arrays.toString(expected));
Output:
sorted: [aACME, aCME, AaCME, AACME, Acme, ACME, AGME, BCME, CCME, DCME]
expected: [aACME, aCME, AaCME, AACME, Acme, ACME, AGME, BCME, CCME, DCME]
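The unit test in the question sorts a List<String> rather than an array. Because Collator implements Comparator<Object>, the same kind of collator can presumably be passed to List.sort (or to Collections.sort(list, collator)) in place of the no-argument Collections.sort(actual). A sketch under that assumption; ListSortSketch and sortLowerFirst are hypothetical names, and the rule string here lists all 26 letter pairs:

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListSortSketch {
    // Lowercase-before-uppercase rules for a..z, written out in full.
    static RuleBasedCollator lowerFirst() {
        try {
            return new RuleBasedCollator(
                "< a < A < b < B < c < C < d < D < e < E < f < F < g < G < h < H < i < I < "
                + "j < J < k < K < l < L < m < M < n < N < o < O < p < P < q < Q < r < R < "
                + "s < S < t < T < u < U < v < V < w < W < x < X < y < Y < z < Z"
            );
        } catch (ParseException e) {
            throw new IllegalArgumentException("Failed to create lowerFirst collator", e);
        }
    }

    // Returns a sorted copy so the caller's list is left untouched.
    static List<String> sortLowerFirst(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(lowerFirst()); // a Collator is accepted as Comparator<? super String>
        return copy;
    }

    public static void main(String[] args) {
        List<String> actual = Arrays.asList(
            "DCME", "CCME", "ACME", "BCME", "AGME",
            "AACME", "aCME", "Acme", "AaCME", "aACME");
        // prints [aACME, aCME, AaCME, AACME, Acme, ACME, AGME, BCME, CCME, DCME]
        System.out.println(sortLowerFirst(actual));
    }
}
```

In the original test, the equivalent change would be replacing Collections.sort(actual) with Collections.sort(actual, lowerFirst()), which sorts the list in place.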