I need to create a Collator that corresponds to https://www.w3.org/2005/xpath-functions/collation/html-ascii-case-insensitive/, that is, one that ignores case for the ASCII characters A-Z and a-z when comparing.
I tried the following ICU4J RuleBasedCollator:
final RuleBasedCollator collator =
    new RuleBasedCollator("&a=A, b=B, c=C, d=D, e=E, f=F, g=G, h=H, "
        + "i=I, j=J, k=K, l=L, m=M, n=N, o=O, p=P, q=Q, r=R, s=S, t=T, "
        + "u=U, v=V, u=U, v=V, w=W, x=X, y=Y, z=Z").freeze();
However, the following search seems to fail, although I expect it to succeed (that is, to return true):
final SearchIterator searchIterator = new StringSearch(
    "pu", new StringCharacterIterator("iNPut"), collator);
return searchIterator.first() >= 0;
What is missing from my rules?
com.ibm.icu.text.RuleBasedCollator#compare
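Returns an integer value: less than zero if source is less than target, zero if source and target are equal, and greater than zero if source is greater than target.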
String a = "Pu";
String b = "pu";
RuleBasedCollator c1 = (RuleBasedCollator) Collator.getInstance(new Locale("en", "US", ""));
RuleBasedCollator c2 = new RuleBasedCollator("& p=P");
System.out.println(c1.compare(a, b) == 0);
System.out.println(c2.compare(a, b) == 0);
Output
======
false
true
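A likely reason the locale collator reports a difference is that collators default to tertiary strength, at which case counts as a difference. The sketch below illustrates this with the JDK's java.text.Collator as a stand-in for the ICU4J class (an assumption made so the example runs without extra dependencies); the class and method names are hypothetical. Note that secondary strength ignores case for all letters, not only ASCII A-Z, so it is broader than the collation asked about.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical demo class; uses the JDK collator, not ICU4J.
final class StrengthDemo {

    // Compare at tertiary strength (the default): case differences count.
    static int tertiaryCompare(String a, String b) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(Collator.TERTIARY);
        return collator.compare(a, b);
    }

    // Compare at secondary strength: case differences are ignored.
    static int secondaryCompare(String a, String b) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(Collator.SECONDARY);
        return collator.compare(a, b);
    }

    public static void main(String[] args) {
        System.out.println(tertiaryCompare("Pu", "pu") == 0);  // false
        System.out.println(secondaryCompare("Pu", "pu") == 0); // true
        System.out.println(tertiaryCompare("a", "b") < 0);     // true
    }
}
```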
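So the rules do not appear to be the problem; the issue seems to lie in the SearchIterator code.
If you do not have to use SearchIterator, you could write your own "contains" method, perhaps something like this: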
boolean contains(String a, String b, RuleBasedCollator c) {
    // Slide a window of b.length() characters over a and compare each
    // window with the collator. Assumes a match has the same length as b.
    for (int i = 0; i + b.length() <= a.length(); i++) {
        if (c.compare(a.substring(i, i + b.length()), b) == 0) {
            return true;
        }
    }
    return false;
}
Probably not the best code in the world, but you get the idea.
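If a Collator instance is not strictly required, the same matching rule can also be sketched in plain Java without ICU4J. This is only an illustration of the ASCII-only folding described in the question, not a replacement for a real Collator; the class and method names are hypothetical.

```java
// Hypothetical helper: substring test that ignores case for ASCII A-Z only.
final class AsciiCaseInsensitive {

    // Fold only ASCII 'A'..'Z' to lowercase; every other char is left untouched.
    static char foldAscii(char ch) {
        return (ch >= 'A' && ch <= 'Z') ? (char) (ch + ('a' - 'A')) : ch;
    }

    // True if pattern equals the region of text starting at offset,
    // after ASCII-only case folding. The caller guarantees the region fits.
    static boolean regionEquals(String text, int offset, String pattern) {
        for (int i = 0; i < pattern.length(); i++) {
            if (foldAscii(text.charAt(offset + i)) != foldAscii(pattern.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Naive substring search: try every start offset where pattern still fits.
    static boolean contains(String text, String pattern) {
        for (int start = 0; start + pattern.length() <= text.length(); start++) {
            if (regionEquals(text, start, pattern)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(contains("iNPut", "pu"));            // true
        System.out.println(contains("input", "PX"));            // false
        // Non-ASCII letters are not folded, so these differ in case only
        // outside A-Z and do not match.
        System.out.println(contains("caf\u00e9", "CAF\u00c9")); // false
    }
}
```

Because only A-Z is folded, a non-ASCII pair such as é and É stays distinct, which matches the intent of the html-ascii-case-insensitive collation.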