Pau*_*lin 73 java sorting string algorithm comparison
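I need to write a Java Comparator class that compares Strings, but with one twist. If the two strings being compared are identical at the beginning and at the end, and the differing middle part is an integer, then the comparison should be based on the numeric values of those integers. For example, I want the following strings to end up in the order they are shown:
As you can see, the strings may contain other integers as well, so I can't just use a regular expression to break out any integer. I'm thinking of walking the strings from the beginning until I find a bit that doesn't match, then walking in from the end until I find a bit that doesn't match, and then matching the middle bits against the regular expression "[0-9]+". If they match, I do a numeric comparison; otherwise, a lexical comparison.
Is there a better way?
Update: I don't think I can guarantee that the other numbers in the string (the ones that may match) have no spaces around them, or that the numbers that differ do have spaces.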
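The approach described in the question can be sketched roughly as follows. This is a minimal sketch, not a tested solution: the class name `MiddleNumberComparator` is hypothetical, and the step that hands shared digits back to the middle part is an assumption added here (without it, "x9" and "x19" would share the suffix "9" and the middle parts would be "" and "1"). `BigInteger` is used only to sidestep overflow on long digit runs. Whether this pairwise rule yields a consistent total order for arbitrary inputs has not been verified.

```java
import java.math.BigInteger;
import java.util.Comparator;

// Hypothetical name; sketches the prefix/suffix idea from the question.
final class MiddleNumberComparator implements Comparator<String> {

    @Override
    public int compare(String a, String b) {
        int max = Math.min(a.length(), b.length());

        // Length of the common prefix.
        int prefix = 0;
        while (prefix < max && a.charAt(prefix) == b.charAt(prefix)) {
            prefix++;
        }

        // Length of the common suffix, never overlapping the prefix.
        int suffix = 0;
        while (suffix < max - prefix
                && a.charAt(a.length() - 1 - suffix) == b.charAt(b.length() - 1 - suffix)) {
            suffix++;
        }

        // Assumption beyond the question: give shared digits back to the middle
        // so that "x9" vs "x19" compares 9 with 19 rather than "" with "1".
        while (prefix > 0 && isAsciiDigit(a.charAt(prefix - 1))) {
            prefix--;
        }
        while (suffix > 0 && isAsciiDigit(a.charAt(a.length() - suffix))) {
            suffix--;
        }

        String midA = a.substring(prefix, a.length() - suffix);
        String midB = b.substring(prefix, b.length() - suffix);

        if (isDigits(midA) && isDigits(midB)) {
            int byValue = new BigInteger(midA).compareTo(new BigInteger(midB));
            if (byValue != 0) {
                return byValue;
            }
        }
        // Non-numeric middle, or equal values such as "05" and "5".
        return a.compareTo(b);
    }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // True only for a non-empty run of ASCII digits, i.e. the "[0-9]+" test.
    private static boolean isDigits(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!isAsciiDigit(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, `compare("img2.png", "img10.png")` is negative because the middle parts "2" and "10" are compared by value, while two strings whose middle parts are not both digit runs fall back to `String.compareTo`.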
ScA*_*er2 98
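From the website:
"People sort strings with numbers differently than software does. Most sorting algorithms compare ASCII values, which produces an ordering that is inconsistent with human logic. Here's how to fix it."
Edit: Here is a link to a Java Comparator implementation from that site.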
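The mismatch the quote describes is easy to reproduce with the default `String` ordering. The snippet below is only an illustration; the class name `AsciiOrderDemo` and the file names are made up for the example.

```java
import java.util.Arrays;

// Hypothetical demo class: shows the default (character-code) ordering.
class AsciiOrderDemo {

    // Returns a sorted copy using String's natural, character-by-character order.
    static String[] sortedCopy(String[] input) {
        String[] copy = input.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        String[] names = {"file2", "file10", "file1"};
        // "file10" lands before "file2" because '1' < '2' at the first differing character.
        System.out.println(Arrays.toString(sortedCopy(names)));
    }
}
```

A human reader would expect file1, file2, file10; the default comparison yields file1, file10, file2.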
Phi*_*Lho 12
Interesting little challenge, I enjoyed solving it.
Here is my take on the problem:
// Imports needed by the fragments below
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String[] strs =
{
    "eee 5 ddd jpeg2001 eee",
    "eee 123 ddd jpeg2000 eee",
    "ddd",
    "aaa 5 yy 6",
    "ccc 555",
    "bbb 3 ccc",
    "bbb 9 a",
    "",
    "eee 4 ddd jpeg2001 eee",
    "ccc 11",
    "bbb 12 ccc",
    "aaa 5 yy 22",
    "aaa",
    "eee 3 ddd jpeg2000 eee",
    "ccc 5",
};

// Matches either a run of digits or a run of non-digits
Pattern splitter = Pattern.compile("(\\d+|\\D+)");

public class InternalNumberComparator implements Comparator
{
    public int compare(Object o1, Object o2)
    {
        // I deliberately use the Java 1.4 syntax,
        // all this can be improved with 1.5's generics
        String s1 = (String)o1, s2 = (String)o2;
        // We split each string as runs of number/non-number strings
        ArrayList sa1 = split(s1);
        ArrayList sa2 = split(s2);
        // Nothing or different structure
        if (sa1.size() == 0 || sa1.size() != sa2.size())
        {
            // Just compare the original strings
            return s1.compareTo(s2);
        }
        int i = 0;
        String si1 = "";
        String si2 = "";
        // Compare beginning of string
        for (; i < sa1.size(); i++)
        {
            si1 = (String)sa1.get(i);
            si2 = (String)sa2.get(i);
            if (!si1.equals(si2))
                break; // Until we find a difference
        }
        // No difference found?
        if (i == sa1.size())
            return 0; // Same strings!

        // Try to convert the different run of characters to number
        int val1, val2;
        try
        {
            val1 = Integer.parseInt(si1);
            val2 = Integer.parseInt(si2);
        }
        catch (NumberFormatException e)
        {
            return s1.compareTo(s2); // Strings differ on a non-number
        }

        // Compare remainder of string
        for (i++; i < sa1.size(); i++)
        {
            si1 = (String)sa1.get(i);
            si2 = (String)sa2.get(i);
            if (!si1.equals(si2))
            {
                return s1.compareTo(s2); // Strings differ
            }
        }

        // Here, the strings differ only on a number
        if (val1 == val2)
            return s1.compareTo(s2); // Same value, different digits (e.g. "05" vs "5")
        return val1 < val2 ? -1 : 1;
    }

    // Splits a string into alternating runs of digits and non-digits
    ArrayList split(String s)
    {
        ArrayList r = new ArrayList();
        Matcher matcher = splitter.matcher(s);
        while (matcher.find())
        {
            String m = matcher.group(1);
            r.add(m);
        }
        return r;
    }
}

Arrays.sort(strs, new InternalNumberComparator());
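This algorithm needs much more testing, but it seems to behave rather nicely.
[EDIT] I added some more comments to be clearer. I see there are many more answers than when I started coding this... But I hope I provided a good starting point and/or some ideas.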
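To make the splitting step in this answer concrete, here is a small sketch of what a digit/non-digit split produces for one of the sample strings. It uses generics rather than the Java 1.4 style of the answer, and the class name `RunSplitDemo` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: splits a string into runs of digits and non-digits.
class RunSplitDemo {

    // Same idea as the answer's pattern: a run of digits or a run of non-digits.
    private static final Pattern RUNS = Pattern.compile("\\d+|\\D+");

    static List<String> split(String s) {
        List<String> runs = new ArrayList<>();
        Matcher matcher = RUNS.matcher(s);
        while (matcher.find()) {
            runs.add(matcher.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        // Yields five runs: "eee ", "5", " ddd jpeg", "2001", " eee"
        for (String run : split("eee 5 ddd jpeg2001 eee")) {
            System.out.println("[" + run + "]");
        }
    }
}
```

Note that whitespace stays attached to the non-digit runs, so two strings only count as having the same structure when their spacing around the numbers also lines up.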
小智 6
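The implementation I propose here is simple and efficient. It does not allocate any extra memory, directly or indirectly, because it avoids regular expressions and methods such as substring(), split(), toCharArray(), etc.
This implementation first walks both strings to find the first character at which they differ, at maximum speed and without any special processing. The number-specific comparison is triggered only when both of these characters are digits. A side effect of this implementation is that a digit is considered greater than other letters, contrary to the default lexicographic order.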
public static final int compareNatural (String s1, String s2)
{
    // Skip all identical characters
    int len1 = s1.length();
    int len2 = s2.length();
    int i;
    char c1, c2;
    for (i = 0, c1 = 0, c2 = 0; (i < len1) && (i < len2) && (c1 = s1.charAt(i)) == (c2 = s2.charAt(i)); i++);

    // Check end of string
    if (c1 == c2)
        return(len1 - len2);

    // Check digit in first string
    if (Character.isDigit(c1))
    {
        // Check digit only in first string
        if (!Character.isDigit(c2))
            return(1);

        // Scan all integer digits
        int x1, x2;
        for (x1 = i + 1; (x1 < len1) && Character.isDigit(s1.charAt(x1)); x1++);
        for (x2 = i + 1; (x2 < len2) && Character.isDigit(s2.charAt(x2)); x2++);

        // Longer integer wins, first digit otherwise
        return(x2 == x1 ? c1 - c2 : x1 - x2);
    }

    // Check digit only in second string
    if (Character.isDigit(c2))
        return(-1);

    // No digits
    return(c1 - c2);
}
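The "longer integer wins, first digit otherwise" rule can be looked at in isolation. The helper below is a sketch, not part of the answer: `compareDigitRuns` and `DigitRunCompare` are hypothetical names, and it assumes both arguments are non-empty runs of ASCII digits. Unlike the answer's code, it also skips leading zeros before comparing lengths, which is an addition made here for illustration.

```java
// Hypothetical helper: compares two digit runs by value without parsing them.
class DigitRunCompare {

    // Assumes a and b are non-empty and contain only the characters '0'..'9'.
    static int compareDigitRuns(String a, String b) {
        // Skip leading zeros, but always keep at least one digit.
        int i = 0;
        int j = 0;
        while (i < a.length() - 1 && a.charAt(i) == '0') {
            i++;
        }
        while (j < b.length() - 1 && b.charAt(j) == '0') {
            j++;
        }

        // With leading zeros gone, the longer run is the larger number.
        int lenA = a.length() - i;
        int lenB = b.length() - j;
        if (lenA != lenB) {
            return lenA - lenB;
        }

        // Same length: the first differing digit decides.
        for (int k = 0; k < lenA; k++) {
            int diff = a.charAt(i + k) - b.charAt(j + k);
            if (diff != 0) {
                return diff;
            }
        }
        return 0;
    }
}
```

Because no number is ever parsed, this style of comparison cannot overflow, however long the digit runs are.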
I came up with a quite simple implementation in Java using regular expressions:
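import java.math.BigInteger;
import java.util.Comparator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static Comparator<String> naturalOrdering() {
    // Group 1 captures a run of digits, group 2 a run of non-digits
    final Pattern compile = Pattern.compile("(\\d+)|(\\D+)");
    return (s1, s2) -> {
        final Matcher matcher1 = compile.matcher(s1);
        final Matcher matcher2 = compile.matcher(s2);
        while (true) {
            final boolean found1 = matcher1.find();
            final boolean found2 = matcher2.find();
            if (!found1 || !found2) {
                // One string ran out of runs: the shorter one sorts first
                return Boolean.compare(found1, found2);
            } else if (!matcher1.group().equals(matcher2.group())) {
                if (matcher1.group(1) == null || matcher2.group(1) == null) {
                    // At least one run is not numeric: compare as text
                    return matcher1.group().compareTo(matcher2.group());
                } else {
                    // Both runs are numeric: compare by value. BigInteger avoids the
                    // NumberFormatException that Integer.valueOf throws beyond the int range.
                    return new BigInteger(matcher1.group(1)).compareTo(new BigInteger(matcher2.group(1)));
                }
            }
        }
    };
}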
Here is how it works:
final List<String> strings = Arrays.asList("x15", "xa", "y16", "x2a", "y11", "z", "z5", "x2b", "z");
strings.sort(naturalOrdering());
System.out.println(strings);
[x2a, x2b, x15, xa, y11, y16, z, z, z5]