k1e*_*ran 4 java regex string tokenize guava
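I am trying to extract some numbers from a string efficiently and have tried:
The results were:
Is there another, faster way that you would recommend?
I am aware of similar earlier questions, such as "How to extract multiple integers from a String in Java?", but my focus is on making this fast (while staying maintainable and simple), because it happens a lot.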
Edit: here is my final result, courtesy of Andrea Ligios below:
import org.junit.Test;
import com.google.common.base.CharMatcher;
import com.google.common.base.Splitter;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Sample {

    final static int COUNT = 50000000;
    public static final String INPUT = "FOO-1-9-BAR1"; // I want 1, 9, 1

    @Test
    public void extractNumbers() {
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < COUNT; i++) {
            // Output is list of 1, 9, 1
            Demo.extractNumbersViaGoogleSplitter(INPUT);
        }
        System.out.println("Total execution time (ms) via Google Splitter: " + (System.currentTimeMillis() - startTime));

        startTime = System.currentTimeMillis();
        for (int i = 0; i < COUNT; i++) {
            // Output is list of 1, 9, 1
            Demo.extractNumbersViaRegEx(INPUT);
        }
        System.out.println("Total execution time (ms) Regular Expression: " + (System.currentTimeMillis() - startTime));
    }
}

class Demo {

    static List<Integer> extractNumbersViaGoogleSplitter(final String text) {
        Iterator<String> iter = Splitter.on(CharMatcher.JAVA_DIGIT.negate()).trimResults().omitEmptyStrings().split(text).iterator();
        final List<Integer> result = new ArrayList<Integer>();
        while (iter.hasNext()) {
            result.add(Integer.parseInt(iter.next()));
        }
        return result;
    }

    /**
     * Matches all the numbers in a string, as individual groups. e.g.
     * FOO-1-BAR1-1-12 matches 1,1,1,12.
     */
    private static final Pattern NUMBERS = Pattern.compile("(\\d+)");

    static List<Integer> extractNumbersViaRegEx(final String source) {
        final Matcher matcher = NUMBERS.matcher(source);
        final List<Integer> result = new ArrayList<Integer>();
        if (matcher.find()) {
            do {
                result.add(Integer.parseInt(matcher.group(0)));
            } while (matcher.find());
            return result;
        }
        return result;
    }
}
Here is a very fast algorithm:
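// Requires java.util.ArrayList and java.util.List.
public List<Integer> extractIntegers(String input)
{
    List<Integer> result = new ArrayList<Integer>();
    int index = 0;
    int v = 0; // value of the number currently being read
    int l = 0; // count of digits read so far for the current number
    while (index < input.length())
    {
        char c = input.charAt(index);
        if (Character.isDigit(c))
        {
            // Append the digit to the running value.
            v *= 10;
            v += c - '0';
            l++;
        } else if (l > 0)
        {
            // A non-digit ends the current number: store it and reset.
            result.add(v);
            l = 0;
            v = 0;
        }
        index++;
    }
    // Store a number that runs up to the end of the input.
    if (l > 0)
    {
        result.add(v);
    }
    return result;
}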
This code took 3672 ms on my machine for "FOO-1-9-BAR1" and 50000000 runs. I am using a 2.3 GHz core.
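
One caveat with the method above: Character.isDigit also returns true for non-ASCII Unicode digits, and for those `c - '0'` does not give the digit's value. Below is a minimal sketch of a variant that only accepts the ASCII digits '0' to '9'. The class name AsciiIntegerScanner is hypothetical, chosen for this illustration, and the sketch assumes every run of digits fits in an int. It has not been benchmarked, so the timing above does not carry over to it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this illustration.
final class AsciiIntegerScanner {

    // Scans the input once and collects each run of ASCII digits as an int.
    // Assumption: every run fits in an int; longer runs overflow silently.
    static List<Integer> extractIntegers(String input) {
        List<Integer> result = new ArrayList<>();
        int value = 0;            // value of the number currently being read
        boolean inNumber = false; // true while inside a run of digits
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= '0' && c <= '9') {
                // ASCII digit: append it to the running value.
                value = value * 10 + (c - '0');
                inNumber = true;
            } else if (inNumber) {
                // A non-digit ends the current number: store it and reset.
                result.add(value);
                value = 0;
                inNumber = false;
            }
        }
        // Store a number that runs up to the end of the input.
        if (inNumber) {
            result.add(value);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extractIntegers("FOO-1-9-BAR1")); // prints [1, 9, 1]
    }
}
```

Like the original, this treats '-' as a separator, so negative numbers are not recognised; "FOO-1" yields 1, not -1.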