Dav*_*rry 10 java regex pattern-matching
Using Java and regex to parse an arbitrary string and find repeated sequences.
Consider the string:
aaabbaaacccbb
I want a regular expression that finds all of the following matches in the string above:
aaabbaaacccbb
^^^  ^^^

aaabbaaacccbb
   ^^      ^^
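What regular expression checks a string for any repeated sequences of characters and returns those repeated sequences as groups, so that group 1 = aaa and group 2 = bb? Note that this is only an example string; any repeated characters are valid: RonRonJoeJoe ......,,,,, ,,
This does it: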
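import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        String s = "aaabbaaacccbb";
        find(s);
        String s1 = "RonRonRonJoeJoe .... ,,,,";
        find(s1);
        System.err.println("---");
        // No sequence is immediately followed by a copy of itself, so nothing is printed.
        String s2 = "RonBobRonJoe";
        find(s2);
    }

    // Prints every run in which a sequence is immediately repeated at least once.
    private static void find(String s) {
        // (.+) captures a sequence; \\1+ requires one or more adjacent copies of it.
        Matcher m = Pattern.compile("(.+)\\1+").matcher(s);
        while (m.find()) {
            System.err.println(m.group());
        }
    }
}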
OUTPUT:
aaa
bb
aaa
ccc
bb
RonRonRon
JoeJoe
....
,,,,
---
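To make the roles of the whole match and the captured group in `(.+)\1+` concrete, here is a minimal sketch that reports, for each match, the full run (`m.group()`), the repeating unit (`m.group(1)`), and how many times the unit occurs. The class name `RepeatUnits` and the method `describe` are hypothetical names chosen for this illustration, not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatUnits {
    // Same pattern as in the answer: group 1 is the unit, \1+ demands adjacent copies.
    private static final Pattern REPEAT = Pattern.compile("(.+)\\1+");

    // Returns one "run = unit x count" entry per match, in order of appearance.
    static List<String> describe(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = REPEAT.matcher(s);
        while (m.find()) {
            String run = m.group();   // the whole repeated run, e.g. "RonRonRon"
            String unit = m.group(1); // the captured unit, e.g. "Ron"
            out.add(run + " = " + unit + " x " + (run.length() / unit.length()));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(describe("RonRonRonJoeJoe")); // [RonRonRon = Ron x 3, JoeJoe = Joe x 2]
        System.out.println(describe("aaabb"));           // [aaa = a x 3, bb = b x 2]
    }
}
```

Because `.+` is greedy, the captured unit is the longest one that still has a copy directly after it; for the input "aaaa" the unit is "aa" (two copies) rather than "a" (four copies).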
You can use a regex based on a positive lookahead:
((\\w)\\2+)(?=.*\\1)
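import java.util.regex.Matcher;
import java.util.regex.Pattern;

String elem = "aaabbaaacccbb";
// Group 2 is one word character; the back-reference extends it into a run (group 1).
// The lookahead keeps the run only if the same run occurs again later in the string.
String regex = "((\\w)\\2+)(?=.*\\1)";
Pattern p = Pattern.compile(regex);
Matcher matcher = p.matcher(elem);
for (int i = 1; matcher.find(); i++) {
    System.out.println("Group # " + i + " got: " + matcher.group(1));
}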
Group # 1 got: aaa
Group # 2 got: bb
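The lookahead pattern reports a run only where it reappears later, and `\w` limits it to runs of a single word character. As an alternative sketch (a different technique, not the method used in the answer above), the runs found by `(.+)\1+` can be counted in a map and only those seen more than once kept. The class name `RecurringRuns` and the method `recurring` are hypothetical names introduced for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RecurringRuns {
    // A run is a sequence immediately followed by one or more copies of itself.
    private static final Pattern RUN = Pattern.compile("(.+)\\1+");

    // Returns the distinct runs that occur at least twice, in order of first appearance.
    static List<String> recurring(String s) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = RUN.matcher(s);
        while (m.find()) {
            counts.merge(m.group(), 1, Integer::sum); // count each run by its text
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > 1) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(recurring("aaabbaaacccbb")); // [aaa, bb]
    }
}
```

Using a `LinkedHashMap` keeps the first-seen order, so the first entry corresponds to "group 1 = aaa" and the second to "group 2 = bb" from the question.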