How can I find all repeated character sequences in a string using Java regex?

Dav*_*rry 10 java regex pattern-matching

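I am parsing random strings with Java and regex, looking for repeated sequences.

Consider the string:

aaabbaaacccbb

I want a regex that finds all of the following matches in the string above: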

aaabbaaacccbb
^^^  ^^^

aaabbaaacccbb
   ^^      ^^

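What regex will check whether a string contains any repeated character sequences and return those repeated sequences as groups, so that group 1 = aaa and group 2 = bb? Also note that I used an example string, but any repeated characters are valid: RonRonJoeJoe ......,,,,, ,,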

Gui*_*let 9

This does it:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        String s = "aaabbaaacccbb";
        find(s);
        String s1 = "RonRonRonJoeJoe .... ,,,,";
        find(s1);
        System.err.println("---");
        String s2 = "RonBobRonJoe";
        find(s2);
    }

    private static void find(String s) {
        Matcher m = Pattern.compile("(.+)\\1+").matcher(s);
        while (m.find()) {
            System.err.println(m.group());
        }
    }
}

OUTPUT:

aaa
bb
aaa
ccc
bb
RonRonRon
JoeJoe
....
,,,,
---
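
To make it clearer what the pattern above captures, here is a minimal sketch that reuses the same `(.+)\\1+` regex and reports, for each match, the whole run (`m.group()`) next to the unit captured by group 1 (`m.group(1)`). The class name `RepeatUnits` and the helper `describe` are hypothetical names chosen for illustration; they are not part of the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatUnits {
    // Same pattern as in the answer: a unit followed by one or more copies of itself.
    private static final Pattern REPEAT = Pattern.compile("(.+)\\1+");

    // Hypothetical helper: describes each match as "run = unit x count".
    static List<String> describe(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = REPEAT.matcher(s);
        while (m.find()) {
            String run = m.group();    // the whole repeated run
            String unit = m.group(1);  // the unit captured by (.+)
            int count = run.length() / unit.length();
            out.add(run + " = " + unit + " x " + count);
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints:
        // RonRonRon = Ron x 3
        // JoeJoe = Joe x 2
        for (String line : describe("RonRonRonJoeJoe")) {
            System.out.println(line);
        }
    }
}
```

Because `.+` is greedy, group 1 holds the longest unit that still repeats, not necessarily the shortest one. For example, `aaaa` is reported as `aa x 2`, which is also why `....` appears as a single match in the output above.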


anu*_*ava 2

You can use this regex, which is based on a positive lookahead:

((\\w)\\2+)(?=.*\\1)

Code:

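import java.util.regex.Matcher;
import java.util.regex.Pattern;

String elem = "aaabbaaacccbb";
// Group 1 is a run of one repeated word character; the lookahead requires the same run to occur again later.
String regex = "((\\w)\\2+)(?=.*\\1)";
Pattern p = Pattern.compile(regex);
Matcher matcher = p.matcher(elem);
for (int i = 1; matcher.find(); i++) {
    System.out.println("Group # " + i + " got: " + matcher.group(1));
}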

Output:

Group # 1 got: aaa
Group # 2 got: bb
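
The regex above uses `\\w`, so it only covers runs of a single word character. For the broader case in the question (`RonRon`, `....`), one possible alternative, which swaps the lookahead for plain counting, is to collect every adjacent-repeat run with `(.+)\\1+` and keep the runs that occur more than once. This is a sketch under that assumption, not the answerer's method; `RecurringRuns` and `recurringRuns` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RecurringRuns {
    // A unit followed by one or more copies of itself (any characters, not just \w).
    private static final Pattern RUN = Pattern.compile("(.+)\\1+");

    // Hypothetical helper: returns the runs that appear at least twice, in first-seen order.
    static List<String> recurringRuns(String s) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = RUN.matcher(s);
        while (m.find()) {
            counts.merge(m.group(), 1, Integer::sum); // count each whole run
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > 1) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(recurringRuns("aaabbaaacccbb")); // prints [aaa, bb]
    }
}
```

Note that this compares whole runs exactly, so it is slightly stricter than the lookahead: in `aabaaa` the lookahead regex matches `aa` (it reappears inside `aaa`), while the counting sketch returns an empty list because the runs `aa` and `aaa` differ.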