How to check for a repeated sequence in an integer

noM*_*MAD 24 java regex algorithm

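I have an alphanumeric string, and I want to check it for repeated patterns in the integer parts only. The repetitions must be consecutive.

  1. 12341234qwe should tell me that 1234 is repeated.
  2. 1234qwe1234 should NOT tell me that 1234 is repeated, because the repetition is not consecutive.
  3. 12121212 should be treated as 12 being repeated, since that is the first set that would be found repeating. But if some algorithm finds 1212 as the repeated set before 12, then I guess it has to run the same steps again on 1212.

My idea was to loop through the string, test each character with (>= '0' && <= '9'), and store the integer parts in separate StringBuilders. Then I read that performing an FFT on the string reveals repeated patterns. But I have no idea how to perform an FFT in Java and interpret the result, and I would also like to try this without signal processing. I read about KMP pattern matching, but that only works when the pattern is given as input. Is there any other way to do this?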
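
The character-by-character idea described above can be sketched as follows. This is only a minimal illustration, not part of the original question: the class name `DigitRuns` and the method `digitRuns` are hypothetical, and it assumes that only the ASCII digits '0' to '9' count as integer characters.

```java
import java.util.ArrayList;
import java.util.List;

class DigitRuns {
    // Collects every maximal run of consecutive ASCII digits in the input.
    static List<String> digitRuns(String input) {
        List<String> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (c >= '0' && c <= '9') {
                current.append(c);              // still inside a digit run
            } else if (current.length() > 0) {
                runs.add(current.toString());   // a non-digit closes the run
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            runs.add(current.toString());       // run that reaches the end of the string
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(digitRuns("12341234qwe")); // [12341234]
        System.out.println(digitRuns("1234qwe1234")); // [1234, 1234]
    }
}
```

Splitting into runs first means that 1234qwe1234 yields two separate runs, so a later repetition check never treats them as consecutive.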

anu*_*ava 55

You can solve this with the help of a regular expression. Consider code like this:

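import java.util.regex.Matcher;
import java.util.regex.Pattern;

String[] arr = {"12341234abc", "1234foo1234", "12121212", "111111111", "1a1212b123123c12341234d1234512345"};
// A reluctant group of digits immediately followed by a back reference to itself
String regex = "(\\d+?)\\1";
Pattern p = Pattern.compile(regex);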
for (String elem : arr) {
    boolean noMatchFound = true;
    Matcher matcher = p.matcher(elem);
    while (matcher.find()) {
        noMatchFound = false;
        System.out.println(elem + " got repeated: " + matcher.group(1));
    }
    if (noMatchFound) {
        System.out.println(elem + " has no repeation");
    }
}

OUTPUT:

12341234abc got repeated: 1234
1234foo1234 has no repeation
12121212 got repeated: 12
12121212 got repeated: 12
111111111 got repeated: 1
111111111 got repeated: 1
111111111 got repeated: 1
111111111 got repeated: 1
1a1212b123123c12341234d1234512345 got repeated: 12
1a1212b123123c12341234d1234512345 got repeated: 123
1a1212b123123c12341234d1234512345 got repeated: 1234
1a1212b123123c12341234d1234512345 got repeated: 12345

Explanation:

The regex being used is (\\d+?)\\1, where

\\d        - means a numerical digit
\\d+       - means 1 or more occurrences of a digit
\\d+?      - means reluctant (non-greedy) match of 1 OR more digits
( and )    - to group the above regex into group # 1
\\1        - means back reference to group # 1
(\\d+?)\\1 - repeat the group # 1 immediately after group # 1
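
To see why the reluctant quantifier matters for the 12121212 case in the question, the following sketch compares it with the greedy form. The class `RepeatFinder` and the method `repeats` are hypothetical names introduced only for this illustration; the regex calls are the standard `java.util.regex` ones.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatFinder {
    // Returns the content of group 1 for every match of the given regex in the input.
    static List<String> repeats(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // Reluctant: tries the shortest digit block first, so 12 is reported.
        System.out.println(repeats("(\\d+?)\\1", "12121212")); // [12, 12]
        // Greedy: tries the longest digit block first, so 1212 is reported.
        System.out.println(repeats("(\\d+)\\1", "12121212"));  // [1212]
    }
}
```

With the greedy variant one would have to rerun the check on 1212 to get down to 12, which is the extra step the question anticipates.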

  • Nice, I would give you +2 if I could (2 upvotes)

小智 7

I'm not sure whether you are familiar with regular expressions (RegEx), but this code works:

// (\\d+)\\1 matches a run of digits immediately followed by itself; the
// surrounding .* make replaceAll collapse the whole string to that run.
// The leading .* is greedy, so the repeat that starts latest is reported.
String regex = ".*(\\d+)\\1.*";
for (String str : new String[] {"12341234qwe", "1234qwe1234"}) {
    String rep = str.replaceAll(regex, "$1");
    // If nothing matched, replaceAll returns the input unchanged.
    if (rep.equals(str))
        System.out.println(str + " has no repetition");
    else
        System.out.println(str + " has repetition " + rep);
}

Here is a tutorial: http://docs.oracle.com/javase/tutorial/essential/regex/


Tud*_*dor 6

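My theory is that you can use a data structure known as a suffix tree to achieve what you want.

Go through the initial string, collect each consecutive sequence of digits, and build its suffix tree. For your example it would look like this (for the first 4 suffixes):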

                  R - root
      |         |          |         |
      |         |          |         |
      |         |          |         | 
  12341234$  2341234$   341234$     41234$

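Now, the next suffix in order would be 1234$. However, when inserting it, we notice that it matches the prefix 1234 of the first suffix. A counter is kept in parallel and incremented every time a suffix is added to the tree.

At each step, we compare the counter with the length of the match between the current suffix being inserted and the substring it matches. If the length of the match is a multiple of the counter, then we have a repetition.

In the example above, the counter will be 4 (it starts from 0) by the time we insert 1234$, and the length of the match with the prefix of 12341234$ is also 4, so 1234 is repeated.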
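
The counter comparison can be illustrated without building an actual suffix tree. The sketch below is a simplification, not the data structure described in this answer: it directly compares each suffix with the start of the digit run, and it treats "match length at least as large as the counter" as the repetition condition, which is one reading of the rule above. The names `SuffixMatchSketch`, `matchLength`, and `repeatedPrefix` are hypothetical.

```java
class SuffixMatchSketch {
    // Length of the common prefix of s and its suffix starting at offset k.
    static int matchLength(String s, int k) {
        int n = 0;
        while (k + n < s.length() && s.charAt(n) == s.charAt(k + n)) {
            n++;
        }
        return n;
    }

    // Returns the shortest block that is immediately repeated at the start
    // of the digit run, or an empty string if there is none.
    static String repeatedPrefix(String digits) {
        // k plays the role of the counter: the offset of the suffix being inserted.
        for (int k = 1; 2 * k <= digits.length(); k++) {
            if (matchLength(digits, k) >= k) {
                return digits.substring(0, k);
            }
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(repeatedPrefix("12341234")); // 1234
        System.out.println(repeatedPrefix("12121212")); // 12
        System.out.println(repeatedPrefix("1234"));     // (empty line)
    }
}
```

This brute-force comparison takes quadratic time in the worst case and only detects a repeat that begins at the start of the run; a real suffix tree would be needed to get the efficiency this answer has in mind.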