Regular expression for consecutive duplicate words

String regex = "\\b(\\w+)(\\s+\\1\\b)*";
Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

Matcher m = p.matcher(input);

// Check for subsequences of input that match the compiled pattern
while (m.find()) {
    // group(0) is the whole run of repeated words; group(1) is its first word
    input = input.replaceAll(m.group(0), m.group(1));
}


Sample input: Goodbye goodbye GooDbYe

Sample output: Goodbye

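Explanation:

The regular expression:

\b : a word boundary at the start of the word

\w+ : one or more word characters

(\s+\1\b)* : one or more whitespace characters, followed by a word that matches the previous word and ends at a word boundary. Wrapping the whole thing in * lets it find more than one repetition.

Grouping:

m.group(0) : contains the whole matched run; in the case above, Goodbye goodbye GooDbYe

m.group(1) : contains the first word of the matched run; in the case above, Goodbye

The replace call should replace each run of consecutive matching words with the first instance of the word.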
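The grouping described above can be checked with a small standalone program. This is only a sketch: the class name `ConsecutiveDuplicates` and the helper `collapse` are made up for illustration, and it uses `Matcher.replaceAll("$1")` in place of the loop from the answer, which should give the same result for this pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConsecutiveDuplicates {
    // Same pattern as in the answer: a word, then zero or more repeats of it.
    static final Pattern REPEATS =
            Pattern.compile("\\b(\\w+)(\\s+\\1\\b)*", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: replaces every matched run with capture group 1,
    // i.e. the first word of the run.
    static String collapse(String input) {
        return REPEATS.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        String input = "Goodbye goodbye GooDbYe";

        // Show what each group holds for every match.
        Matcher m = REPEATS.matcher(input);
        while (m.find()) {
            System.out.println("group(0) = " + m.group(0) + ", group(1) = " + m.group(1));
        }

        System.out.println(collapse(input));
    }
}
```

Because the group is quantified with `*`, words that are not repeated also match on their own and are simply replaced by themselves.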

Answer 4

Nik*_*hak 7

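Regex to remove 2+ duplicate words (consecutive or non-consecutive)

Try this regex, which catches words that occur two or more times and leaves only a single occurrence. The duplicate words do not even have to be consecutive.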

/\b(\w+)\b(?=.*?\b\1\b)/ig


Here, \b is used for the word boundary, ?= for positive lookahead, and \1 for the back-reference.
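As a rough Java translation of the JavaScript-style pattern above (a sketch, not the answer author's code): the `i` flag becomes `Pattern.CASE_INSENSITIVE`, and the `g` flag is implied by `replaceAll`. The class name, the helper `keepLastOccurrence`, and the whitespace cleanup step are assumptions added for this example.

```java
import java.util.regex.Pattern;

class NonConsecutiveDuplicates {
    // Matches a word only if the same word appears again later on the line.
    static final Pattern EARLIER_COPY =
            Pattern.compile("\\b(\\w+)\\b(?=.*?\\b\\1\\b)", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: deletes every earlier copy of a repeated word.
    static String keepLastOccurrence(String input) {
        String stripped = EARLIER_COPY.matcher(input).replaceAll("");
        // Whitespace cleanup is an addition for this sketch, not part of the answer.
        return stripped.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(keepLastOccurrence("alpha beta alpha gamma beta"));
    }
}
```

Since the lookahead only matches a word when another copy follows it, it is the last occurrence of each word that survives, not the first. Also note that `.` does not cross line breaks by default, so duplicates on different lines are not detected.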

Example source

@Walf Yes. However, in some cases this is expected (for example, while scraping data). (3 upvotes)

Answer 5

Faa*_*hir 6

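Try the following regex:

\b : word boundary at the start of the word
\W+ : one or more non-word characters
\1 : the same word that was already matched
\b : end of the word

()* : repeats the group again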

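import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DuplicateWords {

    public static void main(String[] args) {

        // A word, followed by zero or more repeats of it separated by non-word characters
        String regex = "\\b(\\w+)(\\b\\W+\\b\\1\\b)*";
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

        Scanner in = new Scanner(System.in);

        // The first line of input holds the number of sentences that follow
        int numSentences = Integer.parseInt(in.nextLine());

        while (numSentences-- > 0) {
            String input = in.nextLine();

            Matcher m = p.matcher(input);

            // Check for subsequences of input that match the compiled pattern
            while (m.find()) {
                // replace() treats the match literally; replaceAll() would parse it as a
                // regex and can throw if the separator contains characters such as "("
                input = input.replace(m.group(0), m.group(1));
            }

            // Prints the modified sentence.
            System.out.println(input);
        }

        in.close();
    }
}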


Answer 6

sou*_*rge 5

The widely available PCRE library can handle this kind of situation (you won't achieve the same with POSIX-compliant regex engines, though):

(\b\w+\b)\W+\1
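Java's `java.util.regex` is not PCRE, but it accepts the same back-reference syntax, so the pattern above can be tried there. The sketch below also shows one caveat: without a trailing `\b`, the back-reference can match the start of a longer word. The `STRICT` variant, the class name, and the helper `firstMatch` are additions for illustration and are not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedPair {
    // The pattern exactly as given in the answer.
    static final Pattern ANSWER = Pattern.compile("(\\b\\w+\\b)\\W+\\1");
    // Assumed variant: a trailing \b so the repeat must be a whole word.
    static final Pattern STRICT = Pattern.compile("(\\b\\w+\\b)\\W+\\1\\b");

    // Hypothetical helper: returns the first match of p in text, or null if none.
    static String firstMatch(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(ANSWER, "This is is a test"));
        // "the then" is not a repeated word, but the unanchored \1 matches "the" inside "then".
        System.out.println(firstMatch(ANSWER, "the then"));
        System.out.println(firstMatch(STRICT, "the then"));
    }
}
```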


Answer 7

syn*_*kon 5

Here is one that catches multiple words repeated multiple times:

(\b\w+\b)(\s+\1)+
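To see what "multiple words multiple times" means in practice, here is a minimal Java sketch that lists each run found by the pattern above together with the number of copies in it. The class name `RepeatedRuns` and the helper `findRuns` are assumptions made for this example; the pattern is case-sensitive as written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedRuns {
    // The pattern from the answer: a word followed by one or more copies of itself.
    static final Pattern RUN = Pattern.compile("(\\b\\w+\\b)(\\s+\\1)+");

    // Hypothetical helper: returns entries such as "hello x3" for each run found.
    static List<String> findRuns(String text) {
        List<String> runs = new ArrayList<>();
        Matcher m = RUN.matcher(text);
        while (m.find()) {
            // The whole match is the word repeated with whitespace in between,
            // so splitting on whitespace gives the number of copies.
            int copies = m.group().split("\\s+").length;
            runs.add(m.group(1) + " x" + copies);
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(findRuns("hello hello hello world world end"));
    }
}
```

Unlike the `*` version in the first answer, the `+` quantifier means a word that appears only once is not matched at all.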


Asked:	15 years, 7 months ago
Viewed:	75730 times
Last active:	6 years, 1 month ago