替换字符串中的重复字符串

Nin*_*Nin 5 php regex perl string-search

我正在尝试在字符串中查找(并替换)重复的字符串.

我的字符串可能如下所示:

Lorem ipsum dolor sit amet sit amet sit amet sit nostrud exercitation amit sit ullamco laboris nisi ut aliquip ex ea commodo consequat.

这应该成为:

Lorem ipsum dolor sit amet sit nostrud exercitation amit sit ullamco laboris nisi ut aliquip ex ea commodo consequat.

请注意,由于没有重复,因此不会删除amit sit.

或者字符串可以是这样的:

Lorem ipsum dolor sit amet()sit amet()sit amet()sit nostrud exercitation ullamco laboris nisi ut aliquip aliquip ex ea commodo consequat.

应成为:

Lorem ipsum dolor sit amet()sit nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

所以它不仅仅是az,还可以有其他(ascii)字符.如果有人可以帮助我,我很高兴.

下一步是匹配(和替换)这样的事情:

2个问题3个问题4个问题5个问题

这会成为:

2个问题

最终输出中的数字可以是任意数字2,3,4,这没关系.在最后的例子中只会有不同的数字,但单词将是相同的.

Nox*_*oxt 0

第一个任务解决方案代码:

<?php

    function split_repeating($string)
    {
        $words = explode(' ', $string);
        $words_count = count($words);

        $need_remove = array();
        for ($i = 0; $i < $words_count; $i++) {
            $need_remove[$i] = false;
        }

        // Here I iterate through the number of words that will be repeated and check all the possible positions reps
        for ($i = round($words_count / 2); $i >= 1; $i--) {
            for ($j = 0; $j < ($words_count - $i); $j++) {
                $need_remove_item = !$need_remove[$j];
                for ($k = $j; $k < ($j + $i); $k++) {
                    if ($words[$k] != $words[$k + $i]) {
                        $need_remove_item = false;
                        break;
                    }
                }
                if ($need_remove_item) {
                    for ($k = $j; $k < ($j + $i); $k++) {
                        $need_remove[$k] = true;
                    }
                }
            }
        }

        $result_string = '';
        for ($i = 0; $i < $words_count; $i++) {
            if (!$need_remove[$i]) {
                $result_string .= ' ' . $words[$i];
            }
        }
        return trim($result_string);
    }



    $string = 'Lorem ipsum dolor sit amet sit amet sit amet sit nostrud exercitation amit sit ullamco laboris nisi ut aliquip ex ea commodo consequat.';

    echo $string . '<br>';
    echo split_repeating($string) . '<br>';
    echo 'Lorem ipsum dolor sit amet sit nostrud exercitation amit sit ullamco laboris nisi ut aliquip ex ea commodo consequat.' . '<br>' . '<br>';



    $string = 'Lorem ipsum dolor sit amet () sit amet () sit amet () sit nostrud exercitation ullamco laboris nisi ut aliquip aliquip ex ea commodo consequat.';

    echo $string . '<br>';
    echo split_repeating($string) . '<br>';
    echo 'Lorem ipsum dolor sit amet () sit nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.';

?>
Run Code Online (Sandbox Code Playgroud)

第二个任务解决代码:

<?php

    function split_repeating($string)
    {
        $words = explode(' ', $string);
        $words_count = count($words);

        $need_remove = array();
        for ($i = 0; $i < $words_count; $i++) {
            $need_remove[$i] = false;
        }

        for ($j = 0; $j < ($words_count - 1); $j++) {
            $need_remove_item = !$need_remove[$j];
            for ($k = $j + 1; $k < ($words_count - 1); $k += 2) {
                if ($words[$k] != $words[$k + 2]) {
                    $need_remove_item = false;
                    break;
                }
            }
            if ($need_remove_item) {
                for ($k = $j + 2; $k < $words_count; $k++) {
                    $need_remove[$k] = true;
                }
            }
        }

        $result_string = '';
        for ($i = 0; $i < $words_count; $i++) {
            if (!$need_remove[$i]) {
                $result_string .= ' ' . $words[$i];
            }
        }
        return trim($result_string);
    }



    $string = '2 questions 3 questions 4 questions 5 questions';

    echo $string . '<br>';
    echo split_repeating($string) . '<br>';
    echo '2 questions';

?>
Run Code Online (Sandbox Code Playgroud)