Python: how to cut off sequences of more than 2 equal characters in a string

Bar*_*art 6 python regex string

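I'm looking for an efficient way to change a string so that every sequence of more than 2 equal characters is cut off after the first 2.

Some input -> output examples: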

hellooooooooo -> helloo
woooohhooooo -> woohhoo

I'm currently looping over the characters, but that is a bit slow. Does anyone have another solution (regexp or otherwise)?

EDIT: current code:

word_new = ""
for i in range(0, len(word) - 2):
    if not word[i] == word[i+1] == word[i+2]:
        word_new = word_new + word[i]
for i in range(len(word) - 2, len(word)):
    word_new = word_new + word[i]

bgp*_*ter 8

EDIT: after applying the helpful comments

import re

def ReplaceThreeOrMore(s):
    # pattern to look for three or more repetitions of any character, including
    # newlines.
    pattern = re.compile(r"(.)\1{2,}", re.DOTALL) 
    return pattern.sub(r"\1\1", s)
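
As a quick check, the function can be run against the two examples from the question. This is only a minimal sketch: the definition is repeated so the snippet runs on its own, and the module-level name `_PATTERN` is introduced here for illustration and is not part of the answer.

```python
import re

# Same pattern as above: any character (newlines included, because of
# re.DOTALL) followed by two or more repetitions of itself.
# _PATTERN is an illustrative name, not from the original answer.
_PATTERN = re.compile(r"(.)\1{2,}", re.DOTALL)

def ReplaceThreeOrMore(s):
    # Keep only two copies of the character in each run of three or more.
    return _PATTERN.sub(r"\1\1", s)

for word in ("hellooooooooo", "woooohhooooo"):
    print(word, "->", ReplaceThreeOrMore(word))
# hellooooooooo -> helloo
# woooohhooooo -> woohhoo
```

Runs of exactly two characters, such as the `hh` in the second example, are left alone because the pattern only matches three or more.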

(Original reply follows.) Try something like this:

import re

# look for a character followed by at least one repetition of itself.
pattern = re.compile(r"(\w)\1+")

# a function to perform the substitution we need:
def repl(matchObj):
   char = matchObj.group(1)
   return "%s%s" % (char, char)

>>> pattern.sub(repl, "Foooooooooootball")
'Football'
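
The difference between the two versions shows up on input that is not made of word characters. The sketch below compares them side by side; the helper `collapse` and the sample string are made up for illustration and do not come from the original answer.

```python
import re

# Original reply: only word characters (letters, digits, underscore).
word_only = re.compile(r"(\w)\1+")
# Edited version: any character, including newlines.
any_char = re.compile(r"(.)\1{2,}", re.DOTALL)

def collapse(pattern, s):
    # Hypothetical helper: replace each matched run with two copies
    # of the repeated character.
    return pattern.sub(r"\1\1", s)

print(collapse(word_only, "wow!!!!! sooo"))  # wow!!!!! soo
print(collapse(any_char, "wow!!!!! sooo"))   # wow!! soo
```

With `\w` the run of exclamation marks is left untouched, while `.` combined with `re.DOTALL` trims it as well.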