Recursively split a string that contains a defined set of prefixes - Python

alv*_*vas 6 python string recursion split prefix

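If I have a list of prefixes that can be attached to a string, how do I split a string into its prefixes, with the remaining characters as the final substring? For example: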

prefixes = ['over','under','re','un','co']

str1 = "overachieve"
output: ["over","achieve"]

str2 = "reundo"
output = ["re","un","do"]

Is there a better way to do the task above, perhaps with a regular expression or some string functions, instead of:

str1 = "reundo"
output = []

for x in [p for p in prefixes if p in str1]:
    output.append(x)    
    str1 =  str1.replace(x,"",1)
output.append(str1)
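A minimal sketch of where the loop in the question can go wrong, assuming it is wrapped in a function (the name `naive_split` is hypothetical, not from the question). Because `p in str1` tests for the substring anywhere in the word, and the loop follows the order of `prefixes` rather than the order in the word, non-prefix occurrences get stripped and prefixes can come out in the wrong order:

```python
def naive_split(word, prefixes):
    # Same logic as the loop in the question, wrapped in a function.
    # naive_split is a hypothetical name used only for this illustration.
    output = []
    for x in [p for p in prefixes if p in word]:
        output.append(x)
        word = word.replace(x, "", 1)
    output.append(word)
    return output

prefixes = ['over', 'under', 're', 'un', 'co']

print(naive_split("reundo", prefixes))     # ['re', 'un', 'do'] (works here)
print(naive_split("empire", prefixes))     # ['re', 'empi'] ('re' is not a prefix)
print(naive_split("unrelated", prefixes))  # ['re', 'un', 'lated'] (wrong order)
```

So any replacement probably needs to anchor each test at the current start of the remaining string.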

Ray*_*ger 5

A regular expression is an efficient way to search for many alternative prefixes:

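import re

def split_prefixes(word, prefixes):
    # Longest prefixes first, so 'under' is tried before 'un'.
    # re.escape keeps prefixes containing regex metacharacters literal.
    ordered = sorted(prefixes, key=len, reverse=True)
    regex = re.compile('|'.join(re.escape(p) for p in ordered))
    result = []
    i = 0
    while True:
        # Match only at index i, the start of what is left of the word.
        mo = regex.match(word, i)
        if mo is None:
            # No further prefix: the remainder is the final piece.
            result.append(word[i:])
            return result
        result.append(mo.group())
        i = mo.end()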


>>> prefixes = ['over', 'under', 're', 'un', 'co']
>>> for word in ['overachieve', 'reundo', 'empire', 'coprocessor']:
...     print(word, '-->', split_prefixes(word, prefixes))

overachieve --> ['over', 'achieve']
reundo --> ['re', 'un', 'do']
empire --> ['empire']
coprocessor --> ['co', 'processor']
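Since the question also asks about plain string functions, here is a rough sketch of the same loop without `re`, using `str.startswith` with a start offset. The name `split_prefixes_plain` is made up for this example, and skipping empty prefixes is an added assumption to avoid looping forever:

```python
def split_prefixes_plain(word, prefixes):
    # Hypothetical non-regex variant of split_prefixes.
    # Try longer prefixes first so 'under' wins over 'un';
    # drop empty strings, which would never advance the index.
    ordered = sorted((p for p in prefixes if p), key=len, reverse=True)
    result = []
    i = 0
    while True:
        for p in ordered:
            # startswith(p, i) tests for p at index i without slicing.
            if word.startswith(p, i):
                result.append(p)
                i += len(p)
                break
        else:
            # No prefix matched at index i: the rest is the final piece.
            result.append(word[i:])
            return result

prefixes = ['over', 'under', 're', 'un', 'co']
for word in ['overachieve', 'reundo', 'empire', 'coprocessor']:
    print(word, '-->', split_prefixes_plain(word, prefixes))
```

For the four sample words this should produce the same lists as the regex version. With a large prefix list the compiled regex is likely to be faster, since this version tests every prefix in turn at each step.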

  • +1 for `match(word, i)`; I never noticed that `match` also takes `pos` and `endpos`. (2 upvotes)
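To illustrate the `pos` and `endpos` arguments mentioned in the comment: they belong to the `match` method of a compiled pattern (the module-level `re.match` does not accept them). A small sketch, with the pattern and sample word chosen only for illustration:

```python
import re

pattern = re.compile('un')

# Without pos, matching is anchored at index 0, where 'reundo' has 're'.
print(pattern.match('reundo'))             # None

# With pos=2, matching is anchored at index 2 instead.
mo = pattern.match('reundo', 2)
print(mo.group(), mo.span())               # un (2, 4)

# endpos=3 limits the search to 'reundo'[:3], so 'un' no longer fits.
print(pattern.match('reundo', 2, 3))       # None
```

Unlike matching against `word[i:]`, passing `pos` avoids creating a new string on every step, and the reported offsets stay relative to the original word.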