What is a better way to remove multiple words from a string?

And*_*ong 5 python regex string replace python-3.x

bannedWord = ['Good', 'Bad', 'Ugly']

def RemoveBannedWords(toPrint, database):
    statement = toPrint
    # Iterate over the list that was passed in, not the global bannedWord
    for word in database:
        if word in statement:
            statement = statement.replace(word + ' ', '')
    return statement

toPrint = 'Hello Ugly Guy, Good To See You.'

print(RemoveBannedWords(toPrint, bannedWord))

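The output is `Hello Guy, To See You.` Knowing Python, I feel there must be a better way to remove several words from a string. I searched for similar solutions that use a dictionary, but they did not seem to fit this case.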

Shr*_*han 11

I use:

bannedWord = ['Good','Bad','Ugly']
toPrint = 'Hello Ugly Guy, Good To See You.'
print(' '.join(i for i in toPrint.split() if i not in bannedWord))


Aja*_*pta 7

Here is a regex solution:

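import re

bannedWord = ['Good', 'Bad', 'Ugly']  # defined so the call below runs; the pattern itself is hard-coded

def RemoveBannedWords(toPrint, database):
    # Match a banned word at a word boundary plus the one non-word character after it
    pattern = re.compile(r"\b(Good|Bad|Ugly)\W", re.I)
    return pattern.sub("", toPrint)

toPrint = 'Hello Ugly Guy, Good To See You.'

print(RemoveBannedWords(toPrint, bannedWord))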

  • You are no longer actually using `bannedWord`, so you might as well get rid of it (2 upvotes)
  • `re.compile(r"\b(" + "|".join(database) + r")\W", re.I)` (2 upvotes)
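The suggestion in the comments, building the pattern from the list instead of hard-coding it, could be sketched roughly as follows. This is only an illustration: the name `remove_banned_words` is hypothetical, and two details are assumptions that go beyond the comment, namely escaping each word with `re.escape` and using `\b\s?` instead of `\W` so that a banned word at the very end of the string is also matched.

```python
import re

def remove_banned_words(text, banned_words):
    """Hypothetical helper: remove whole-word, case-insensitive matches."""
    if not banned_words:
        # An empty alternation would match at every word boundary, so bail out early.
        return text
    # Escape each word so any regex metacharacters in it are treated literally.
    alternatives = "|".join(re.escape(word) for word in banned_words)
    # \b ... \b restricts matches to whole words; \s? also consumes one
    # following whitespace character when there is one.
    pattern = re.compile(r"\b(?:" + alternatives + r")\b\s?", re.IGNORECASE)
    return pattern.sub("", text)

print(remove_banned_words("Hello Ugly Guy, Good To See You.", ["Good", "Bad", "Ugly"]))
# prints: Hello Guy, To See You.
```

Because of the word boundaries, a word such as `Goodness` is left alone even though it starts with `Good`.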

Ita*_*chi 5

A slight variation on Ajay's code, for the case where one of the strings is a substring of another string in the banned-word list:

bannedWord = ['good', 'bad', 'good guy', 'ugly']

With toPrint = 'good winter good guy', the result is

RemoveBannedWords(toPrint, database=bannedWord)  # returns 'winter guy'

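This happens because `good` is removed first. The list needs to be sorted by element length, longest first, so that longer phrases are tried before their substrings.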

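import re

bannedWord = ['good', 'bad', 'good guy', 'ugly']

def RemoveBannedWords(toPrint, database):
    # Longest first, so 'good guy' is tried before its substring 'good'
    database_1 = sorted(database, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(database_1) + r")\W", re.I)
    # Append a space so a banned word at the very end still has a \W to match,
    # then drop the last character of the result
    return pattern.sub("", toPrint + ' ')[:-1]

toPrint = 'good winter good guy.'

print(RemoveBannedWords(toPrint, bannedWord))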
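The reason ordering matters is that Python's regex alternation tries alternatives from left to right and stops at the first one that matches, so a phrase has to appear before any shorter word it starts with. A minimal sketch of that idea follows; the name `remove_banned_phrases` is hypothetical, and collapsing leftover whitespace with `split()`/`join()` is an assumption that replaces the trailing-`\W` approach above rather than reproducing it.

```python
import re

def remove_banned_phrases(text, banned_phrases):
    """Hypothetical helper: remove banned words and multi-word phrases."""
    if not banned_phrases:
        return text
    # Longest first: alternation is tried left to right, so "good guy"
    # must come before "good" to be matched as a whole phrase.
    ordered = sorted(banned_phrases, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(map(re.escape, ordered)) + r")\b",
        re.IGNORECASE,
    )
    # Remove the matches, then collapse the whitespace they leave behind.
    return " ".join(pattern.sub("", text).split())

print(remove_banned_phrases("good winter good guy", ["good", "bad", "good guy", "ugly"]))
# prints: winter
```

Note that this version normalizes all runs of whitespace to single spaces, which may or may not be acceptable depending on the input.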