Python logic when searching a string

Question

filtered=[]
text="any.pdf"
if "doc" and "pdf" and "xls" and "jpg" not in text:
    filtered.append(text)
print(filtered)

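This is my first post on Stack Overflow, so apologies if anything in the question is annoying. The code is supposed to append text if text does not contain any of these words: doc, pdf, xls, jpg. It works fine if it is written like this: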

if "doc" in text:
    pass
elif "jpg" in text:
    pass
elif "pdf" in text:
    pass
elif "xls" in text:
    pass
else:
    filtered.append(text)

Answer 1

sen*_*rle 6

If you open a Python interpreter, you'll find that "doc" and "pdf" and "xls" and "jpg" evaluates to the same thing as 'jpg':

>>> "doc" and "pdf" and "xls" and "jpg"
'jpg'

So rather than testing against all of the strings, your first attempt tests only against "jpg".
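
To see why only "jpg" is ever tested, it may help to spell out how Python groups the question's condition. Strictly speaking, `not in` binds more tightly than `and`, so the membership test applies to the last string alone. The sketch below reuses `text` from the question; the names `original`, `explicit`, and `intended` are introduced here purely for illustration.

```python
text = "any.pdf"

# `not in` binds more tightly than `and`, so the question's condition
# is grouped as: "doc" and "pdf" and "xls" and ("jpg" not in text)
original = "doc" and "pdf" and "xls" and "jpg" not in text
explicit = "doc" and "pdf" and "xls" and ("jpg" not in text)

# Non-empty strings are truthy, so `and` falls through to the last operand.
print(original)              # True: "jpg" is not in "any.pdf"
print(original == explicit)  # True

# What was intended: every test string must be absent from text.
intended = all(ext not in text for ext in ["doc", "pdf", "xls", "jpg"])
print(intended)              # False: "pdf" is in "any.pdf"
```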

There are many ways to do what you want. The following isn't the most obvious, but it's useful:

if not any(test_string in text for test_string in ["doc", "pdf", "xls", "jpg"]):
    filtered.append(text)

Another way is to use a for loop together with an else clause:

for test_string in ["doc", "pdf", "xls", "jpg"]:
    if test_string in text:
        break
else: 
    filtered.append(text)
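
The `else` clause of a `for` loop runs only when the loop finishes without hitting `break`. A minimal sketch of that behaviour, wrapped in a hypothetical helper `keep` (not part of the original answer) so it can be tried on more than one input:

```python
def keep(text, test_strings=("doc", "pdf", "xls", "jpg")):
    """Return True if none of test_strings occurs in text."""
    for test_string in test_strings:
        if test_string in text:
            break        # a match was found; the else clause is skipped
    else:
        return True      # loop ended without break: nothing matched
    return False

filtered = [t for t in ["any.pdf", "notes.txt"] if keep(t)]
print(filtered)  # ['notes.txt']
```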

Finally, you could use a pure list comprehension:

tofilter = ["one.pdf", "two.txt", "three.jpg", "four.png"]
test_strings = ["doc", "pdf", "xls", "jpg"]
filtered = [s for s in tofilter if not any(t in s for t in test_strings)]

Edit:

If you want to filter on both words and extensions, I recommend the following:

text_list = generate_text_list() # or whatever you do to get a text sequence
extensions = ['.doc', '.pdf', '.xls', '.jpg']
words = ['some', 'words', 'to', 'filter']
text_list = [text for text in text_list if not text.endswith(tuple(extensions))]
text_list = [text for text in text_list if not any(word in text for word in words)]

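This could still lead to some mismatches: because `in` matches substrings, the above also filters "Do something", "He's a wordsmith", and so on. If that's a problem, you may need a more complex solution.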
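
One possible direction for a stricter filter, offered as a sketch rather than the answer's own method: compare the actual file extension using `os.path.splitext`, and match whole words with a regular expression delimited by `\b`. The helper name `is_filtered` and the sample inputs are made up for illustration, and the word match here is case-sensitive.

```python
import os
import re

extensions = {'.doc', '.pdf', '.xls', '.jpg'}
words = ['some', 'words', 'to', 'filter']

# \b matches at word boundaries, so 'some' no longer matches 'something'.
word_pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, words)) + r')\b')

def is_filtered(text):
    """True if text has a listed extension or contains a listed whole word."""
    ext = os.path.splitext(text)[1].lower()
    return ext in extensions or word_pattern.search(text) is not None

samples = ["report.pdf", "Do something", "some notes", "He's a wordsmith"]
kept = [s for s in samples if not is_filtered(s)]
print(kept)  # ['Do something', "He's a wordsmith"]
```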
