Extracting sentences that contain keywords or phrases from a list, using Python

Wah*_*adh 2 python search text file

I am using the following code to extract sentences from a file (a sentence should contain some or all of the search keywords):

search_keywords=['mother','sing','song']
with open('text.txt', 'r') as in_file:
    text = in_file.read()
    sentences = text.split(".")

for sentence in sentences:
    if (all(map(lambda word: word in sentence, search_keywords))):
        print(sentence)

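The problem with the code above is that it does not print the desired sentence if any one of the search keywords fails to match a word in the sentence. I want code that prints the sentences containing some or all of the search keywords. It would be great if the code could also search for phrases and extract the corresponding sentences.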

Chr*_*nds 5

It looks like you want to count how many of the search_keywords appear in each sentence. You can do that as follows:

sentences = "My name is sing song. I am a mother. I am happy. You sing like my mother".split(".")
search_keywords=['mother','sing','song']

for sentence in sentences:
    print("{} key words in sentence:".format(sum(1 for word in search_keywords if word in sentence)))
    print(sentence + "\n")

# Outputs:
#2 key words in sentence:
#My name is sing song
#
#1 key words in sentence:
# I am a mother
#
#0 key words in sentence:
# I am happy
#
#2 key words in sentence:
# You sing like my mother
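If the goal is only to print the sentences that contain some or all of the keywords, the same idea reduces to an `any` check in place of the `all` in the question. Here is a minimal sketch, assuming case-insensitive matching is acceptable and treating a phrase as just another search string. The helper name `matching_sentences` and the phrase `'sing like'` are made up for illustration and are not part of the original code.

```python
def matching_sentences(text, search_terms):
    """Return the sentences that contain at least one search term.

    A term may be a single word or a multi-word phrase; both are
    matched as plain substrings, ignoring case (an assumption).
    """
    # Split on "." as in the question, dropping empty fragments
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    terms = [t.lower() for t in search_terms]
    return [s for s in sentences if any(t in s.lower() for t in terms)]


text = "My name is sing song. I am a mother. I am happy. You sing like my mother"
for sentence in matching_sentences(text, ['mother', 'sing like', 'song']):
    print(sentence)
```

With this sample text, only "I am happy" is left out, since it contains none of the terms.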

Or, if you only want the sentences that best match search_keywords, you can build a dictionary and find the maximum value:

dct = {}
for sentence in sentences:
    dct[sentence] = sum(1 for word in search_keywords if word in sentence)

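# Compute the maximum once instead of on every iteration of the comprehension
best_count = max(dct.values())
best_sentences = [key for key, value in dct.items() if value == best_count]
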
print("\n".join(best_sentences))

# Outputs:
#My name is sing song
# You sing like my mother
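One caveat with `word in sentence`: it is a substring test, so 'sing' would also be counted in a sentence that only contains "singing". If whole-word matches are what you want, a regular expression with word boundaries is one option. This is a rough sketch rather than a drop-in replacement. `count_keywords` is a hypothetical helper name, and ignoring case is an assumption on my part.

```python
import re


def count_keywords(sentence, keywords):
    """Count how many keywords (or phrases) occur in the sentence as whole words."""
    return sum(
        1
        for kw in keywords
        # \b anchors the match at word boundaries; re.escape guards against
        # keywords that contain regex metacharacters
        if re.search(r"\b" + re.escape(kw) + r"\b", sentence, flags=re.IGNORECASE)
    )


search_keywords = ['mother', 'sing', 'song']
for sentence in "My name is sing song. You are singing".split("."):
    print(count_keywords(sentence, search_keywords), sentence.strip())
```

Here "You are singing" scores 0, whereas the plain `in` test would have counted 'sing'.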