Split a string into sentences by terminating characters

Mik*_*.K. 1 python arrays string list

A recent project required me to split an incoming phrase (as a string) into its component sentences. For example, this string:

"Your mother was a hamster, and your father smelt of elderberries! Now go away, or I shall taunt you a second time. You know what, never mind. This entire sentence is far too silly. Wouldn't you agree? I think it is."

needs to be turned into a list made up of the following elements:

["Your mother was a hamster, and your father smelt of elderberries",
"Now go away, or I shall taunt you a second time",
"You know what, never mind",
"This entire sentence is far too silly",
"Wouldn't you agree",
"I think it is"]

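For the purposes of this function, a "sentence" is a string terminated by `!`, `?`, or `.`. Note that the punctuation should be removed from the output, as shown above.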

I have a working version, but it's ugly, it leaves leading and trailing whitespace, and I can't help thinking there's a better way:

from functools import reduce

def split_sentences(st):
  if type(st) is not str:
    raise TypeError("Cannot split non-strings")
  sl = st.split('.')
  sl = [s.split('?') for s in sl]
  sl = reduce(lambda x, y: x+y, sl) #Flatten the list
  sl = [s.split('!') for s in sl]
  return reduce(lambda x, y: x+y, sl)
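
To make the whitespace problem concrete, here is a minimal sketch of the same chained-split idea applied to a short input. The name `split_sentences_chained` is hypothetical and only used for this illustration; the loop is a compact restatement of the three `split` passes above, not a different algorithm.

```python
from functools import reduce

def split_sentences_chained(st):
    # Split on each terminator in turn ('.', then '?', then '!'),
    # flattening the nested lists after every pass.
    parts = [st]
    for terminator in '.?!':
        parts = reduce(lambda x, y: x + y, [p.split(terminator) for p in parts])
    return parts

print(split_sentences_chained("Wouldn't you agree? I think it is."))
# prints ["Wouldn't you agree", ' I think it is', '']
```

The second element keeps the space that followed the `?`, and the final `.` leaves an empty string at the end of the list.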

che*_*ner 9

Use `re.split` instead, with a regular expression that matches any sentence-ending character (and any whitespace that follows it).

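import re

def split_sentences(st):
    # Split on a sentence-ending character plus any whitespace after it.
    sentences = re.split(r'[.?!]\s*', st)
    # A trailing terminator leaves an empty final element; drop it.
    if sentences[-1]:
        return sentences
    else:
        return sentences[:-1]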
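
As a rough sketch of how this might be extended, the variant below treats a run of terminators (such as `...` or `?!`) as a single boundary and discards every empty piece, not just the last one. The name `split_sentences_stripped` and the `[.?!]+` pattern are assumptions made for this example and are not part of the answer above.

```python
import re

def split_sentences_stripped(st):
    # Assumed variant: one or more terminators, plus trailing whitespace,
    # count as a single sentence boundary.
    pieces = re.split(r'[.?!]+\s*', st)
    # Strip stray whitespace and drop pieces that end up empty.
    return [p.strip() for p in pieces if p.strip()]

text = ("Your mother was a hamster, and your father smelt of elderberries! "
        "Now go away, or I shall taunt you a second time. "
        "You know what, never mind. This entire sentence is far too silly. "
        "Wouldn't you agree? I think it is.")

for sentence in split_sentences_stripped(text):
    print(sentence)
```

On the example string from the question this yields the six expected sentences. Like any purely character-based split, it will also break on periods inside abbreviations or decimal numbers, so it is only suitable when the input is known not to contain them.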