I have some arbitrary text stored in $sentences. Using a regular expression, I want to split the text into sentences; see:
function splitSentences($text) {
$re = '/ # Split sentences on whitespace between them.
(?<= # Begin positive lookbehind.
[.!?] # Either an end of sentence punct,
| [.!?][\'"] # or end of sentence punct and quote.
) # End positive lookbehind.
(?<! # Begin negative lookbehind.
Mr\. # Skip either "Mr."
| Mrs\. # or "Mrs.",
| T\.V\.A\. # or "T.V.A.",
# or... (you get the idea).
) # End negative lookbehind.
\s+ # Split on whitespace …

So this is the string s:
"Hi! How are you? I'm fine. It is 6 p.m. Thank you! That's it."
I want it split into an array:
["Hi", "How are you", "I'm fine", "It is 6 p.m", "Thank you", "That's it"]
That means the delimiters should be ". ", "? " and "! " (the punctuation mark followed by a space).
I tried:
let charSet = NSCharacterSet(charactersInString: ".?!")
let array = s.componentsSeparatedByCharactersInSet(charSet)
But this also splits p.m. into two elements. Result:
["Hi", " How are you", " I'm fine", " It is 6 p", "m", " Thank you", " That's it"]
I also tried:
let array = s.componentsSeparatedByString(". ") …
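
For illustration, one way to get the array described above is to treat a delimiter as a punctuation mark followed by whitespace or by the end of the string, so the dot inside p.m. is left alone. This is only a minimal sketch under stated assumptions: it uses current Swift/Foundation naming rather than the older API from the attempts above, the function name `splitOnSentencePunctuation` is hypothetical, and it assumes the input never contains the U+001F control character that is used as a temporary marker.

```swift
import Foundation

/// Hypothetical helper: splits `text` wherever '.', '?' or '!' is followed
/// by whitespace or by the end of the string.
func splitOnSentencePunctuation(_ text: String) -> [String] {
    // Assumption: U+001F (unit separator) never occurs in the input,
    // so it is safe to use as a temporary marker.
    let marker = "\u{1F}"

    // Replace each "punctuation + whitespace/end" run with the marker.
    // A dot followed directly by a letter (the first dot of "p.m.") does not match.
    let marked = text.replacingOccurrences(
        of: "[.?!](?:\\s+|$)",
        with: marker,
        options: .regularExpression
    )

    // Split on the marker and drop the empty piece left after the final delimiter.
    return marked
        .components(separatedBy: marker)
        .filter { !$0.isEmpty }
}

let s = "Hi! How are you? I'm fine. It is 6 p.m. Thank you! That's it."
print(splitOnSentencePunctuation(s))
```

Note that this simple rule still breaks after abbreviations that are followed by a space, such as "Mr. Smith"; handling those would need an exception list along the lines of the negative lookbehind in the PHP pattern above.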