Posted by use*_*140

Splitting text into sentences in R without splitting email IDs or decimal numbers

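I want to split a paragraph into sentences at each period (full stop). But when I do this, decimal numbers and email IDs are also split into separate pieces. Can anyone help me split the data into sentences?

For example: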

aa = "For Important Disclosure information, please visit our website at 0.5%  https://javatar.bluematrix.com/sellside/Disclosures.action or call 1.888.JEFFERIES. An organization. 0.5% have an analysis."

This should be split into:

  1. For Important Disclosure information, please visit our website at 0.5% https://javatar.bluematrix.com/sellside/Disclosures.action or call 1.888.JEFFERIES.
  2. An organization.
  3. 0.5% have an analysis

Code:

sentences = as.matrix(unlist(strsplit(aa,"\\.")))
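The pattern "\\." matches every literal period, so the attempt above also cuts inside 0.5, the URL, and 1.888.JEFFERIES. One possible approach, offered as a minimal sketch and not a general-purpose sentence tokenizer, is to split only on whitespace that directly follows a period. It assumes sentences always end with a period followed by whitespace, and it will wrongly split after abbreviations such as "Dr. Smith". The lookbehind requires perl = TRUE.

```r
aa <- "For Important Disclosure information, please visit our website at 0.5%  https://javatar.bluematrix.com/sellside/Disclosures.action or call 1.888.JEFFERIES. An organization. 0.5% have an analysis."

# Split on runs of whitespace that are immediately preceded by a period.
# Periods inside decimals, URLs, or phone numbers are followed by a
# non-space character, so they never trigger a split.
sentences <- unlist(strsplit(aa, "(?<=\\.)\\s+", perl = TRUE))

# Optional: keep the one-column matrix shape used in the original attempt.
sentences_matrix <- as.matrix(sentences)

print(sentences)
```

Because the whitespace is consumed by the split and the period is only looked at, each sentence keeps its trailing period. If sentences may also end with "?" or "!", the lookbehind could be widened to (?<=[.?!]), again as an assumption about the input and not something the question states.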

regex string split r strsplit
