Recursively splitting a string

Dan*_*ian 0 regex string r mapreduce

Say I have text like this:

pattern = "This_is some word/expression I'd like to parse:intelligently(using special symbols-like '.')"

The challenge is how to split it into words using the following family of word separators:

c(" ","-","/","\\","_",":","(",")",".",",")

Desired result:

"This" "is" "some" "word" "expression" "I'd" "like" "to" "parse" "intelligently" "using" "special" "symbols" "like"

Approach:

I can do this with sapply or a for loop, using:

 keywords = unlist(strsplit(pattern," "))
 keywords = unlist(strsplit(keywords,"-"))

# etc.

Question:

But what would a solution using Reduce(f, x, init, accumulate = TRUE) look like?

A5C*_*2T1 5

You shouldn't need Reduce here. You should be able to do something like the following:

splitters <- c(" ","/","\\","_",":","(",")",".",",","-") # dash should come last
pattern <- paste0("[", paste(splitters, collapse = ""), "]")
string <- "This_is some word/expression I'd like to parse:intelligently(using special symbols-like '.')"
strsplit(string, pattern)[[1]]
#  [1] "This"          "is"            "some"          "word"         
#  [5] "expression"    "I'd"           "like"          "to"           
#  [9] "parse"         "intelligently" "using"         "special"      
# [13] "symbols"       "like"          "'"             "'"  
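For comparison, if a Reduce-based version is still wanted, the chained strsplit calls from the question can be written as a fold over the separators. This is only a sketch: split_on is a hypothetical helper name, and fixed = TRUE is an assumption made here so that each separator is treated as a literal string rather than a regex.

```r
# Hypothetical helper: split every element of `parts` on one literal separator.
split_on <- function(parts, sep) unlist(strsplit(parts, sep, fixed = TRUE))

splitters <- c(" ", "/", "\\", "_", ":", "(", ")", ".", ",", "-")
string <- "This_is some word/expression I'd like to parse:intelligently(using special symbols-like '.')"

# Left fold: the accumulator starts as `string` and is re-split
# on each separator in turn.
words <- Reduce(split_on, splitters, string)
words <- words[nzchar(words)]  # drop any empty strings left by adjacent separators
words
```

With accumulate = TRUE, Reduce would also return every intermediate stage, which isn't needed when only the final word vector matters.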

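Note that a - in a regex character class should come first or last, so I've edited the "splitters" vector accordingly. Also, you may want to add a + at the end of the "pattern", in case you want to collapse multiple spaces.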
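As a rough illustration of the + suggestion, the sketch below builds the same character class with a trailing + and applies it to a made-up input (messy is not from the question) that contains runs of separators:

```r
splitters <- c(" ", "/", "\\", "_", ":", "(", ")", ".", ",", "-")  # dash last
pattern_plus <- paste0("[", paste(splitters, collapse = ""), "]+")

# Made-up input with repeated separators (an assumption for illustration).
messy <- "parse:  intelligently(using   special--symbols"

# Each run of separators now counts as a single split point.
words_plus <- strsplit(messy, pattern_plus)[[1]]
words_plus
# [1] "parse"         "intelligently" "using"         "special"       "symbols"
```

Without the +, each extra separator in a run would produce an empty string in the result.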