Regular expression: eliminate all punctuation except some

Tyl*_*ker 7 r strsplit

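I have the following regular expression, which splits on any whitespace or punctuation. How can I exclude one or more punctuation characters from [:punct:]? Say I want to exclude the apostrophe and the comma. I know I could explicitly use [all punctuation marks in here] instead of [[:punct:]], but I'm hoping there is a way to exclude characters.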

X <- "I'm not that good at regex yet, but am getting better!"
strsplit(X, "[[:space:]]|(?=[[:punct:]])", perl=TRUE)

 [1] "I"       "'"       "m"       "not"     "that"    "good"    "at"      "regex"   "yet"    
[10] ","       ""        "but"     "am"      "getting" "better"  "!"

Jos*_*ich 8

It's not clear to me what result you want, but you can use a negated character class, as in this answer.

R> strsplit(X, "[[:space:]]|(?=[^,'[:^punct:]])", perl=TRUE)[[1]]
 [1] "I'm"     "not"     "that"    "good"    "at"      "regex"   "yet,"   
 [8] "but"     "am"      "getting" "better"  "!"    
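
The class above works as a double negative: it lists the comma, the apostrophe, and [:^punct:] (every non-punctuation character), then negates the whole set, so only punctuation other than the comma and apostrophe remains. The following is a minimal sketch that checks this on a few sample characters. It also compares the result with a negative-lookahead formulation, which is an assumed alternative and not part of the answer; the variable names are made up for illustration.

```r
X <- "I'm not that good at regex yet, but am getting better!"

# Negated class: excludes ",", "'", and every non-punctuation character,
# so it matches only punctuation other than "," and "'".
keep_class <- "[^,'[:^punct:]]"
chars <- c(",", "'", "!", ".", "a", " ")
matched <- grepl(keep_class, chars, perl = TRUE)
print(setNames(matched, chars))

# Pattern from the answer.
by_class <- strsplit(X, "[[:space:]]|(?=[^,'[:^punct:]])", perl = TRUE)[[1]]

# Assumed alternative (not from the answer): reject "," and "'" with a
# negative lookahead, then require a [[:punct:]] character.
by_lookahead <- strsplit(X, "[[:space:]]|(?=(?![,'])[[:punct:]])", perl = TRUE)[[1]]

print(identical(by_class, by_lookahead))
```

To exclude further characters, add them next to the comma and apostrophe inside either bracket expression.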