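I have a very long string from which I want to remove runs of consecutive uppercase words (two or more in a row), including any punctuation that directly follows the last uppercase word of the run. At the same time, I want to keep single uppercase words, as well as uppercase words that are part of a "mixed" word (see the reprex).
I am struggling to get the consecutive-word part working in the reprex.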
string <- "Lorem ipsum DOLOR SIT AMET? consectetuer adipiscing elit. Morbi gravida libero NEC velit. Morbi scelerisque luctus velit. ETIAM-123 dui sem, fermentum vitae, SAGITTIS ID? malesuada in, quam. Proin mattis lacinia justo. Vestibulum facilisis auctor urna. Aliquam IN LOREM SIT amet leo accumsan"
#remove all consecutive UPPERCASE words including punctuation (--> DOLOR SIT AMET?), but not single uppercase words (--> NEC) or "mixed" words with uppercase and digits (--> ETIAM-123)
#this doesn't …
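For illustration only, here is a minimal sketch of one way the behaviour described above might be expressed with a PCRE pattern in base R. It is not the attempt elided above. The function name `remove_upper_runs` is hypothetical, and the sketch assumes that "uppercase" means ASCII `A-Z`, that a word is "mixed" when it is attached to a letter, digit, underscore, or hyphen, and that the whitespace after a removed run should be dropped as well.

```r
# Hypothetical helper: drop runs of two or more consecutive all-uppercase words.
# Assumptions (not stated in the question): ASCII letters only; a hyphen or
# digit attached to a word makes it "mixed"; trailing whitespace is removed too.
remove_upper_runs <- function(x) {
  pattern <- paste0(
    "(?<![\\w-])",        # not preceded by a word character or hyphen
    "[A-Z]+",             # first uppercase word
    "(?:\\s+[A-Z]+)+",    # one or more further uppercase words
    "(?![\\w-])",         # last word must not continue (e.g. "ETIAM-123", "Lorem")
    "[[:punct:]]*",       # punctuation right after the run
    "\\s*"                # whitespace after the run, to avoid double spaces
  )
  gsub(pattern, "", x, perl = TRUE)
}

string <- "Lorem ipsum DOLOR SIT AMET? consectetuer adipiscing elit. Morbi gravida libero NEC velit. Morbi scelerisque luctus velit. ETIAM-123 dui sem, fermentum vitae, SAGITTIS ID? malesuada in, quam. Proin mattis lacinia justo. Vestibulum facilisis auctor urna. Aliquam IN LOREM SIT amet leo accumsan"

cleaned <- remove_upper_runs(string)
print(cleaned)
```

The lookarounds need `perl = TRUE`. Note that under these assumptions one-letter uppercase words count as uppercase words, so a run such as "I AM" would also be removed, and any punctuation immediately after a run (including an opening bracket) is consumed.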