I have strings of the following kind:
A B C Company
XYZ Inc
S & K Co
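I want to remove the spaces in these strings only between words that are one letter long. For example, in the first string I want to remove the spaces between A, B and C, but not the one between C and Company. The result should be: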
ABC Company
XYZ Inc
S&K Co
What is the correct regular expression to use with gsub for this?
hwn*_*wnd 19
Here is one way you could do this, seeing as & is mixed in and is not a word character...
x <- c('A B C Company', 'XYZ Inc', 'S & K Co', 'A B C D E F G Company')
gsub('(?<!\\S\\S)\\s+(?=\\S(?!\\S))', '', x, perl=TRUE)
# [1] "ABC Company" "XYZ Inc" "S&K Co" "ABCDEFG Company"
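Explanation:
First, we assert that the whitespace is not preceded by two non-whitespace characters back to back. Then we match whitespace one or more times. Finally, we look ahead to assert that a non-whitespace character follows, and that the character after it is not a non-whitespace character.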
(?<!        # look behind to see if there is not:
  \S        #   non-whitespace (all but \n, \r, \t, \f, and " ")
  \S        #   non-whitespace (all but \n, \r, \t, \f, and " ")
)           # end of look-behind
\s+         # whitespace (\n, \r, \t, \f, and " ") (1 or more times)
(?=         # look ahead to see if there is:
  \S        #   non-whitespace (all but \n, \r, \t, \f, and " ")
  (?!       #   look ahead to see if there is not:
    \S      #     non-whitespace (all but \n, \r, \t, \f, and " ")
  )         #   end of look-ahead
)           # end of look-ahead
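To see which spaces the lookarounds actually select, one option is to ask for the match positions instead of replacing them. This is only a small sketch using base R's gregexpr; the variable names are illustrative and not part of the answer.

```r
x <- "A B C Company"
pattern <- "(?<!\\S\\S)\\s+(?=\\S(?!\\S))"

# gregexpr returns the start position of every match in the string
m <- gregexpr(pattern, x, perl = TRUE)[[1]]
positions <- as.integer(m)

# The spaces after "A" and "B" match (positions 2 and 4); the space
# before "Company" does not, because "C" is followed by "o" in "Company"
print(positions)
```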
Ric*_*ven 10
The obligatory strsplit / paste answer. This will also catch single characters that are in the middle or at the end of the string.
x <- c('A B C Company', 'XYZ Inc', 'S & K Co',
'A B C D E F G Company', 'Company A B C', 'Co A B C mpany')
foo <- function(x) {
x[nchar(x) == 1L] <- paste(x[nchar(x) == 1L], collapse = "")
paste(unique(x), collapse = " ")
}
vapply(strsplit(x, " "), foo, character(1L))
# [1] "ABC Company" "XYZ Inc" "S&K Co"
# [4] "ABCDEFG Company" "Company ABC" "Co ABC mpany"
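One thing to be aware of with the approach above: foo pastes together every single-character token in the string, adjacent or not, and unique() also drops repeated words, so 'A Company B' would come out as "AB Company". If that matters for your data, a run-aware variant might look like the sketch below. The helper name collapse_runs is hypothetical and not part of the original answer.

```r
# Hypothetical helper: merge only consecutive single-character tokens
collapse_runs <- function(tokens) {
  if (length(tokens) == 0L) return("")
  is_single <- nchar(tokens) == 1L
  # A token starts a new group unless both it and the previous token
  # are single characters
  prev_single <- c(FALSE, is_single[-length(is_single)])
  group_id <- cumsum(!(is_single & prev_single))
  # Paste each group together, then join the groups with spaces
  paste(tapply(tokens, group_id, paste, collapse = ""), collapse = " ")
}

x <- c('A B C Company', 'Co A B C mpany', 'A Company B', 'New York New York')
res <- vapply(strsplit(x, " "), collapse_runs, character(1L))
print(res)
```

Non-adjacent single letters and repeated words are left as they are, at the cost of a slightly longer function.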
Late to the game, but this pattern would work for you:
(?<!\\S\\S)\\s+(?!\\S\\S)
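For completeness, here is how that pattern could be plugged into gsub on the strings from the question. This is a sketch, assuming perl = TRUE is used as in the first answer, since lookarounds need the PCRE engine.

```r
x <- c('A B C Company', 'XYZ Inc', 'S & K Co')

# Remove whitespace that is neither preceded nor followed by two
# consecutive non-whitespace characters
result <- gsub('(?<!\\S\\S)\\s+(?!\\S\\S)', '', x, perl = TRUE)
print(result)
```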