Chr*_*ann 5 regex backreference r
I have a text string like this:
u <- "she goes ~Wha::?~ and he's like ~?Yeah believe me!~ and she's etc."
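What I want to do is replace all characters that occur between pairs of ~ delimiters (including the delimiters themselves) with X.
This gsub approach replaces each substring between a pair of ~ delimiters with a single X: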
gsub("~[^~]+~", "X", u)
[1] "she goes X and he's like X and she's etc."
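However, what I really want is to replace each individual character between the delimiters (and the delimiters themselves) with X, so that every match keeps its original length. The desired output is:
"she goes XXXXXXXX and he's like XXXXXXXXXXXXXXXXXXX and she's etc."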
I have been experimenting with nchar, a backreference, and paste0 as follows, but the result is incorrect:
gsub("(~[^~]+~)", paste0("X{", nchar("\\1"),"}"), u)
[1] "she goes X{2} and he's like X{2} and she's etc."
Any help is appreciated.
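The paste0("X{", nchar("\\1"),"}") code results in X{2} because "\\1" is a string of length 2 (a backslash and the digit 1). R evaluates the replacement argument before gsub sees any match, so \1 is not interpolated as a backreference inside nchar; it is only treated as one when it appears in the replacement string itself.
You can use the following solution with stringr: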
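To make the evaluation order concrete, here is a small sketch in base R. The variable names (literal, len, replacement) are only illustrative and are not part of the original code.

```r
# The replacement argument is built before gsub() runs, so nchar()
# only ever sees the literal two-character string: backslash, "1".
literal <- "\\1"
len <- nchar(literal)                    # length of the literal string, not of a match
replacement <- paste0("X{", len, "}")    # the fixed string that gsub() receives

print(len)
print(replacement)
```

Because gsub receives this fixed string, every match is replaced by the same text regardless of how long the match was.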
> library(stringr)
> u <- "she goes ~Wha::?~ and he's like ~?Yeah believe me!~ and she's etc."
> str_replace_all(u, '~[^~]+~', function(x) str_dup("X", nchar(x)))
[1] "she goes XXXXXXXX and he's like XXXXXXXXXXXXXXXXXXX and she's etc."
Once a match for ~[^~]+~ is found, the matched value is passed to the anonymous function, where str_dup builds a string of X characters with the same length as the match.
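If you would rather avoid the stringr dependency, the same idea can be sketched in base R with gregexpr and the regmatches<- replacement function. This is an alternative under the assumption that strrep is available (R 3.3.0 or later), not part of the solution above; the names m and result are illustrative.

```r
u <- "she goes ~Wha::?~ and he's like ~?Yeah believe me!~ and she's etc."

# Locate every ~...~ span (delimiters included).
m <- gregexpr("~[^~]+~", u)

# Work on a copy, then overwrite each match with as many X's as it has characters.
result <- u
regmatches(result, m) <- lapply(
  regmatches(result, m),
  function(x) strrep("X", nchar(x))
)

print(result)
```

As in the stringr version, the replacement is computed per match, which is what a static replacement string passed to gsub cannot do.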