How to replace the exact number of characters in a string based on occurrence between delimiters in R

Chr*_*ann 5 regex backreference r

I have text strings like this:

u <- "she goes ~Wha::?~ and he's like ~?Yeah believe me!~ and she's etc."

What I want to do is replace all characters that occur between pairs of ~ delimiters (including the delimiters themselves) with X.

This gsub approach replaces each substring between a pair of ~ delimiters with a single X:

gsub("~[^~]+~", "X", u)
[1] "she goes X and he's like X and she's etc."

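However, what I really want is to replace every single character between the delimiters (and the delimiters themselves) with X. The desired output is this: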

"she goes XXXXXXXX and he's like XXXXXXXXXXXXXXXXXXX and she's etc."

I have been experimenting with nchar, a backreference, and paste as follows, but the result is incorrect:

gsub("(~[^~]+~)", paste0("X{", nchar("\\1"),"}"), u)
[1] "she goes X{2} and he's like X{2} and she's etc."

Any help is appreciated.

Wik*_*żew 5

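The paste0("X{", nchar("\\1"),"}") code results in X{2} because R evaluates it before gsub runs, and at that point "\\1" is just a string of length 2 (a backslash followed by 1). \1 is only interpolated as a backreference when the regex engine processes it inside the replacement string, so nchar never sees the matched text. In addition, {2} is inserted literally in a replacement; it does not act as a quantifier there.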
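The evaluation order can be checked directly in a short sketch. The variable name replacement is introduced here for illustration only and does not appear in the question.

```r
u <- "she goes ~Wha::?~ and he's like ~?Yeah believe me!~ and she's etc."

# "\\1" is an ordinary R string holding two characters: a backslash and "1".
nchar("\\1")

# The replacement argument is therefore fully built before gsub() is called.
replacement <- paste0("X{", nchar("\\1"), "}")
replacement

# gsub() only ever receives the fixed string "X{2}" and inserts it literally.
gsub("(~[^~]+~)", replacement, u)
```

Because the replacement is a constant by the time gsub sees it, its length cannot depend on the match. That is why a function-valued replacement is needed.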

You can use the following solution with stringr:

> library(stringr)
> u <- "she goes ~Wha::?~ and he's like ~?Yeah believe me!~ and she's etc."
> str_replace_all(u, '~[^~]+~', function(x) str_dup("X", nchar(x)))
[1] "she goes XXXXXXXX and he's like XXXXXXXXXXXXXXXXXXX and she's etc."

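Once a match for ~[^~]+~ is found, the matched value is passed to the anonymous function, and str_dup creates a string of X characters with the same length as the matched value.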
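If adding a stringr dependency is not an option, the same idea can be sketched in base R. This is an alternative that is not part of the answer above; it assumes R 3.3.0 or later for strrep, and the names m and masked are made up for illustration.

```r
u <- "she goes ~Wha::?~ and he's like ~?Yeah believe me!~ and she's etc."

# Locate every ~...~ span (delimiters included).
m <- gregexpr("~[^~]+~", u)

# Build one run of "X" per match, as long as the matched text,
# then write those runs back into a copy of the string.
masked <- u
regmatches(masked, m) <- lapply(
  regmatches(u, m),
  function(x) strrep("X", nchar(x))
)

masked
nchar(masked) == nchar(u)  # TRUE: each matched character becomes exactly one X
```

As in the stringr version, the replacement is computed per match, after matching, so its length can follow the length of each matched span.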