从数字部分字符串中删除逗号

Tyl*_*ker 7 regex r

我怎样才能(最快的优先)从字符串的数字部分删除逗号而不影响字符串中其余的逗号.所以在下面的例子中我想删除数字部分的逗号,但是狗应该保留逗号(是的,我知道1023455中的逗号是错误的,但只是抛出一个角落的情况).

是)我有的:

x <- "I want to see 102,345,5 dogs, but not too soo; it's 3,242 minutes away"
Run Code Online (Sandbox Code Playgroud)

期望的结果:

[1] "I want to see 1023455 dogs, but not too soo; it's 3242 minutes away"
Run Code Online (Sandbox Code Playgroud)

规定:必须在基础上完成,不添加包装.

先感谢您.

编辑: 谢谢Dason,Greg和Dirk.你的反应都很好.我正在玩一些接近Dason的回应,但在括号内有逗号.现在看它甚至没有意义.我将这两个响应微缩位,因为我需要速度(文本数据):

Unit: microseconds
         expr     min      lq  median      uq     max
1  Dason_0to9  14.461  15.395  15.861  16.328  25.191
2 Dason_digit  21.926  23.791  24.258  24.725  65.777
3        Dirk 127.354 128.287 128.754 129.686 154.410
4      Greg_1  18.193  19.126  19.127  19.594  27.990
5      Greg_2 125.021 125.954 126.421 127.353 185.666
Run Code Online (Sandbox Code Playgroud)

给大家+1.

Das*_*son 9

您可以使用数字本身替换模式(逗号后跟数字).

x <- "I want to see 102,345,5 dogs, but not too soo; it's 3,242 minutes away"
gsub(",([[:digit:]])", "\\1", x)
#[1] "I want to see 1023455 dogs, but not too soo; it's 3242 minutes away"
#or
gsub(",([0-9])", "\\1", x)
#[1] "I want to see 1023455 dogs, but not too soo; it's 3242 minutes away"
Run Code Online (Sandbox Code Playgroud)

  • 但是不要快.;-) (3认同)
  • 对于数字之间的逗号,使用`([0-9]),([0-9])`可能会更加谨慎. (2认同)

Dir*_*tel 7

使用Perl regexp,并专注于"数字逗号数字",然后我们只用数字替换:

R> x <- "I want to see 102,345,5 dogs, but not too soo; it's 3,242 minutes away"
R> gsub("(\\d),(\\d)", "\\1\\2", x, perl=TRUE)
[1] "I want to see 1023455 dogs, but not too soo; it's 3242 minutes away"
R> 
Run Code Online (Sandbox Code Playgroud)


Gre*_*now 6

这里有几个选项:

> tmp <- "I want to see 102,345,5 dogs, but not too soo; it's 3,242 minutes away"
> gsub('([0-9]),([0-9])','\\1\\2', tmp )
[1] "I want to see 1023455 dogs, but not too soo; it's 3242 minutes away"
> gsub('(?<=\\d),(?=\\d)','',tmp, perl=TRUE)
[1] "I want to see 1023455 dogs, but not too soo; it's 3242 minutes away"
> 
Run Code Online (Sandbox Code Playgroud)

它们都匹配一个数字后跟一个逗号后跟一个数字.该[0-9]\d(额外\逃脱第二个,这样它使通过到正规epression)都匹配单个数字.

第一个epression捕获逗号前的数字和逗号后的数字,并在替换字符串中使用它们.基本上把它们拉出来并放回去(但不要把逗号放回去).

第二个版本使用零长度匹配,(?<=\\d)表示在逗号之前需要一个数字才能匹配,但数字本身不是匹配的一部分.该(?=\\d)说认为,需要有以便它匹配逗号后一个数字,但不包括在比赛.所以基本上它与逗号匹配,但前提是前后跟一个数字.由于只匹配逗号,替换字符串为空意味着删除逗号.