How do I replace one character at a time in a string, producing a new string for each replacement?

Pav*_*aha 12 string r stringr

I have a vector of strings:

c("YSAHEEHHYDK", "HEHISSDYAGK", "TFAHTESHISK", "ISLGEHEGGGK", 
"LSSGYDGTSYK", "FGTGTYAGGEK", "VGASTGYSGLK", "TASGVGGFSTK", "SYASDFGSSAK", 
"LYSYYSSTESK")

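For each string, I want to replace "Y", "S", or "T" with "pY", "pS", or "pT". However, I don't want all the replacements to end up in the same final string. I want each replacement to produce its own new string. For example,

"YSAHEEHHYDK" becomes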

c("pYSAHEEHHYDK",
"YpSAHEEHHYDK",
"YSAHEEHHpYDK")

Ony*_*mbu 9

You can write a function for this in base R:

Edit: this now uses the zero-length match idea shown by @GKi.

strings <-  c("YSAHEEHHYDK", "HEHISSDYAGK", "TFAHTESHISK", "ISLGEHEGGGK", 
              "LSSGYDGTSYK", "FGTGTYAGGEK", "VGASTGYSGLK", "TASGVGGFSTK", 
              "SYASDFGSSAK", "LYSYYSSTESK")


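# Locate every Y, S, or T in each string
reg <- gregexpr("[YST]", strings)
# Repeat each string once per match, then treat each match position as a
# zero-length match so that "p" is inserted there instead of replacing anything
`regmatches<-`(rep(strings, lengths(reg)), 
              `attr<-`(unlist(reg), "match.length", 0),  value = 'p')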

#>  [1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK" "HEHIpSSDYAGK" "HEHISpSDYAGK"
#>  [6] "HEHISSDpYAGK" "pTFAHTESHISK" "TFAHpTESHISK" "TFAHTEpSHISK" "TFAHTESHIpSK"
#> [11] "IpSLGEHEGGGK" "LpSSGYDGTSYK" "LSpSGYDGTSYK" "LSSGpYDGTSYK" "LSSGYDGpTSYK"
#> [16] "LSSGYDGTpSYK" "LSSGYDGTSpYK" "FGpTGTYAGGEK" "FGTGpTYAGGEK" "FGTGTpYAGGEK"
#> [21] "VGApSTGYSGLK" "VGASpTGYSGLK" "VGASTGpYSGLK" "VGASTGYpSGLK" "pTASGVGGFSTK"
#> [26] "TApSGVGGFSTK" "TASGVGGFpSTK" "TASGVGGFSpTK" "pSYASDFGSSAK" "SpYASDFGSSAK"
#> [31] "SYApSDFGSSAK" "SYASDFGpSSAK" "SYASDFGSpSAK" "LpYSYYSSTESK" "LYpSYYSSTESK"
#> [36] "LYSpYYSSTESK" "LYSYpYSSTESK" "LYSYYpSSTESK" "LYSYYSpSTESK" "LYSYYSSpTESK"
#> [41] "LYSYYSSTEpSK"

Created on 2023-02-14 with reprex v2.0.2

You can wrap this in a small helper function:

my_replace <- function(x){
  reg <- gregexpr("[YST]", x)
  `regmatches<-`(rep(x, lengths(reg)), structure(unlist(reg), match.length = 0), value = "p")
}
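As a more spelled-out alternative to the replacement-function call, the same zero-width insertion can be sketched with substring() and paste0(). The name insert_p is hypothetical and not part of the answer, and returning nothing for a string with no Y, S, or T is an assumption made here.

```r
# Hypothetical helper (not from the answer): insert "p" before each Y, S, or T,
# producing one new string per match.
insert_p <- function(x) {
  pos <- as.integer(gregexpr("[YST]", x)[[1]])  # match positions, or -1 if none
  if (pos[1] == -1L) return(character(0))       # assumption: no match yields no strings
  # Text before the match, then "p", then the text from the match onwards
  paste0(substring(x, 1, pos - 1), "p", substring(x, pos))
}

insert_p("YSAHEEHHYDK")
#> [1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK"

# For a whole vector, apply per string and flatten the result
unlist(lapply(c("YSAHEEHHYDK", "ISLGEHEGGGK"), insert_p))
```

This is slower than the vectorised regmatches<- call on long vectors, but each step is easier to inspect.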


G. *_*eck 7

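Using the input xx shown in the Note at the end (the same as in the question, plus some edge-case tests), we can use stringi functions. Note in particular that stri_sub can insert a "p" character. If an input string is empty, i.e. "", or contains no Y, S, or T, then NA is returned for that string.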

library(stringi)

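add_p <- function(s, loc) {
  start <- loc[, "start"]
  # from = start, to = start - 1 is a zero-length range, so "p" is inserted before each match
  stri_sub(s, start, start-1) <- "p"
  s
}
# Apply add_p to each string together with its matrix of match locations
Map(add_p, xx, stri_locate_all(xx, regex = "[YST]"))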

giving

[[1]]
[1] NA

$ABC
[1] NA

$YSAHEEHHYDK
[1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK"

$HEHISSDYAGK
[1] "HEHIpSSDYAGK" "HEHISpSDYAGK" "HEHISSDpYAGK"

$TFAHTESHISK
[1] "pTFAHTESHISK" "TFAHpTESHISK" "TFAHTEpSHISK" "TFAHTESHIpSK"

# ...snip...
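The zero-width insertion that add_p relies on can be checked in isolation. This is a minimal sketch on a single string from the question, with the match positions written out by hand rather than located with stri_locate_all:

```r
library(stringi)

s <- "YSAHEEHHYDK"
start <- c(1, 2, 9)  # positions of Y, S, and Y in s, typed in by hand for illustration

# The range from `start` to `start - 1` is empty, so nothing is replaced and
# "p" is inserted before each position; one result is produced per position.
stri_sub(s, start, start - 1) <- "p"
s
#> [1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK"
```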

Note

This is the same input as in the question, except that we have added the first two strings.

xx <- c("", "ABC", "YSAHEEHHYDK", "HEHISSDYAGK", "TFAHTESHISK", "ISLGEHEGGGK", 
"LSSGYDGTSYK", "FGTGTYAGGEK", "VGASTGYSGLK", "TASGVGGFSTK", "SYASDFGSSAK", 
"LYSYYSSTESK")


mar*_*usl 5

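Perhaps something similar with stringr and purrr.

str_locate_all() returns, for each string, a two-column matrix with the start and end positions of the pattern matches, and str_sub(string, start) <- "p" conveniently accepts that same matrix as start. Subtracting 1 from the end column (i.e. [1, 1] becomes [1, 0]) keeps all existing characters and inserts "p".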

library(stringr)
library(purrr)

str_ <-  c("YSAHEEHHYDK", "HEHISSDYAGK", "TFAHTESHISK", "ISLGEHEGGGK", 
           "LSSGYDGTSYK", "FGTGTYAGGEK", "VGASTGYSGLK", "TASGVGGFSTK", 
           "SYASDFGSSAK", "LYSYYSSTESK")


map2(set_names(str_),
     str_locate_all(str_,"Y|S|T"),
     function(x, y) { 
       # end = start - 1 makes each range zero-length, so "p" is inserted rather than replacing
       y[,2] <- y[,2] - 1
       str_sub(x, y) <- "p"
       x
       })

The result is a named list:

#> $YSAHEEHHYDK
#> [1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK"
#> 
#> $HEHISSDYAGK
#> [1] "HEHIpSSDYAGK" "HEHISpSDYAGK" "HEHISSDpYAGK"
#> 
#> $TFAHTESHISK
#> [1] "pTFAHTESHISK" "TFAHpTESHISK" "TFAHTEpSHISK" "TFAHTESHIpSK"
#> 
#> $ISLGEHEGGGK
#> [1] "IpSLGEHEGGGK"
#> 
#> $LSSGYDGTSYK
#> [1] "LpSSGYDGTSYK" "LSpSGYDGTSYK" "LSSGpYDGTSYK" "LSSGYDGpTSYK" "LSSGYDGTpSYK"
#> [6] "LSSGYDGTSpYK"
#> 
#> $FGTGTYAGGEK
#> [1] "FGpTGTYAGGEK" "FGTGpTYAGGEK" "FGTGTpYAGGEK"
#> 
#> $VGASTGYSGLK
#> [1] "VGApSTGYSGLK" "VGASpTGYSGLK" "VGASTGpYSGLK" "VGASTGYpSGLK"
#> 
#> $TASGVGGFSTK
#> [1] "pTASGVGGFSTK" "TApSGVGGFSTK" "TASGVGGFpSTK" "TASGVGGFSpTK"
#> 
#> $SYASDFGSSAK
#> [1] "pSYASDFGSSAK" "SpYASDFGSSAK" "SYApSDFGSSAK" "SYASDFGpSSAK" "SYASDFGSpSAK"
#> 
#> $LYSYYSSTESK
#> [1] "LpYSYYSSTESK" "LYpSYYSSTESK" "LYSpYYSSTESK" "LYSYpYSSTESK" "LYSYYpSSTESK"
#> [6] "LYSYYSpSTESK" "LYSYYSSpTESK" "LYSYYSSTEpSK"

Created on 2023-02-15 with reprex v2.0.2
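
If a single flat character vector is preferred over a named list (as in the base R answer), the list can be flattened with unlist(). The following is a small sketch on two of the input strings; the variable name res is introduced here for illustration only.

```r
library(stringr)
library(purrr)

str_ <- c("YSAHEEHHYDK", "ISLGEHEGGGK")

# Same approach as above: one zero-length range per match, then insert "p"
res <- map2(set_names(str_),
            str_locate_all(str_, "Y|S|T"),
            function(x, y) {
              y[, 2] <- y[, 2] - 1
              str_sub(x, y) <- "p"
              x
            })

# Drop the list structure and the names to get one plain character vector
unlist(res, use.names = FALSE)
#> [1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK" "IpSLGEHEGGGK"
```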