Of course, I can replace specific characters one at a time like this:
mydata=c("á","é","ó")
mydata=gsub("á","a",mydata)
mydata=gsub("é","e",mydata)
mydata=gsub("ó","o",mydata)
mydata
But surely there is an easier way to do all of this, right? I did not find the gsub help page very comprehensive.
The*_*ras 33
An interesting question! I think the simplest option is to write a special function, something like a "multi" gsub():
mgsub <- function(pattern, replacement, x, ...) {
  if (length(pattern) != length(replacement)) {
    stop("pattern and replacement do not have the same length.")
  }
  result <- x
  # Apply each substitution in turn; seq_along() is safe for empty input.
  for (i in seq_along(pattern)) {
    result <- gsub(pattern[i], replacement[i], result, ...)
  }
  result
}
This gives me:
> mydata <- c("á","é","ó")
> mgsub(c("á","é","ó"), c("a","e","o"), mydata)
[1] "a" "e" "o"
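Because the extra arguments in `...` are handed straight to gsub(), options such as `fixed = TRUE` can be passed through. A minimal sketch, assuming you want patterns like "." treated as literal text rather than as regular expressions (the function is repeated here only so the snippet runs on its own; the input string is made up for illustration):

```r
# Same loop-based helper as above, repeated so this snippet is self-contained.
mgsub <- function(pattern, replacement, x, ...) {
  if (length(pattern) != length(replacement)) {
    stop("pattern and replacement do not have the same length.")
  }
  result <- x
  for (i in seq_along(pattern)) {
    result <- gsub(pattern[i], replacement[i], result, ...)
  }
  result
}

# fixed = TRUE is forwarded to gsub(), so "." and "+" are matched literally.
out <- mgsub(c(".", "+"), c("dot", "plus"), "a.b+c", fixed = TRUE)
out
# [1] "adotbplusc"
```

Without `fixed = TRUE`, "." would be read as a regular expression that matches any character, which is probably not what you want here.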
Rco*_*ter 25
Perhaps this might be useful:
iconv('áéóÁÉÓçã', to="ASCII//TRANSLIT")
[1] "aeoAEOca"
Mac*_*iej 11
You can use the stringi package to replace these characters.
> stri_trans_general(c("á","é","ó"), "latin-ascii")
[1] "a" "e" "o"
This is very similar to @kith's answer, but in function form, and it covers the most common diacritics:
removeDiscritics <- function(string) {
  chartr(
    "ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ",
    "SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy",
    string
  )
}
removeDiscritics("test áéíóú")
"test aeiou"
Another mgsub implementation, using Reduce:
mystring = 'This is good'
myrepl = list(c('o', 'a'), c('i', 'n'))
mgsub2 <- function(myrepl, mystring) {
  # Each element of myrepl is a c(pattern, replacement) pair.
  gsub2 <- function(l, x) {
    do.call('gsub', list(x = x, pattern = l[1], replacement = l[2]))
  }
  Reduce(gsub2, myrepl, init = mystring, right = TRUE)
}
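The problem with some of the implementations above (e.g., Theodore Lytras's) is that when the patterns are more than one character long, they can conflict if one pattern is a substring of another. One way to solve this is to create a copy of the object and perform the pattern replacement in that copy. This is implemented in my package bayesbio, which is available on CRAN.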
mgsub <- function(pattern, replacement, x, ...) {
  n = length(pattern)
  if (n != length(replacement)) {
    stop("pattern and replacement do not have the same length.")
  }
  result = x
  # Match every pattern against the original x, not the partly replaced result.
  for (i in seq_len(n)) {
    result[grep(pattern[i], x, ...)] = replacement[i]
  }
  return(result)
}
Here is a test case:
asdf = c(4, 0, 1, 1, 3, 0, 2, 0, 1, 1)
res = mgsub(c("0", "1", "2"), c("10", "11", "12"), asdf)
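To make the difference concrete, here is a rough side-by-side sketch on the same test vector. The names `mgsub_seq` and `mgsub_copy` are made up for this comparison; they restate the loop-based version and the copy-based version from this thread so the snippet runs on its own. Note that the copy-based version replaces the whole matching element rather than just the matched substring.

```r
# Sequential version: each gsub() sees the output of the previous one.
mgsub_seq <- function(pattern, replacement, x, ...) {
  stopifnot(length(pattern) == length(replacement))
  result <- x
  for (i in seq_along(pattern)) {
    result <- gsub(pattern[i], replacement[i], result, ...)
  }
  result
}

# Copy-based version: patterns are always matched against the original x,
# and every matching element is replaced as a whole.
mgsub_copy <- function(pattern, replacement, x, ...) {
  stopifnot(length(pattern) == length(replacement))
  result <- x
  for (i in seq_along(pattern)) {
    result[grep(pattern[i], x, ...)] <- replacement[i]
  }
  result
}

asdf <- c(4, 0, 1, 1, 3, 0, 2, 0, 1, 1)
pat <- c("0", "1", "2")
rep <- c("10", "11", "12")

res_seq <- mgsub_seq(pat, rep, asdf)
res_seq
# [1] "4"   "110" "11"  "11"  "3"   "110" "12"  "110" "11"  "11"

res_copy <- mgsub_copy(pat, rep, asdf)
res_copy
# [1] "4"  "10" "11" "11" "3"  "10" "12" "10" "11" "11"
```

In the sequential run, "0" first becomes "10", and the "1" in that result is then rewritten again to "11", giving "110". Matching against the untouched original avoids that second pass.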