Ale*_*tov 2 regex r abbreviation
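I have strings that contain state names. How can I abbreviate them efficiently? I know about state.abb[grep("New York", state.name)], but that only works when "New York" is the entire string. For example, I have "Walmart in New York". Thanks in advance!
Let's assume this input: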
x = c("Walmart, New York", "Hobby Lobby (California)", "Sold in Sears in Illinois")
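Edit: the desired output would be something like "Walmart, NY", "Hobby Lobby (CA)", "Sold in Sears in IL". As this shows, the state can appear in the string in several different ways.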
Here is a base R approach, using gregexpr(), regmatches(), and regmatches<-():
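abbreviateStateNames <- function(x) {
    # Build one alternation pattern from all built-in state names
    pat <- paste(state.name, collapse="|")
    # Locate every state name in each string
    m <- gregexpr(pat, x)
    # Map matched full names to their two-letter abbreviations
    ff <- function(x) state.abb[match(x, state.name)]
    # Replace the matches in place
    regmatches(x, m) <- lapply(regmatches(x, m), ff)
    x
}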
x <- c("Hobby Lobby (California)",
"Hello New York City, here I come (from Greensboro North Carolina)!")
abbreviateStateNames(x)
# [1] "Hobby Lobby (CA)"
# [2] "Hello NY City, here I come (from Greensboro NC)!"
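To make the mechanics of the base R approach easier to follow, here is a minimal sketch that runs the same steps one at a time on the input from the question. The intermediate names found, abbr, and result are my own, introduced only for illustration.

```r
# Input from the question
x <- c("Walmart, New York", "Hobby Lobby (California)", "Sold in Sears in Illinois")

# One alternation over all built-in state names
pat <- paste(state.name, collapse = "|")

# Step 1: locate every state name in each string
m <- gregexpr(pat, x)

# Step 2: extract the matched text (a list with one character vector per string)
found <- regmatches(x, m)

# Step 3: map each full name to its abbreviation
abbr <- lapply(found, function(s) state.abb[match(s, state.name)])

# Step 4: write the abbreviations back in place of the matches
result <- x
regmatches(result, m) <- abbr
result
# [1] "Walmart, NY"         "Hobby Lobby (CA)"    "Sold in Sears in IL"
```

Strings without any state name come back unchanged, because regmatches() yields an empty vector for them and there is nothing to replace.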
Or, more naturally, you can accomplish the same thing with the gsubfn package:
library(gsubfn)
pat <- paste(state.name, collapse="|")
gsubfn(pat, function(x) state.abb[match(x, state.name)], x)
# [1] "Hobby Lobby (CA)"
# [2] "Hello NY City, here I come (from Greensboro NC)!"
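One caveat that applies to both approaches: the pattern matches a state name anywhere in the text, including inside a longer word such as "Washingtonian". If that matters for your data, a possible variant (a sketch, not part of the answer above) wraps the alternation in \\b word boundaries. The function name abbreviateWholeStateNames and the lookup vector are hypothetical names chosen for this example.

```r
# Hypothetical variant: only abbreviate state names that stand as whole words
abbreviateWholeStateNames <- function(x) {
    # Named lookup: full state name -> abbreviation
    lookup <- setNames(state.abb, state.name)
    # \\b anchors the alternation at word edges on both sides
    pat <- paste0("\\b(", paste(state.name, collapse = "|"), ")\\b")
    m <- gregexpr(pat, x)
    regmatches(x, m) <- lapply(regmatches(x, m), function(s) unname(lookup[s]))
    x
}

y <- abbreviateWholeStateNames("A Washingtonian in Washington")
y
# [1] "A Washingtonian in WA"
```

Matching is still case-sensitive, so "new york" would be left as-is unless you also pass ignore.case = TRUE to gregexpr() and normalize the match before the lookup.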