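I have a very unstructured text file that I read in with readLines. I want to replace certain strings with other strings that are stored in a variable (called new below).
In the example below, I want the manipulated text to contain each of the terms "one", "two", "three" and "four" exactly once, in place of the "change" strings. However, as you can see, sub changes the first match in each element on every pass, whereas I need each occurrence of "change" to take the next of the quoted strings in new.
See the example code and data below.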
#text to be changed
text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
"TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change",
"TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")
#Variable containing input for text
new <- c("one", "two", "three", "four")
#For loop that I want to include
for (i in seq_along(new)) {
  text <- sub(pattern = "change", replacement = new[i], x = text)
}
text
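How about this? The logic is to hammer away at each string until it no longer contains change. On every "hit" (each place where change is found), move one step along the new vector.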
text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
"TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change",
"TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")
#Variable containing input for text
new <- c("one", "two", "three", "four")
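new.i <- 1  # index of the next unused replacement in new
for (i in seq_along(text)) {
  # Keep replacing the first remaining "change" until text[i] has none left
  while (grepl(pattern = "change", x = text[i])) {
    text[i] <- sub(pattern = "change", replacement = new[new.i], x = text[i])
    new.i <- new.i + 1
  }
}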
text
[1] "TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one"
[2] "TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three"
[3] "TEXT TEXT TEXT four TEXT TEXT TEXT TEXT"
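
If you need this more than once, the same idea can be wrapped in a function. The sketch below is only one possible variant, and replace_in_order is a made-up name, not something from base R. It counts the matches up front with gregexpr, so it can stop with an error when new has too few values instead of silently inserting NA. It assumes the pattern is a literal string (hence fixed = TRUE), that text has no NA elements, and that none of the replacement values contains the pattern itself.

```r
# Hypothetical helper (not part of base R): replace each occurrence of a
# literal `pattern` in `text`, in order, with successive elements of `new`.
replace_in_order <- function(text, pattern, new) {
  # Count literal occurrences per element; gregexpr() gives -1 for "no match"
  hits <- gregexpr(pattern, text, fixed = TRUE)
  n_hits <- vapply(hits, function(m) sum(m > 0), integer(1))
  if (sum(n_hits) > length(new)) {
    stop("not enough replacement values in 'new'")
  }
  k <- 0L  # number of replacement values used so far
  for (i in seq_along(text)) {
    # Replace the first remaining match once per counted occurrence
    for (j in seq_len(n_hits[i])) {
      k <- k + 1L
      text[i] <- sub(pattern, new[k], text[i], fixed = TRUE)
    }
  }
  text
}

text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")
new <- c("one", "two", "three", "four")
result <- replace_in_order(text, "change", new)
result
```

With the example data this should give the same three strings as the loop above. Counting first also means the loop always terminates, which the while version does not guarantee if a replacement value happens to contain "change".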