Replacing words in an unstructured text file using a for loop

Gor*_*orp 9 loops for-loop r

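I have a very unstructured text file that I read in with readLines. I want to replace certain strings with other strings, which are stored in a variable (called "new" below).

Below, I want the manipulated text to contain each of the terms "one", "two", "three" and "four" exactly once, in place of the "change" strings. However, as you can see, sub replaces the first match of the pattern in every element, whereas I need the code to ignore the fact that the text is split across separate quoted strings.

See the example code and data below.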

 #text to be changed
 text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
        "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change", 
        "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")

 #Variable containing input for text
 new <- c("one", "two", "three", "four")
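 # For loop I tried: sub() is vectorised over text, so each pass replaces
 # the first "change" in every element with the same new[i]
 for (i in seq_along(new)) {

   text  <- sub(pattern = "change", replacement = new[i], x = text)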

 }
 text

Rom*_*rik 9

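How about this? The logic is to keep hammering at a string until it no longer contains change. On each "hit" (wherever change is found), move one step along the new vector.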

text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change", 
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")

#Variable containing input for text
new <- c("one", "two", "three", "four")
new.i <- 1

for (i in seq_along(text)) {
  while (grepl(pattern = "change", text[i])) {
    text[i] <- sub(pattern = "change", replacement = new[new.i], x = text[i])
    new.i <- new.i + 1
  }
}
text

[1] "TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one" 
[2] "TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three"
[3] "TEXT TEXT TEXT four TEXT TEXT TEXT TEXT" 
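The same idea can be wrapped in a small function so that it does not rely on global counters. The following is only a sketch: the name replace_sequentially and its arguments are made up for illustration, and it assumes the pattern should be matched literally (fixed = TRUE) rather than as a regular expression. It also stops with an error if there are more occurrences than replacement values, a case the loop above does not check for.

```r
# Hypothetical helper (not part of the original answer): replace each
# occurrence of `pattern` in `x`, in order of appearance, with successive
# elements of `replacements`.
replace_sequentially <- function(x, pattern, replacements) {
  k <- 1  # index of the next unused replacement
  for (i in seq_along(x)) {
    # Keep replacing the first remaining match in x[i] until none is left
    while (grepl(pattern, x[i], fixed = TRUE)) {
      if (k > length(replacements)) {
        stop("more occurrences of the pattern than replacement values")
      }
      x[i] <- sub(pattern, replacements[k], x[i], fixed = TRUE)
      k <- k + 1
    }
  }
  x
}

text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")
new <- c("one", "two", "three", "four")

result <- replace_sequentially(text, "change", new)
result
```

One caveat: if a replacement value itself contains the pattern, the while loop will keep matching it, so in this sketch the call would eventually end with the error above instead of finishing.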