Posts by Fra*_*xxx

Using grepl in R to find which words from a list are present in a data frame column

I have a data frame df:

df <- structure(list(page = c(12, 6, 9, 65),
text = structure(c(4L,2L, 1L, 3L), 
.Label = c("I just bought a brand new AudiA6", "Get 2 years engine replacement warranty on BMW X6", 
"Volkswagen is the parent company of BMW", "ToyotaCorolla is offering new car exchange offers"), 
class = "factor")), .Names = c("page","text"), row.names = c(NA, -4L), class = "data.frame")

I also have a list of words:

wordlist <- c("Audi", "BMW", "extended", "engine", "replacement", "Volkswagen", "company", "Toyota","exchange", "brand")

To check whether the words in wordlist are present in the text column, I unlist the text and use grepl:

library(data.table)
setDT(df)[, match := paste(wordlist[unlist(lapply(wordlist, function(x) grepl(x, text, …
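
For reference, here is a minimal base R sketch of the per-row matching described above. It is not a completion of the truncated line. It assumes literal, case-sensitive substring matching (fixed = TRUE), stores text as character rather than factor, and writes the result to a column named matched, which is a hypothetical name chosen here to avoid shadowing base::match.

```r
# Sample data from the question, with text as character instead of factor
df <- data.frame(
  page = c(12, 6, 9, 65),
  text = c("ToyotaCorolla is offering new car exchange offers",
           "Get 2 years engine replacement warranty on BMW X6",
           "I just bought a brand new AudiA6",
           "Volkswagen is the parent company of BMW"),
  stringsAsFactors = FALSE
)

wordlist <- c("Audi", "BMW", "extended", "engine", "replacement",
              "Volkswagen", "company", "Toyota", "exchange", "brand")

# For each row, test every word against that row's text and
# paste the words that were found into one comma-separated string.
# 'matched' is an illustrative column name, not from the original post.
df$matched <- vapply(df$text, function(txt) {
  found <- vapply(wordlist, grepl, logical(1), x = txt, fixed = TRUE)
  paste(wordlist[found], collapse = ", ")
}, character(1), USE.NAMES = FALSE)

print(df[, c("page", "matched")])
```

Because the matching is by substring, "Audi" is found inside "AudiA6" and "Toyota" inside "ToyotaCorolla". Words are reported in wordlist order, and a row with no hits would get an empty string.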

r string-matching grepl data.table

Score: 1 · Answers: 1 · Views: 78
