Matching text against a data frame column in R

Nei*_*eil 4 r

I have a vector of words in R.

words = c("Awesome","Loss","Good","Bad")

I also have the following data frame in R:

ID           Response
1            Today is an awesome day
2            Yesterday was a bad day,but today it is good
3            I have losses today

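I want to extract the words that match in the Response column and put them in a new column of the data frame. The final output should look like this: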

ID           Response                                        Match          Count
1            Today is an awesome day                         Awesome        1
2            Yesterday was a bad day,but today it is good    Bad,Good       2
3            I have losses today                             Loss           1

I tried the following in R:

sapply(words,grepl,df$Response)

It matches the words, but how do I get the data frame in the desired format? Please help.
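
For reference, here is a small sketch of what that sapply() call returns. It assumes the sample data exactly as typed above (the data.frame() construction is mine, not the asker's). The result is a logical matrix with one row per response and one column per word. Note that grepl() is case-sensitive by default, so with this exact data nothing matches until ignore.case = TRUE is passed.

```r
words <- c("Awesome", "Loss", "Good", "Bad")

# Assumed reconstruction of the data frame shown in the question
df <- data.frame(
  ID = 1:3,
  Response = c("Today is an awesome day",
               "Yesterday was a bad day,but today it is good",
               "I have losses today"),
  stringsAsFactors = FALSE
)

# Case-sensitive (the default): "Awesome" does not match "awesome"
m_cs <- sapply(words, grepl, df$Response)

# Case-insensitive: extra arguments are passed through to grepl()
m_ci <- sapply(words, grepl, df$Response, ignore.case = TRUE)

print(m_cs)
print(m_ci)
```

Each column of the matrix is named after the word it was tested with, which is what makes it possible to recover the matched words per row afterwards.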

joe*_*son 5

Using base R (credit also to PereG for help with a more concise way to compute df$Count):

# extract the list of matching words
x <- sapply(words, function(x) grepl(tolower(x), tolower(df$Response)))

# paste the matching words together
df$Words <- apply(x, 1, function(i) paste0(names(i)[i], collapse = ","))

# count the number of matching words
df$Count <- apply(x, 1, function(i) sum(i))

# df
#  ID                                     Response    Words Count
#1  1                      Today is an awesome day  Awesome     1
#2  2 Yesterday was a bad day,but today it is good Good,Bad     2
#3  3                          I have losses today     Loss     1
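
A self-contained variant of the approach above, offered as a sketch rather than the answerer's exact method. It assumes the sample data from the question, matches the words literally (fixed = TRUE) instead of treating them as regular expressions, and uses vapply() plus matrix() so the result stays a matrix even when df has only one row. The names hits and w are illustrative.

```r
words <- c("Awesome", "Loss", "Good", "Bad")

# Assumed reconstruction of the data frame shown in the question
df <- data.frame(
  ID = 1:3,
  Response = c("Today is an awesome day",
               "Yesterday was a bad day,but today it is good",
               "I have losses today"),
  stringsAsFactors = FALSE
)

# One logical vector per word: does the lower-cased word occur in each response?
hits <- vapply(
  words,
  function(w) grepl(tolower(w), tolower(df$Response), fixed = TRUE),
  logical(nrow(df))
)

# Force a rows-by-words matrix; with a single-row df, vapply() returns a vector
hits <- matrix(hits, nrow = nrow(df), dimnames = list(NULL, words))

# Collapse the matched words per row and count them
df$Words <- apply(hits, 1, function(r) paste(words[r], collapse = ","))
df$Count <- rowSums(hits)

print(df)
```

Rows with no match get an empty string in Words and 0 in Count. The matched words appear in the order of the words vector, not in the order they occur in the response.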

  • The question seems to involve partial matching ("Loss" and "losses"); df$Count <- apply(sapply(tolower(words),grepl,df$Response),1,sum) works (2 upvotes)
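
To illustrate the partial-matching point in the comment, here is a minimal sketch that compares substring matching with whole-word matching. It assumes the sample responses from the question. The word-boundary pattern built with "\\b" and the use of perl = TRUE are my choices for the illustration and are not taken from the thread.

```r
words <- c("Awesome", "Loss", "Good", "Bad")
responses <- c("Today is an awesome day",
               "Yesterday was a bad day,but today it is good",
               "I have losses today")

# Substring matching: "Loss" is found inside "losses"
partial <- sapply(words, function(w) {
  grepl(w, responses, ignore.case = TRUE)
})

# Whole-word matching: \b anchors the pattern at word boundaries,
# so "Loss" no longer matches "losses"
whole <- sapply(words, function(w) {
  grepl(paste0("\\b", w, "\\b"), responses, ignore.case = TRUE, perl = TRUE)
})

print(rowSums(partial))  # counts per response with partial matching
print(rowSums(whole))    # counts per response with whole-word matching
```

Since the desired output in the question lists Loss for "I have losses today", substring matching is the behaviour that reproduces it. The whole-word version is only useful if such partial hits should be excluded.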