R grep and exact matching

Question

It seems grep is "greedy" in the way it returns matches. Suppose I have the following data:

Sources <- c(
                "Coal burning plant",
                "General plant",
                "coalescent plantation",
                "Charcoal burning plant"
        )

Registry <- seq(from = 1100, to = 1103, by = 1)

df <- data.frame(Registry, Sources)


If I run grep("(?=.*[Pp]lant)(?=.*[Cc]oal)", df$Sources, perl = TRUE, value = TRUE), it returns

"Coal burning plant"     
"coalescent plantation"  
"Charcoal burning plant"


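However, I only want to return exact matches, i.e. only entries where the whole words "coal" and "plant" occur. I don't want "coalescent", "plantation", and so on. So for this data, I only want to see "Coal burning plant".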

Answer 1

hwn*_*wnd 8

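You want to use word boundaries (\b) around your word patterns. A word boundary does not consume any characters. It asserts that there is a word character on one side and not on the other. You may also want to consider using the inline (?i) modifier for case-insensitive matching.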

grep('(?i)(?=.*\\bplant\\b)(?=.*\\bcoal\\b)', df$Sources, perl = TRUE, value = TRUE)
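
To see the effect of the word boundaries, here is a minimal sketch that runs the question's original pattern and the boundary version side by side on the same data. The variable names (loose, strict, strict_alt) are illustrative only, and the last variant, which combines two separate grepl() calls instead of lookaheads, is an assumed alternative rather than part of the answer above.

```r
# Sample data from the question
Sources <- c(
  "Coal burning plant",
  "General plant",
  "coalescent plantation",
  "Charcoal burning plant"
)

# Original pattern: the lookaheads find "plant" and "coal" anywhere,
# including inside longer words such as "plantation" or "Charcoal".
loose <- grep("(?=.*[Pp]lant)(?=.*[Cc]oal)", Sources, perl = TRUE, value = TRUE)

# Boundary pattern: \\b on both sides requires each word to stand alone.
strict <- grep("(?i)(?=.*\\bplant\\b)(?=.*\\bcoal\\b)", Sources,
               perl = TRUE, value = TRUE)

# Assumed alternative without lookaheads: test each word separately, then AND.
has_plant <- grepl("\\bplant\\b", Sources, ignore.case = TRUE, perl = TRUE)
has_coal  <- grepl("\\bcoal\\b",  Sources, ignore.case = TRUE, perl = TRUE)
strict_alt <- Sources[has_plant & has_coal]

print(loose)   # three matches, as in the question
print(strict)  # "Coal burning plant"
```

"Charcoal burning plant" drops out because there is no boundary between the "r" and the "c" in "Charcoal", so \bcoal\b cannot match there.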


