What is the difference between using str_detect() and contains()?

Question

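I know this may be a silly question, but I'm curious whether there is any difference between the two. I prefer str_detect because its syntax makes more sense in my head.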

Answer 1

Yes, there is a big difference. First, contains() is a "selection helper" and must be used inside a selecting function (usually one from the tidyverse).

So you cannot apply contains() to a vector or use it as a standalone function - i.e., you cannot do this:

x <- c("Hello", "and", "welcome (example)") 

tidyselect::contains("Hello", x)

Otherwise you get the error:

Error: ! `contains()` must be used within a *selecting* function.

stringr::str_detect(), by contrast, works on vectors and as a standalone function:

stringr::str_detect(x, "Hello")
Returns:

[1] TRUE FALSE FALSE
Second, stringr::str_detect() accepts regular expressions, whereas tidyselect::contains() only looks for literal strings.

For example, the following works:

library(dplyr)  # provides %>%, select() and filter()

df <- data.frame(col1 = c("Hello", "and", "welcome (example)"))

df %>% select(contains("1"))
#                col1
# 1             Hello
# 2               and
# 3 welcome (example)
But this does not:

df %>% select(contains("\\d"))
(\\d is the R regular expression for "any digit".)

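In addition, as @abagail pointed out, contains() looks at column names, not at the values stored in the columns. For example, df %>% select(contains("1")) above worked and returned the column col1 (because the column name has a "1" in it). But trying to filter for values that contain a particular pattern does not work: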

df %>% filter(contains("Hello"))
This returns the same error:

Caused by error: ! `contains()` must be used within a *selecting* function.

But you can filter on the values in a column with stringr::str_detect():

df %>% filter(stringr::str_detect(col1, "Hello"))
#    col1
# 1 Hello
Finally, if you are looking for a similar function outside stringr, tidyselect::matches() accepts regular expressions. As @GregorThomas aptly pointed out in the comments:

"tidyselect::matches is a closer analog of str_detect() - though still, as a selection helper, it only works inside a selecting function."
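A minimal sketch of the contrast between the two selection helpers, assuming dplyr is installed; the one-column data frame is rebuilt here only so that the snippet runs on its own, and the names literal and regex are illustrative:

```r
library(dplyr)

df <- data.frame(col1 = c("Hello", "and", "welcome (example)"))

# contains() treats "\\d" as a literal string; no column name contains
# a backslash followed by "d", so nothing is selected.
literal <- df %>% select(contains("\\d"))
ncol(literal)

# matches() treats "\\d" as a regular expression ("any digit"),
# so the "1" in the column name col1 is matched.
regex <- df %>% select(matches("\\d"))
names(regex)
```

Both calls still operate on column names only; neither looks at the values stored in col1.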

str_detect is also equivalent to base R's grepl, though with the pattern and string arguments in the opposite order (i.e., str_detect(string, pattern) is equivalent to grepl(pattern, string)).
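The swapped argument order can be checked with a short sketch, assuming stringr is installed; the pattern "^H" and the variable names are chosen purely for illustration:

```r
library(stringr)

x <- c("Hello", "and", "welcome (example)")

# Same pattern and data, opposite argument order
from_stringr <- str_detect(x, "^H")  # string first, then pattern
from_base    <- grepl("^H", x)       # pattern first, then string

identical(from_stringr, from_base)
```

The equivalence holds for simple cases like this one. The two functions use different regular expression engines and treat missing values differently (str_detect() returns NA for an NA input, grepl() returns FALSE), so results can diverge on edge cases.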
