Grepping specifically for "SN", "+SN", or "-SN" in R without partially matching the others

eth*_*ane 0 grep r

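I'm trying to grep data from a "weather conditions" column that holds multiple indicators for different weather types. I want to grep "+SN", "SN", and "-SN" separately, but I'm having a hard time avoiding partial matches.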

Here is an example of what the column to be grepped might contain:

c("-SN", " ", "SN FR", "HZ +SN", "SN", "+SN", " ", "+BC -SN")

Grepping "-SN" works fine, but grepping "+SN" is tricky because + is itself a regex operator. Using an escape character gives me the following error:

> grep("\+SN", aa)
Error: '\+' is an unrecognized escape in character string starting ""\+"

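Grepping "SN" without also getting "+SN" or "-SN" is a challenge too. As you can see, I can't use anchors such as ^SN$ or ^SN to exclude the + or - sign, because a single entry may hold several indicators, and the one I'm looking for may come before or after another indicator. Is there a grep != or -v equivalent in R? How would you go about this? Regular expressions in R seem more limited in functionality.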

Thanks.

Avi*_*Raj 5

You need to use a regex based on negative lookarounds.

> x <- c("-SN", " ", "SN FR", "HZ +SN", "SN", "+SN", " ", "+BC -SN")
> regmatches(x, regexpr("(?<!\\S)[-+]?SN(?!\\S)", x, perl=TRUE))
[1] "-SN" "SN"  "+SN" "SN"  "+SN" "-SN"

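(?<!\\S) asserts that the match is not immediately preceded by a non-whitespace character, and (?!\\S) asserts that it is not immediately followed by one. Both lookarounds need perl=TRUE.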
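
To get the three indicators separately rather than together, the same lookarounds can be reused with one pattern per indicator. This is only a minimal sketch; the variable names plus_sn, minus_sn and plain_sn are made up for illustration. Note the doubled backslash: the R string "\\+" is what the regex engine receives as \+, a literal plus sign.

```r
x <- c("-SN", " ", "SN FR", "HZ +SN", "SN", "+SN", " ", "+BC -SN")

# One whitespace-delimited pattern per indicator (perl = TRUE enables lookarounds).
plus_sn  <- grepl("(?<!\\S)\\+SN(?!\\S)", x, perl = TRUE)  # "+" escaped as "\\+"
minus_sn <- grepl("(?<!\\S)-SN(?!\\S)",   x, perl = TRUE)  # "-" needs no escape here
plain_sn <- grepl("(?<!\\S)SN(?!\\S)",    x, perl = TRUE)  # bare SN, no sign in front

which(plus_sn)   # 4 6
which(minus_sn)  # 1 8
which(plain_sn)  # 3 5
```

Each result is a logical vector of the same length as x, so it can be used directly to subset the column or to build indicator columns.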

OR

Use anchors to do an exact string match.

> x <- c("-SN", " ", "SN FR", "HZ +SN", "SN", "+SN", " ", "+BC -SN")
> regmatches(x, regexpr("^[-+]?SN$", x))
[1] "-SN" "SN"  "+SN"

OR

> grep("^[-+]?SN$", x, value=TRUE)
[1] "-SN" "SN"  "+SN"
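
The error in the question comes from R's string parser rather than from the regex engine: "\+" is not a valid escape in an R string literal, so the backslash itself has to be doubled. As a small sketch (the names escaped and bracketed are illustrative), here are two ways to write a literal + in the anchored pattern:

```r
x <- c("-SN", " ", "SN FR", "HZ +SN", "SN", "+SN", " ", "+BC -SN")

# "\\+" in R source becomes the regex \+, i.e. a literal plus sign.
escaped <- grep("^\\+SN$", x, value = TRUE)

# Inside a character class, + is literal, so no backslash is needed.
bracketed <- grep("^[+]SN$", x, value = TRUE)

escaped    # "+SN"
bracketed  # "+SN"
```

Because of the ^ and $ anchors, only entries consisting of the indicator alone are returned; an entry such as "HZ +SN" is not matched by this form.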

OR

To get SN on its own, i.e. an SN that is not preceded by + or -:

> x <- c("-SN", " ", "SN FR", "HZ +SN", "SN", "+SN", " ", "+BC -SN")
> regmatches(x, regexpr("(?<![+-])SN\\b", x, perl=TRUE))
[1] "SN" "SN"
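
As for a grep -v equivalent: grep() accepts invert = TRUE, which returns the elements that do not match. And if the indicators are always separated by whitespace (an assumption about the data, not something the question guarantees), splitting each entry into tokens avoids regex escaping altogether. A rough sketch, with has_sn and others as illustrative names:

```r
x <- c("-SN", " ", "SN FR", "HZ +SN", "SN", "+SN", " ", "+BC -SN")

# Split every entry on whitespace and test for an exact token match.
tokens <- strsplit(x, "\\s+")
has_sn <- vapply(tokens, function(tok) "SN" %in% tok, logical(1))
which(has_sn)  # 3 5

# invert = TRUE is the counterpart of grep -v: indices that do NOT match.
others <- grep("(?<![+-])SN\\b", x, perl = TRUE, invert = TRUE)
others  # 1 2 4 6 7 8
```

The token approach works the same way for "+SN" or "-SN": replace "SN" in the %in% test with the indicator wanted, with no escaping required.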