How to remove everything before the first comma in R

Question

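I am trying to remove the text up to and including the first comma in strings that contain one or more commas. For some reason, every pattern I try removes everything up to the last comma instead.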

The strings look like:

OCR - (some text), Variant - (some text), Bad Subtype - (some text)

My regex is returning:

Bad Subtype - (some text)

when the desired output is:

Variant - (some text), Bad Subtype - (some text)

Variant is not guaranteed to come second.

# Select all rows whose Tags column begins with OCR
clean <- subset(all, grepl("^OCR", all$Tags))
# Trim the OCR text up to the first comma, and store it in a new column called Tag
clean$Tag <- gsub(".*,", "", clean$Tags)

or

clean$Tag <- gsub(".*\\,", "", clean$Tags)

or

clean$Tag <- sub(".*,", "", clean$Tags)

and so on.

Answer 1

Rui*_*das 6

Here is a regex that does the job.

x <- "OCR - (some text), Variant - (some text), Bad Subtype - (some text) and my regex is returning: Bad Subtype - (some text) when the desired output is: Variant - (some text), Bad Subtype - (some text)"

sub("^[^,]*,", "", x)
#[1] " Variant - (some text), Bad Subtype - (some text) and my regex is returning: Bad Subtype - (some text) when the desired output is: Variant - (some text), Bad Subtype - (some text)"

Explanation

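1. ^ matches the beginning of the string;
2. ^[^,]* matches, from the beginning, any character except "," repeated zero or more times;
3. ^[^,]*, is the pattern in point 2 above followed by a comma.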

The text matched by this pattern is replaced by the empty string "".
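
To illustrate why the attempts in the question stop at the last comma, here is a minimal sketch comparing the greedy pattern with the negated character class. The sample string is a shortened version of the question's data, and the trailing "\\s*" is an addition of this sketch (not part of the answer's pattern) that also drops the space after the comma.

```r
# Shortened sample string in the same shape as the question's data.
x <- "OCR - (some text), Variant - (some text), Bad Subtype - (some text)"

# Greedy: ".*" runs to the end of the string and backtracks only as far as
# the LAST comma, so everything up to that comma is removed.
greedy <- sub(".*,", "", x)

# Negated class: "[^,]*" cannot cross a comma, so the match ends at the
# FIRST one. The added "\\s*" also removes the space that follows it.
first_only <- sub("^[^,]*,\\s*", "", x)

# Lazy quantifier: ".*?" also stops at the first comma.
lazy <- sub("^.*?,\\s*", "", x, perl = TRUE)

# A string without a comma has nothing to match and is returned unchanged.
no_comma <- sub("^[^,]*,\\s*", "", "OCR - (some text)")

print(greedy)
print(first_only)
```

Switching from gsub to sub does not help on its own, because the greedy ".*" already consumes everything up to the last comma in a single match.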

Answer 2

akr*_*run 5

One option is trimws from base R (the whitespace argument requires R 3.6.0 or later):

trimws(x, whitespace = "^[^,]+,\\s*")

Output

#[1] "Variant - (some text), Bad Subtype - (some text) and my regex is returning: Bad Subtype - (some text) when the desired output is: Variant - (some text), Bad Subtype - (some text)"

Data

x <- "OCR - (some text), Variant - (some text), Bad Subtype - (some text) and my regex is returning: Bad Subtype - (some text) when the desired output is: Variant - (some text), Bad Subtype - (some text)"
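
Applied to the data frame workflow from the question, the same idea might look like the sketch below. The data frame `all` and its contents are made up for illustration; only the column names Tags and Tag come from the question. Passing which = "left" is a choice of this sketch so that trimws only touches the start of each string.

```r
# Hypothetical stand-in for the question's data frame `all` with a Tags column.
all <- data.frame(
  Tags = c(
    "OCR - (a), Variant - (b), Bad Subtype - (c)",
    "OCR - (a), Bad Subtype - (c)",
    "Other - (d), Variant - (b)"
  ),
  stringsAsFactors = FALSE
)

# Keep only the rows whose Tags start with "OCR".
clean <- subset(all, grepl("^OCR", Tags))

# Strip the first comma-delimited segment; which = "left" limits trimws()
# to the start of each string.
clean$Tag <- trimws(clean$Tags, which = "left", whitespace = "^[^,]+,\\s*")

# The same result with sub(), which does not need the whitespace argument.
clean$Tag_sub <- sub("^[^,]+,\\s*", "", clean$Tags)

print(clean$Tag)
```

Both columns should hold the text after the first comma, regardless of which tag comes second.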
