Extract a word from a data.frame column

vel*_*ock 1 r extract dataframe

My data contains a column like this:

df <- data.frame(status = c("GET/sfuksd1567","GET/sjsh787","POST/hsfhuks","GET/sfukfiezd17","POST/fshks"), stringsAsFactors = FALSE)

I want to automatically create another column that acts as an indicator for the status variable and contains only "GET" or "POST", i.e. df$ind = c("GET","GET","POST","GET","POST").

I tried the substr function, but without success.

Original data:

> df
           status
1  GET/sfuksd1567
2     GET/sjsh787
3    POST/hsfhuks
4 GET/sfukfiezd17
5      POST/fshks

Expected result:

> df
           status  ind
1  GET/sfuksd1567  GET
2     GET/sjsh787  GET
3    POST/hsfhuks POST
4 GET/sfukfiezd17  GET
5      POST/fshks POST

Dav*_*urg 10

You can use a regular expression to remove the slash and everything after it:

df$ind <- sub("/.*", "", df$status)
df
#            status  ind
# 1  GET/sfuksd1567  GET
# 2     GET/sjsh787  GET
# 3    POST/hsfhuks POST
# 4 GET/sfukfiezd17  GET
# 5      POST/fshks POST
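Since the question mentions substr, here is a minimal sketch of how it could be made to work, assuming every value contains a "/". A fixed end position fails because "GET" has three characters and "POST" has four, so the position of the slash is looked up first with regexpr. The name slash_pos is only illustrative.

```r
df <- data.frame(status = c("GET/sfuksd1567", "GET/sjsh787", "POST/hsfhuks",
                            "GET/sfukfiezd17", "POST/fshks"),
                 stringsAsFactors = FALSE)

# Position of the first "/" in each string (fixed = TRUE: literal match, not a regex)
slash_pos <- as.integer(regexpr("/", df$status, fixed = TRUE))

# Keep the characters before the slash
df$ind <- substr(df$status, 1, slash_pos - 1)
df$ind
# [1] "GET"  "GET"  "POST" "GET"  "POST"
```

If a value has no slash, regexpr returns -1 and this sketch yields an empty string for that row, so such rows would need separate handling.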

Or, if you don't like regular expressions, you could try

library(tidyr)
separate(df, "status", c("ind", "status"))

Or

library(data.table) # requires v1.9.6 or later
# setDT() converts df to a data.table by reference; tstrsplit() splits on "/"
setDT(df)[, tstrsplit(status, "/")]

Or

read.table(text = df$status, sep = "/")

The last three options simply split the status column into two separate columns.
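If the goal is the layout from the question (status left untouched, plus a new ind column), a splitting approach can still be used by keeping only the first piece. This is a base R sketch using strsplit, which is a substitute for, not a restatement of, the three options above:

```r
df <- data.frame(status = c("GET/sfuksd1567", "GET/sjsh787", "POST/hsfhuks",
                            "GET/sfukfiezd17", "POST/fshks"),
                 stringsAsFactors = FALSE)

# Split each string on the literal "/" and take the first piece of every split
pieces <- strsplit(df$status, "/", fixed = TRUE)
df$ind <- vapply(pieces, `[`, character(1), 1)
df
#            status  ind
# 1  GET/sfuksd1567  GET
# 2     GET/sjsh787  GET
# 3    POST/hsfhuks POST
# 4 GET/sfukfiezd17  GET
# 5      POST/fshks POST
```

vapply is used rather than sapply so that the result is guaranteed to be a character vector with one element per row.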