vel*_*ock 1 r extract dataframe
In my data, there is a column like the following:
df <- data.frame(status = c("GET/sfuksd1567","GET/sjsh787","POST/hsfhuks","GET/sfukfiezd17","POST/fshks"), stringsAsFactors = FALSE)
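I want to automatically create another column that serves as an indicator for the status variable, extracting only "GET" or "POST", i.e. df$ind = c("GET","GET","POST","GET","POST").
I tried the substr function, but without success.
Original data: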
> df
status
1 GET/sfuksd1567
2 GET/sjsh787
3 POST/hsfhuks
4 GET/sfukfiezd17
5 POST/fshks
Expected result:
> df
status ind
1 GET/sfuksd1567 GET
2 GET/sjsh787 GET
3 POST/hsfhuks POST
4 GET/sfukfiezd17 GET
5 POST/fshks POST
Dav*_*urg 10
You can use a regular expression to remove the forward slash and everything after it
df$ind <- sub("/.*", "", df$status)
df
# status ind
# 1 GET/sfuksd1567 GET
# 2 GET/sjsh787 GET
# 3 POST/hsfhuks POST
# 4 GET/sfukfiezd17 GET
# 5 POST/fshks POST
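Since the question mentions substr, here is a minimal sketch of how substr could reach the same result by first locating the slash with regexpr. It assumes every value contains at least one "/" (a value without one would yield an empty string), and the column name ind2 is only illustrative.

```r
df <- data.frame(status = c("GET/sfuksd1567","GET/sjsh787","POST/hsfhuks","GET/sfukfiezd17","POST/fshks"), stringsAsFactors = FALSE)

# Regex approach: delete the first "/" and everything after it
df$ind <- sub("/.*", "", df$status)

# substr approach: find the position of the first "/" in each value,
# then keep the characters that come before it
slash_pos <- as.integer(regexpr("/", df$status, fixed = TRUE))
df$ind2 <- substr(df$status, 1, slash_pos - 1)

# Both columns should agree on this data
identical(df$ind, df$ind2)
```

substr alone needs fixed start and stop positions, which is why it does not work directly here: "GET" and "POST" have different lengths.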
Or, if you don't like regular expressions, you can try
library(tidyr)
separate(df, "status", c("ind", "status"))
Or
library(data.table) ## V1.9.6+
setDT(df)[, tstrsplit(status, "/")]
Or
read.table(text = df$status, sep = "/")
The last three options simply split the status column into two separate columns.
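If the goal is to keep status intact and only add ind, the splitting idea can be adapted in base R. The following is a rough sketch rather than part of the original answer: the intermediate names parts and ind_rt are hypothetical, and it assumes "/" is the only separator that matters.

```r
df <- data.frame(status = c("GET/sfuksd1567","GET/sjsh787","POST/hsfhuks","GET/sfukfiezd17","POST/fshks"), stringsAsFactors = FALSE)

# Split each value on a literal "/" and keep only the first piece
parts <- strsplit(df$status, "/", fixed = TRUE)
df$ind <- vapply(parts, `[`, character(1), 1)

# The read.table option can be used the same way by taking its first column
ind_rt <- read.table(text = df$status, sep = "/", stringsAsFactors = FALSE)[[1]]

df
```

Either way the original status column is left unchanged and only the new column is added.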