Create new variables based on specific values

Con*_*lta 5 regex r stringr dplyr

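I have read up on regular expressions and on Hadley Wickham's stringr and dplyr packages, but I can't figure out how to get this to work.

I have library circulation data in a data frame, with the call number stored as a character variable. I'd like to take the initial capital letters and make them a new variable, and put the digits between those letters and the period into a second new variable.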

Call_Num
HV5822.H4 C47 Circulating Collection, 3rd Floor
QE511.4 .G53 1982 Circulating Collection, 3rd Floor
TL515 .M63 Circulating Collection, 3rd Floor
D753 .F4 Circulating Collection, 3rd Floor
DB89.F7 D4 Circulating Collection, 3rd Floor 

jaz*_*rro 4

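Using the stringi package would be one option. Since your targets sit at the beginning of each string, stri_extract_first() works well. The pattern [:alpha:]{1,} matches a run of one or more letters, so stri_extract_first() with that pattern returns the first run of letters in each string. Likewise, you can get the first run of digits with stri_extract_first(x, regex = "\\d{1,}").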

x <- c("HV5822.H4 C47 Circulating Collection, 3rd Floor",
       "QE511.4 .G53 1982 Circulating Collection, 3rd Floor",
       "TL515 .M63 Circulating Collection, 3rd Floor",
       "D753 .F4 Circulating Collection, 3rd Floor",
       "DB89.F7 D4 Circulating Collection, 3rd Floor")

library(stringi)

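# First run of letters and first run of digits in each call number;
# both columns come back as character vectors
data.frame(alpha = stri_extract_first(x, regex = "[:alpha:]{1,}"),
           number = stri_extract_first(x, regex = "\\d{1,}"))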

#  alpha number
#1    HV   5822
#2    QE    511
#3    TL    515
#4     D    753
#5    DB     89
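Since the question asks for the two pieces as new variables in the data frame, here is a minimal base R sketch of the same idea that needs no extra packages. It is not part of the answer above. It assumes every call number starts with capital letters followed directly by digits. The data frame name `circ` and the column names `class_letters` and `class_number` are made up for illustration; only `Call_Num` comes from the question.

```r
# Hypothetical data frame `circ`; the Call_Num values are from the question.
circ <- data.frame(
  Call_Num = c("HV5822.H4 C47 Circulating Collection, 3rd Floor",
               "QE511.4 .G53 1982 Circulating Collection, 3rd Floor",
               "TL515 .M63 Circulating Collection, 3rd Floor",
               "D753 .F4 Circulating Collection, 3rd Floor",
               "DB89.F7 D4 Circulating Collection, 3rd Floor"),
  stringsAsFactors = FALSE
)

# Group 1: leading capital letters. Group 2: the digits right after them.
pattern <- "^([A-Z]+)([0-9]+).*$"

# sub() returns non-matching strings unchanged, so flag matches first
# and fill NA for any row that does not follow the assumed layout.
matched <- grepl(pattern, circ$Call_Num)

circ$class_letters <- ifelse(matched,
                             sub(pattern, "\\1", circ$Call_Num),
                             NA_character_)
circ$class_number  <- as.integer(ifelse(matched,
                                        sub(pattern, "\\2", circ$Call_Num),
                                        NA_character_))

print(circ[, c("class_letters", "class_number")])
```

Anchoring the pattern with `^` ties the match to the start of the string, and converting the digits with as.integer() gives a numeric column rather than the character column that the stringi version returns. If some call numbers have a space between the letters and the digits, the pattern would need adjusting.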