I am trying to subset a data frame by searching for strings. My df looks like this:
dput(df)
structure(list(Cause = structure(c(2L, 1L), .Label = c("jasper not able to read the property table after the release",
"More than 7000 messages loaded which stuck up"), class = "factor"),
Resolution = structure(1:2, .Label = c("jobs and reports are processed",
"Updated the property table which resolved the issue."), class = "factor")), .Names = c("Cause",
"Resolution"), class = "data.frame", row.names = c(NA, -2L))
I want to do this:
df1<-subset(df, grepl("*MQ*|*queue*|*Queue*", df$Cause))
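That is, search the Cause column for MQ, queue, or Queue, and subset the data frame df to the matching records. It does not seem to work: it also captures records in which none of the strings MQ, queue, or Queue is present.
Is this the right way to do it, or is there another approach I could follow?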
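The regex below seems to work. I have added a row to your data.frame to make it a more interesting example.
I think the problem comes from the * characters in your regex. In a regular expression, * is a quantifier ("zero or more of the preceding item"), not a wildcard, and grepl already matches anywhere in the string, so the asterisks are not needed. I also added parentheses to define groups around the | alternatives, but I don't think that is mandatory.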
df <- data.frame(Cause=c("jasper not able to read the property table after the release",
"More than 7000 messages loaded which stuck up",
"blabla Queue blabla"),
Resolution = c("jobs and reports are processed",
"Updated the property table which resolved the issue.",
"hop"))
> head(df)
Cause Resolution
1 jasper not able to read the property table after the release jobs and reports are processed
2 More than 7000 messages loaded which stuck up Updated the property table which resolved the issue.
3 blabla Queue blabla hop
> subset(df, grepl("(MQ)|(queue)|(Queue)", df$Cause))
Cause Resolution
3 blabla Queue blabla hop
Is this what you want?
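For reference, here is a minimal sketch of two equivalent ways to write the pattern without the parentheses, using the three-row example data from the answer. The names hits_plain, hits_class, and df1 are only illustrative, and stringsAsFactors = FALSE is an assumption made here so that Cause stays a character column; grepl would also accept a factor.

```r
# Rebuild the three-row example data frame from the answer.
df <- data.frame(
  Cause = c("jasper not able to read the property table after the release",
            "More than 7000 messages loaded which stuck up",
            "blabla Queue blabla"),
  Resolution = c("jobs and reports are processed",
                 "Updated the property table which resolved the issue.",
                 "hop"),
  stringsAsFactors = FALSE  # assumption: keep the columns as character
)

# grepl() matches anywhere in the string, so plain alternation is enough.
hits_plain <- grepl("MQ|queue|Queue", df$Cause)

# A character class covers both capitalisations of "queue" in one branch.
hits_class <- grepl("MQ|[Qq]ueue", df$Cause)

# Logical row indexing selects the same rows that subset() would.
df1 <- df[hits_class, ]
print(df1)
```

Both patterns should select only the third row here. If case should not matter at all, grepl also takes ignore.case = TRUE, though that would additionally match a lowercase "mq".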