Chr*_*ris 1 string r dataframe
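I have a 3-column data frame that I want to search, and a list of strings that I want to look for in each column. I want to return a data frame containing the original data plus one column for each string in the list, indicating whether that string was found in any column of that row.
Here is a simplified version that approximates my data.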
strings <- c("ape", "bear", "cat", "dog")
# A tibble: 7 x 3
  snippet                  headline          abstract
  <chr>                    <chr>             <chr>
1 this is an ape           An ape            some random
2 blah blah blah           An ape            some random
3 this is some random text some random text  some ape stuff
4 this is a bear           this is a bear    bear time
5 some cat text            bear time         dog time
6 cat and dog text         blah blah         blah
7 blah blah blah           this is just text blah
Output of dput(df):
dput(df)
structure(list(snippet = c("this is an ape", "blah blah blah",
"this is some random text", "this is a bear", "some cat text",
"cat and dog text", "blah blah blah"), headline = c("An ape",
"An ape", "some random text", "this is a bear", "bear time",
"blah blah", "this is just text"), abstract = c("some random",
"some random", "some ape stuff", "bear time", "dog time", "blah",
"blah")), .Names = c("snippet", "headline", "abstract"), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -7L))
I would like it to return a data frame like the one below:
# A tibble: 7 x 7
  snippet                  headline          abstract       ape   bear  cat   dog
  <chr>                    <chr>             <chr>          <lgl> <lgl> <lgl> <lgl>
1 this is an ape           An ape            some random    TRUE  FALSE FALSE FALSE
2 blah blah blah           An ape            some random    TRUE  FALSE FALSE FALSE
3 this is some random text some random text  some ape stuff TRUE  FALSE FALSE FALSE
4 this is a bear           this is a bear    bear time      FALSE TRUE  FALSE FALSE
5 some cat text            bear time         dog time       FALSE TRUE  TRUE  TRUE
6 cat and dog text         blah blah         blah           FALSE FALSE TRUE  TRUE
7 blah blah blah           this is just text blah           FALSE FALSE FALSE FALSE
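I have been using grepl to return the desired rows, but there is surely a better way to do this that also keeps track of which string matched which row.
Thanks in advance for your help.
Since you don't need to identify which column a string was found in, you can collapse each row into a single string and run grepl on that.
Something like: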
strings <- c("ape", "bear", "cat", "dog")
df$colStrings <- with(df, paste(snippet, headline, abstract, sep = ","))
sapply(strings, function(x) grepl(x, df$colStrings))
# ape bear cat dog
# [1,] TRUE FALSE FALSE FALSE
# [2,] TRUE FALSE FALSE FALSE
# [3,] TRUE FALSE FALSE FALSE
# [4,] FALSE TRUE FALSE FALSE
# [5,] FALSE TRUE TRUE TRUE
# [6,] FALSE FALSE TRUE TRUE
# [7,] FALSE FALSE FALSE FALSE
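The sapply() call above returns a logical matrix, while the question asks for the original data with the indicator columns attached. A minimal sketch of one way to get there, assuming the df and strings from the question (rebuilt here as a plain data.frame so the snippet runs on its own). The names row_text, hits and result are illustrative, and fixed = TRUE is an added assumption that the search strings are literal text rather than regular expressions.

```r
# Example data rebuilt from the question's dput() output.
df <- data.frame(
  snippet  = c("this is an ape", "blah blah blah", "this is some random text",
               "this is a bear", "some cat text", "cat and dog text",
               "blah blah blah"),
  headline = c("An ape", "An ape", "some random text", "this is a bear",
               "bear time", "blah blah", "this is just text"),
  abstract = c("some random", "some random", "some ape stuff", "bear time",
               "dog time", "blah", "blah"),
  stringsAsFactors = FALSE
)
strings <- c("ape", "bear", "cat", "dog")

# Collapse each row into one string, kept in a separate vector so that
# no helper column is added to df.
row_text <- paste(df$snippet, df$headline, df$abstract, sep = ",")

# One logical column per search string; sapply() names the columns after
# the strings. fixed = TRUE (an assumption) matches the strings literally.
hits <- sapply(strings, function(x) grepl(x, row_text, fixed = TRUE))

# Attach the indicator columns to the original data.
result <- cbind(df, hits)
print(result)
```

Note that this matches substrings, so "ape" would also hit a word such as "grape". If whole-word matches are needed, one option is to drop fixed = TRUE and wrap each string in word boundaries, for example grepl(paste0("\\b", x, "\\b"), row_text).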