I have data of this type:

    df <- structure(list(Utterance = c("(5.127)", ">like I don't understand< sorry like how old's your mom¿",
      "(0.855)", "eh six:ty:::-one=", "(0.101)", "(0.487)", "[((v: gasps)) she said] ~no you're [not?]~",
      "[((v: gasps)) she said] ~no you're [not?]~", "~<[NO YOU'RE] NOT (.) you can't go !in!>~",
      "(0.260)", "show her [your boobs] next time"),
      Q = c(NA, "q_wh", "", "", NA, NA, "q_really", "", "", NA, NA),
      Sequ = c(NA, 1L, 1L, 1L, NA, NA, 0L, 0L, 0L, NA, NA)), class = "data.frame", row.names = c(NA, -11L))

I want to extract/filter those rows where `Sequ` is not `NA`, together with the row immediately preceding each such run (i.e. a row where `Sequ` is `NA`). My attempt so far is to define a function that gets the indices of the relevant rows:
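
    # Note: lag() here is dplyr::lag(), so dplyr must be loaded before the function is called
    QA_sequ <- function(value) {
      # first non-NA position of each run (previous value is NA)
      inds <- which(!is.na(value) & lag(is.na(value)))
      # that position plus the row just before it
      sort(unique(c(inds-1, inds)))
    }

and then to slice out the rows by those indices: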

    library(dplyr)
    df %>%
      slice(QA_sequ(Sequ))
                                                     Utterance        Q Sequ
    1                                                  (5.127)     <NA>   NA
    2 >like I don't understand< sorry like how old's your mom¿     q_wh    1
    3                                                  (0.487)     <NA>   NA
    4               [((v: gasps)) she said] ~no you're [not?]~ q_really    0

However, this only picks up the immediately preceding row and the first `Sequ` row of each run. The result I want to obtain is this:

                                                     Utterance        Q Sequ
    1                                                  (5.127)     <NA>   NA
    2 >like I don't understand< sorry like how old's your mom¿     q_wh    1
    3                                                  (0.855)             1
    4                                        eh six:ty:::-one=             1
    5                                                  (0.487)     <NA>   NA
    6               [((v: gasps)) she said] ~no you're [not?]~ q_really    0
    7               [((v: gasps)) she said] ~no you're [not?]~             0
    8                ~<[NO YOU'RE] NOT (.) you can't go !in!>~             0

EDIT:

The solution I've come up with feels cumbersome:

    QA_sequ <- function(value) {
      inds <- which(!is.na(value) & lag(is.na(value)))
      sort(unique(c(inds-1))) # extract only preceding row!
    }

    library(dplyr)
    df %>%
      mutate(id = row_number()) %>%
      slice(QA_sequ(Sequ)) %>%
      # append all non-NA rows, then restore the original order via id
      bind_rows(., df %>% mutate(id = row_number()) %>% filter(!is.na(Sequ))) %>%
      arrange(id)

How about this?

    library(dplyr)
    df %>%
      filter(!is.na(Sequ) | lead(!is.na(Sequ), default=FALSE))
    #                                                  Utterance        Q Sequ
    # 1                                                  (5.127)     <NA>   NA
    # 2 >like I don't understand< sorry like how old's your mom¿     q_wh    1
    # 3                                                  (0.855)             1
    # 4                                        eh six:ty:::-one=             1
    # 5                                                  (0.487)     <NA>   NA
    # 6               [((v: gasps)) she said] ~no you're [not?]~ q_really    0
    # 7               [((v: gasps)) she said] ~no you're [not?]~             0
    # 8                ~<[NO YOU'RE] NOT (.) you can't go !in!>~             0

The logic filters (extracts) both of the following:
\nNA值NA下一个值不是的任何值NA| 归档时间: |