Par*_*gue 3 r regex-lookarounds dplyr
I have a data frame whose column names look like this:
[127] "quiz.32.player.submitted_answer_private" "quiz.32.player.rescue_event"
[129] "quiz.33.player.solution" "quiz.33.player.submitted_answer"
[131] "quiz.33.player.submitted_answer_private" "quiz.33.player.rescue_event"
[133] "partner_quiz.1.player.solution" "partner_quiz.1.player.submitted_answer"
[135] "partner_quiz.1.player.submitted_answer_private" "partner_quiz.1.player.rescue_event"
[137] "partner_quiz.2.player.solution" "partner_quiz.2.player.submitted_answer"
[139] "partner_quiz.2.player.submitted_answer_private" "partner_quiz.2.player.rescue_event"
I am trying to split each name into the part to the left of the last period and the part to the right of it. To do that, my dplyr pipeline is as follows:
frame <- data %>%
  gather(k, value) %>%
  separate(k, into = c("quiz_number", "suffix"), sep = "\\.(?=player)")
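For some reason, the resulting data.frame omits all of the columns prefixed with "partner". Any idea why?
Edit: the resulting split should contain everything to the left of the last period in the quiz_number column (e.g. quiz.32.player and partner_quiz.2.player), and everything to the right of the last period in the suffix column (e.g. submitted_answer_private and solution).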
Instead of 'player' in the regex lookahead, use a positive lookahead for characters that are not a . up to the end of the string ($):
library(dplyr)
library(tidyr)
data %>%
  gather(k, value) %>%
  separate(k, into = c("quiz_number", "suffix"), sep = "\\.(?=[^.]+$)")
In the OP's code, the pattern matches the . before the string 'player', but there are further .s after 'player', e.g. quiz.32.player.rescue_event, so the split does not happen at the last period.
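
To see the difference between the two patterns without a full data frame, here is a minimal sketch that applies them to a few of the column names from the question. It uses base R's strsplit() with perl = TRUE as a stand-in for tidyr::separate(), which is an assumption made to keep the example self-contained; the variable names cols, parts and before_player are made up for illustration.

```r
# A few column names copied from the question.
cols <- c(
  "quiz.33.player.submitted_answer_private",
  "quiz.32.player.rescue_event",
  "partner_quiz.1.player.solution"
)

# Split on the last period: a "." followed only by
# non-period characters up to the end of the string.
last_dot <- strsplit(cols, "\\.(?=[^.]+$)", perl = TRUE)
parts <- do.call(rbind, last_dot)
colnames(parts) <- c("quiz_number", "suffix")
print(parts)

# The pattern from the question splits on the period
# just before "player" instead of on the last period.
before_player <- strsplit(cols, "\\.(?=player)", perl = TRUE)
print(before_player[[2]])
```

With the lookahead `[^.]+$`, each name breaks into exactly two pieces at its final period, so the left piece keeps the trailing "player" and the right piece is the bare suffix. With the lookahead `player`, the second piece still contains a period ("player.rescue_event"), which is not the split described in the question's edit.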