Pao*_*tto 2 r time-series dplyr tidyverse
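I have data from an experiment in which we time human decisions. Subjects choose repeatedly among a set of alternatives (call them A, B, C, D) over a 30-second period, and we record the time of the first choice, then the second, and so on up to the Nth choice (subjects can change their minds). The data look like this (time in milliseconds):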
subject time choice
1 2204 A
1 3673 B
1 8435 C
1 12640 B
1 24031 A
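I would like to discretize and expand the data so that I have the chosen option at every second; whenever no choice has been made yet, the value should default to 0. Ideally it would look like this: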
subject second choice
1 1 0
1 2 0
1 3 A
1 4 B
1 5 B
1 6 B
1 7 B
1 8 B
1 9 C
1 10 C
1 11 C
1 12 C
1 13 B
...and so on up to second = 30.
A solution based on tidyverse packages and dplyr pipes would be most welcome, but I am open to other solutions. Thanks!
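library(dplyr)
library(tidyr)
library(zoo)

df %>%
  # convert milliseconds to the second in which the choice was made
  mutate(time = ceiling(time / 1000)) %>%
  # add a row for every second from 1 to 30 (choice is NA on the new rows)
  complete(subject, time = 1:30) %>%
  group_by(subject) %>%
  # carry the last choice forward; seconds before the first choice stay NA
  mutate(choice = na.locf(choice, na.rm = FALSE))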
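The question asks for 0 rather than NA before the first choice, and for a column named second. A possible variant, offered as a sketch rather than as part of the original answer, uses only dplyr and tidyr: `fill()` replaces `na.locf()`, and `replace_na()` turns the leading NAs into "0". The `slice_max()` step is an added assumption: if a subject makes several choices within the same second, only the latest one is kept, so each second appears once. The name `result` is illustrative.

```r
library(dplyr)
library(tidyr)

# same sample data as in the question
df <- data.frame(
  subject = c(1L, 1L, 1L, 1L, 1L),
  time    = c(2204L, 3673L, 8435L, 12640L, 24031L),
  choice  = c("A", "B", "C", "B", "A"),
  stringsAsFactors = FALSE
)

result <- df %>%
  # second in which each choice was made (2204 ms falls in second 3)
  mutate(second = as.integer(ceiling(time / 1000))) %>%
  # assumption: if several choices fall in one second, keep the latest
  group_by(subject, second) %>%
  slice_max(time, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(subject, second, choice) %>%
  # one row per subject and second, NA where no choice was recorded
  complete(subject, second = 1:30) %>%
  group_by(subject) %>%
  # carry the last choice forward within each subject
  fill(choice, .direction = "down") %>%
  ungroup() %>%
  # seconds before the first choice become "0"
  mutate(choice = replace_na(choice, "0"))
```

Because choice is a character column, the default is stored as the string "0", not the number 0.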
Data
df = structure(list(subject = c(1L, 1L, 1L, 1L, 1L), time = c(2204L,
3673L, 8435L, 12640L, 24031L), choice = c("A", "B", "C", "B",
"A")), .Names = c("subject", "time", "choice"), class = "data.frame", row.names = c(NA,
-5L))