As part of a pilot survey, I presented each Turker with sets of choices among four alternatives. The data looks like this:
> so
  WorkerId pio_1_1 pio_1_2 pio_1_3 pio_1_4 pio_2_1 pio_2_2 pio_2_3 pio_2_4
1        1     Yes      No      No      No      No      No     Yes      No
2        2      No     Yes      No      No     Yes      No     Yes      No
3        3     Yes     Yes      No      No     Yes      No     Yes      No
I would like it to look like this:
WorkerId set pio1 pio2 pio3 pio4
       1   1  Yes   No   No   No
       1   2   No   No  Yes   No
...
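I can think of a few ways to do this, none of which seems very elegant.
But it seems to me that all of them ignore the fact that data in what you might call "double wide" format has a structure of its own. I would love to use the reshape2 package for this, but even though the data was produced with cast(), I don't see any option that would help me truly melt this data.frame.
Suggestions welcome.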
so <- structure(list(WorkerId = 1:3, pio_1_1 = structure(c(2L, 1L,
2L), .Label = c("No", "Yes"), class = "factor"), pio_1_2 = structure(c(1L,
2L, 2L), .Label = c("No", "Yes"), class = "factor"), pio_1_3 = structure(c(1L,
1L, 1L), .Label = c("No", "Yes"), class = "factor"), pio_1_4 = structure(c(1L,
1L, 1L), .Label = "No", class = "factor"), pio_2_1 = structure(c(1L,
2L, 2L), .Label = c("No", "Yes"), class = "factor"), pio_2_2 = structure(c(1L,
1L, 1L), .Label = c("No", "Yes"), class = "factor"), pio_2_3 = structure(c(2L,
2L, 2L), .Label = c("No", "Yes"), class = "factor"), pio_2_4 = structure(c(1L,
1L, 1L), .Label = "No", class = "factor")), .Names = c("WorkerId",
"pio_1_1", "pio_1_2", "pio_1_3", "pio_1_4", "pio_2_1", "pio_2_2",
"pio_2_3", "pio_2_4"), row.names = c(NA, 3L), class = "data.frame")
I'm not sure whether this is too obvious, but here goes. It should be self-explanatory: pass in your so data frame and it returns the reshaped data.
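library("reshape2")

reshape.middle <- function(dat) {
  # Melt to long form: one row per WorkerId and original column.
  # (Uses the argument `dat`, not the global `so`.)
  dat <- melt(dat, id.vars = "WorkerId")
  # Column names look like "pio_1_3": character 5 is the set,
  # character 7 is the item within the set.
  dat$set <- substr(dat$variable, 5, 5)
  # Build "pio1" ... "pio4" to match the requested column names.
  dat$name <- paste(substr(dat$variable, 1, 3),
                    substr(dat$variable, 7, 7),
                    sep = "")
  dat$variable <- NULL
  # Cast back to wide form: one row per WorkerId and set.
  return(dcast(dat, WorkerId + set ~ name, value.var = "value"))
}

so # initial form
so <- reshape.middle(so)
so # as needed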
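
The fixed character positions passed to substr() only work while the set and item numbers are single digits. Here is a minimal sketch of the same melt-then-cast idea that parses the column names with a regular expression instead. It assumes every measure column is named pio_<set>_<item>; so_demo is a hypothetical stand-in for so, with character columns rather than factors, and the pattern itself is an assumption about the naming scheme.

```r
library(reshape2)

# Hypothetical stand-in for `so`, holding the same values as in the question.
so_demo <- data.frame(
  WorkerId = 1:3,
  pio_1_1 = c("Yes", "No", "Yes"),
  pio_1_2 = c("No", "Yes", "Yes"),
  pio_1_3 = c("No", "No", "No"),
  pio_1_4 = c("No", "No", "No"),
  pio_2_1 = c("No", "Yes", "Yes"),
  pio_2_2 = c("No", "No", "No"),
  pio_2_3 = c("Yes", "Yes", "Yes"),
  pio_2_4 = c("No", "No", "No"),
  stringsAsFactors = FALSE
)

# Assumed naming scheme: pio_<set>_<item>, each part one or more digits.
pattern <- "^pio_([0-9]+)_([0-9]+)$"

# Step 1: melt to long form, one row per WorkerId and original column.
long <- melt(so_demo, id.vars = "WorkerId")
varname <- as.character(long$variable)

# Step 2: split the original column name into the set number and the
# target column name ("pio1" ... "pio4").
long$set <- as.integer(sub(pattern, "\\1", varname))
long$name <- sub(pattern, "pio\\2", varname)

# Step 3: cast back to wide form, one row per WorkerId and set.
wide <- dcast(long, WorkerId + set ~ name, value.var = "value")
wide
```

With this data, the first two rows of wide should match the two rows shown in the requested layout. Storing set as an integer keeps the rows sorted numerically if there are ever ten or more sets.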
Hope this helps.