Die*_*nne 3 r reshape2 dplyr magrittr
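The problem below can be seen as a "two-column reshape to wide", and there are several classic ways to solve it, from base::reshape (horrible) to reshape2. For the two-group case, a simple join of the subgroups works best.
Can I rewrite the join inside a dplyr pipe chain? The example below is a bit silly, but I need the join in the middle of a longer chain of pipes, and I don't want to break the chain.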
library(dplyr)
d <- data.frame(subject = rep(1:5, each = 2), treatment = letters[1:2], bp = rnorm(10))

# Desired shape of the pipeline (placeholder steps, not runnable as written):
# d %>%
#   <piped manipulations> %>%
#   <make wide> %>%
#   <additional piped manipulations>

# Make wide (old style), which breaks out of the pipe chain
with(d, left_join(d[treatment == "a", ],
                  d[treatment == "b", ], by = "subject"))
How about:
d %>%
  filter(treatment == "a") %>%
  left_join(., filter(d, treatment == "b"), by = "subject")
#   subject treatment.x       bp.x treatment.y      bp.y
# 1       1           a  0.4392647           b 0.6741559
# 2       2           a -0.6010311           b 1.9845774
# 3       3           a  0.1749082           b 1.7678771
# 4       4           a -0.3089731           b 0.4427471
# 5       5           a -0.8346091           b 1.7156319
You can continue the pipe directly after the left join.
Alternatively, if you don't need the separate treatment columns, you can use tidyr:
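As a rough sketch of what continuing the pipe after the join might look like: the `set.seed()` call, the `bp_diff` column, and the renamed `bp_a`/`bp_b` columns are illustrative assumptions, not part of the original question.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
d <- data.frame(subject = rep(1:5, each = 2),
                treatment = letters[1:2],
                bp = rnorm(10))

wide <- d %>%
  filter(treatment == "a") %>%
  # join the "a" rows to the "b" rows; clashing columns get .x / .y suffixes
  left_join(filter(d, treatment == "b"), by = "subject") %>%
  # hypothetical follow-up step: per-subject difference between treatments
  mutate(bp_diff = bp.y - bp.x) %>%
  # keep one row per subject with clearer (made-up) column names
  select(subject, bp_a = bp.x, bp_b = bp.y, bp_diff)
```

Because `left_join()` returns a data frame, any further dplyr verbs can be chained after it without leaving the pipe.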
library(tidyr)
d %>% spread(treatment, bp)
#   subject          a         b
# 1       1  0.4392647 0.6741559
# 2       2 -0.6010311 1.9845774
# 3       3  0.1749082 1.7678771
# 4       4 -0.3089731 0.4427471
# 5       5 -0.8346091 1.7156319
(This is equivalent to using d %>% dcast(subject ~ treatment, value.var = "bp") from the reshape2 package, as Henrik pointed out in the comments.)
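In tidyr 1.0.0 and later, `spread()` is superseded by `pivot_wider()`. The following is a minimal sketch of the same reshape with the newer function, assuming a recent tidyr is installed; the `set.seed()` call is added here only for reproducibility.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
d <- data.frame(subject = rep(1:5, each = 2),
                treatment = letters[1:2],
                bp = rnorm(10))

# one row per subject, one column per treatment level, filled with bp
wide <- d %>%
  pivot_wider(names_from = treatment, values_from = bp)
```

The result is a tibble with columns `subject`, `a`, and `b`, so it can be piped straight into further dplyr steps.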