Suppose (for simplicity) that I have a table containing some control and treatment data:
Which, Color, Response, Count
Control, Red, 2, 10
Control, Blue, 3, 20
Treatment, Red, 1, 14
Treatment, Blue, 4, 21
For each color, I want a single row containing both the control and the treatment data, i.e.:
Color, Response.Control, Count.Control, Response.Treatment, Count.Treatment
Red, 2, 10, 1, 14
Blue, 3, 20, 4, 21
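I suppose one way to do this is an inner merge of the control and treatment subsets (merging on the Color column), but is there a better way? I was thinking that the reshape package or the stack function could do it somehow, but I'm not sure.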
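For reference, the merge-based approach described above could be sketched in base R roughly as follows. The data frame name `df` and the intermediate names `control`, `treatment`, and `wide` are assumptions made for illustration; the values are copied from the table above.

```r
# Example data copied from the table in the question (the name `df` is assumed)
df <- data.frame(
  Which    = c("Control", "Control", "Treatment", "Treatment"),
  Color    = c("Red", "Blue", "Red", "Blue"),
  Response = c(2, 3, 1, 4),
  Count    = c(10, 20, 14, 21)
)

# Split into the two subsets, dropping the now-constant Which column
control   <- subset(df, Which == "Control",   select = -Which)
treatment <- subset(df, Which == "Treatment", select = -Which)

# Inner merge on Color; the suffixes label which subset each column came from
wide <- merge(control, treatment, by = "Color",
              suffixes = c(".Control", ".Treatment"))
wide
```

Note that merge() sorts the result by the key column by default, so the rows come back in alphabetical order of Color rather than in their original order.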
Bra*_*sen 19
Use the reshape package.
First, melt your data.frame:
x <- melt(df)
Then cast:
dcast(x, Color ~ Which + variable)
Depending on which version of the reshape package you are using, this is either cast() (reshape) or dcast() (reshape2).
Voilà.
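A self-contained version of the two steps above, as a sketch that assumes the reshape2 package is installed and that the question's table is stored in a data frame called `df`. The id.vars argument is spelled out here rather than left for melt() to guess, and the second dcast() call is a variation (not part of the original answer) that swaps the formula terms so that each column name starts with the variable name, as in the requested output.

```r
library(reshape2)

# Example data copied from the question (the name `df` is assumed)
df <- data.frame(
  Which    = c("Control", "Control", "Treatment", "Treatment"),
  Color    = c("Red", "Blue", "Red", "Blue"),
  Response = c(2, 3, 1, 4),
  Count    = c(10, 20, 14, 21)
)

# Long format: one row per (Which, Color, variable) combination
x <- melt(df, id.vars = c("Which", "Color"))

# Wide format as in the answer: columns are named <Which>_<variable>
wide1 <- dcast(x, Color ~ Which + variable)

# Variation: swap the terms to get columns named <variable>_<Which>
wide2 <- dcast(x, Color ~ variable + Which)
names(wide2)
```

reshape2 joins the parts of each new column name with an underscore, so the result has names such as Response_Control where the question asked for Response.Control.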
Adding some options (years later)...
The typical approach in base R would involve the reshape function (which is generally unpopular because of the multitude of arguments that take time to master). It's a pretty efficient function for smaller datasets, but doesn't always scale well.
reshape(mydf, direction = "wide", idvar = "Color", timevar = "Which")
# Color Response.Control Count.Control Response.Treatment Count.Treatment
# 1 Red 2 10 1 14
# 2 Blue 3 20 4 21
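The call above assumes that `mydf` holds the table from the question. A minimal sketch that builds it, and that also uses the sep argument of reshape() to get underscore-separated column names (an optional variation, not part of the original answer):

```r
# Example data copied from the question
mydf <- data.frame(
  Which    = c("Control", "Control", "Treatment", "Treatment"),
  Color    = c("Red", "Blue", "Red", "Blue"),
  Response = c(2, 3, 1, 4),
  Count    = c(10, 20, 14, 21)
)

# Same call as above, but joining variable and group names with "_" instead of "."
wide <- reshape(mydf, direction = "wide", idvar = "Color",
                timevar = "Which", sep = "_")

# reshape() keeps the row names of the first row seen for each id; reset them
rownames(wide) <- NULL
wide
```

Using sep = "_" makes the column names match those produced by the package-based approaches, which can help if the results are compared or combined later.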
Already covered are cast/dcast from the "reshape" and "reshape2" packages (and now dcast.data.table from "data.table", which is especially useful when you have large datasets). From the Hadleyverse there is also "tidyr", which works nicely with the "dplyr" package:
library(tidyr)
library(dplyr)
mydf %>%
  gather(var, val, Response:Count) %>%  ## make a long dataframe
  unite(RN, var, Which) %>%             ## combine the var and Which columns
  spread(RN, val)                       ## make the results wide
# Color Count_Control Count_Treatment Response_Control Response_Treatment
# 1 Blue 20 21 3 4
# 2 Red 10 14 2 1
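In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same reshaping can be sketched with pivot_wider() alone, assuming a recent tidyr is installed and `mydf` holds the question's table; this is an alternative to the pipeline above, not part of the original answer.

```r
library(tidyr)

# Example data copied from the question
mydf <- data.frame(
  Which    = c("Control", "Control", "Treatment", "Treatment"),
  Color    = c("Red", "Blue", "Red", "Blue"),
  Response = c(2, 3, 1, 4),
  Count    = c(10, 20, 14, 21)
)

# Spread both value columns across the levels of Which in a single step;
# new columns are named <value column>_<Which level>
wide <- pivot_wider(mydf, names_from = Which,
                    values_from = c(Response, Count))
wide
```

Because pivot_wider() accepts several value columns at once, the intermediate long format and the unite() step are not needed here.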
Also worth noting: in a forthcoming version of "data.table", the dcast.data.table function should be able to handle this without first having to melt your data.
The data.table implementation of dcast lets you cast multiple columns to wide format without melting first, as follows:
library(data.table)
dcast(as.data.table(mydf), Color ~ Which, value.var = c("Response", "Count"))
# Color Response_Control Response_Treatment Count_Control Count_Treatment
# 1: Blue 3 4 20 21
# 2: Red 2 1 10 14