Rearranging a data frame in R

Cap*_*rog 2 r matrix reshape dataframe

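I am trying to (efficiently) rearrange a data frame in R.

My data are experimental measurements collected in four different experiments from two groups of participants (1 or 0, i.e. disease and control).

Example data frame: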

Subject type    Experiment 1    Experiment 2    Experiment 3    Experiment 4
           0             4.6             2.5             1.4             5.3
           0             4.7             2.4             1.8             5.1
           1             3.5             1.2             5.6             7.5
           1             3.8             1.7             6.2             8.1

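I would like to rearrange my data frame so that it is structured as follows (the reason being that, when it is structured this way in R, it is easier for me to run functions on the data):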

Subject type    Experiment    Measure
           0             1        4.6
           0             2        2.5
           0             3        1.4
           0             4        5.3
           0             1        4.7
           0             2        2.4
           0             3        1.8
           0             4        5.1
           1             1        3.5
           1             2        1.2
           1             3        5.6
           1             4        7.5
           1             1        3.8
           1             2        1.7
           1             3        6.2
           1             4        8.1

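As you can see, each subject now occupies four rows; each row now corresponds to a single measurement rather than a single subject. This is (at least for now) more convenient for me to feed into R functions. Perhaps in time I will find a way to skip this step entirely, but I am new to R and this seems like the best way to do things.

Anyway, the question is: what is the most efficient way to do this kind of data frame transformation? Currently I do it like this: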

# Input dframe1
dframe1 <- structure(list(subject_type = c(0L, 0L, 1L, 1L), experiment_1 = c(4.6, 
4.7, 3.5, 3.8), experiment_2 = c(2.5, 2.4, 1.2, 1.7), experiment_3 = c(1.4, 
1.8, 5.6, 6.2), experiment_4 = c(5.3, 5.1, 7.5, 8.1)), .Names = c("subject_type", 
"experiment_1", "experiment_2", "experiment_3", "experiment_4"
), class = "data.frame", row.names = c(NA, -4L))

# Create a matrix
temporary_matrix <- matrix(ncol=3, nrow=nrow(dframe1) * 4)
colnames(temporary_matrix) <- c("subject_type","experiment","measure")

# Rearrange dframe1 so that each measure is in its own row
for(i in 1:nrow(dframe1)) {
  temporary_matrix[i*4-3,"subject_type"] <- dframe1$subject_type[i]
  temporary_matrix[i*4-3,"experiment"] <- 1
  temporary_matrix[i*4-3,"measure"] <- dframe1$experiment_1[i]
  temporary_matrix[i*4-2,"subject_type"] <- dframe1$subject_type[i]
  temporary_matrix[i*4-2,"experiment"] <- 2
  temporary_matrix[i*4-2,"measure"] <- dframe1$experiment_2[i]
  temporary_matrix[i*4-1,"subject_type"] <- dframe1$subject_type[i]
  temporary_matrix[i*4-1,"experiment"] <- 3
  temporary_matrix[i*4-1,"measure"] <- dframe1$experiment_3[i]
  temporary_matrix[i*4-0,"subject_type"] <- dframe1$subject_type[i]
  temporary_matrix[i*4-0,"experiment"] <- 4
  temporary_matrix[i*4-0,"measure"] <- dframe1$experiment_4[i]
}

# Convert matrix to a data frame
dframe2 <- data.frame(temporary_matrix)

# NOTE: For some reason, this has to be converted back into a double (at some point above it becomes a factor)
dframe2$measure <- as.double(as.character(dframe2$measure))

Surely there is a better way?!

Ric*_*rta 5

With the reshape2 package, this is very straightforward.

library(reshape2)

# assuming your data.frame is called `dat`
melt(dat, id.vars=c("Subject type"))
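The call above assumes a data frame named `dat` whose identifier column is literally called "Subject type". As a minimal sketch, here is the same `melt` call applied to the `dframe1` from the question, whose identifier column is `subject_type`. The name `long` is only an illustration, and `dframe1` is rebuilt with `data.frame()` so the snippet runs on its own.

```r
library(reshape2)

# The question's data, rebuilt with data.frame() so the example is self-contained
dframe1 <- data.frame(
  subject_type = c(0L, 0L, 1L, 1L),
  experiment_1 = c(4.6, 4.7, 3.5, 3.8),
  experiment_2 = c(2.5, 2.4, 1.2, 1.7),
  experiment_3 = c(1.4, 1.8, 5.6, 6.2),
  experiment_4 = c(5.3, 5.1, 7.5, 8.1)
)

# Keep subject_type as the identifier; every other column is stacked
# into a pair of columns named "variable" and "value" by default
long <- melt(dframe1, id.vars = "subject_type")

# Inspect the structure of the result
str(long)
```

Note that `melt` stacks the measured columns one after another, so the rows come out grouped by experiment rather than by subject as in the layout shown in the question.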

If you like, you can take it a step further and name the new columns:

newdat <- melt(dat, id.vars=c("Subject type"), variable.name="Experiment", value.name="Measure")

# remove "experiment " from the names, and convert to numeric
newdat$Experiment <- as.numeric(gsub("Experiment\\s*", "", as.character(newdat$Experiment)))
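If installing a package is not an option, the same long layout can probably be built in base R as well. This is only a sketch, not the approach used in this answer. It assumes the question's `dframe1` with columns named `experiment_1` to `experiment_4`, and the helper names `experiment_cols` and `n_exp` are made up for illustration. It replaces the question's `for` loop with vectorised `rep()` calls and a transposed matrix, and it keeps the rows grouped by subject as in the desired output.

```r
# The question's data, rebuilt with data.frame() so the example is self-contained
dframe1 <- data.frame(
  subject_type = c(0L, 0L, 1L, 1L),
  experiment_1 = c(4.6, 4.7, 3.5, 3.8),
  experiment_2 = c(2.5, 2.4, 1.2, 1.7),
  experiment_3 = c(1.4, 1.8, 5.6, 6.2),
  experiment_4 = c(5.3, 5.1, 7.5, 8.1)
)

# Pick out the measurement columns by name (assumed prefix "experiment_")
experiment_cols <- grep("^experiment_", names(dframe1), value = TRUE)
n_exp <- length(experiment_cols)

dframe2 <- data.frame(
  # Repeat each subject's type once per experiment
  subject_type = rep(dframe1$subject_type, each = n_exp),
  # Cycle through the experiment numbers 1..n_exp for every subject
  experiment   = rep(seq_len(n_exp), times = nrow(dframe1)),
  # t() puts each subject in its own column, so as.vector() reads the
  # values subject by subject, experiment by experiment
  measure      = as.vector(t(as.matrix(dframe1[experiment_cols])))
)

print(dframe2)
```

Because the values never pass through a character or factor column, `measure` stays numeric and no conversion back to double is needed.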