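I have a large data frame that contains 12 monthly columns for each of two types of value: Rested and Active. I want to convert the monthly columns into rows, so that all the month columns (Jan, Feb, Mar ...) end up under a single "Month" column.
My data looks like this: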
ID L1 L2 Year JR FR MR AR MYR JR JLR AGR SR OR NR DR JA FA MA AA MYA JA JLA AGA SA OA NA DA
1234 89 65 2003 11 34 6 7 8 90 65 54 3 22 55 66 76 86 30 76 43 67 13 98 67 0 127 74
1234 45 76 2004 67 87 98 5 4 3 77 8 99 76 56 4 3 2 65 78 44 53 67 98 79 53 23 65
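I am trying to get it to look like the following (column R stands for Rested and column A stands for Active. The monthly columns JR, FR, MR stand for Jan Rested, Feb Rested, Mar Rested, and JA, FA, MA stand for Jan Active, Feb Active, Mar Active, and so on):
So what I am trying to do is create a new Month column, turn each monthly column into a row, and place the R and A values side by side.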
ID L1 L2 Year Month R A
1234 89 65 2003 Jan 11 76
1234 89 65 2003 Feb 34 86
1234 89 65 2003 Mar 6 30
1234 89 65 2003 Apr 7 76
1234 89 65 2003 May 8 43
1234 89 65 2003 Jun 90 67
1234 89 65 2003 Jul 65 13
1234 89 65 2003 Aug 54 98
1234 89 65 2003 Sep 3 67
1234 89 65 2003 Oct 22 0
1234 89 65 2003 Nov 55 127
1234 89 65 2003 Dec 66 74
1234 45 76 2004 Jan 67 3
1234 45 76 2004 Feb 87 2
1234 45 76 2004 Mar 98 65
1234 45 76 2004 Apr 5 78
1234 45 76 2004 May 4 44
1234 45 76 2004 Jun 3 53
1234 45 76 2004 Jul 77 67
1234 45 76 2004 Aug 8 98
1234 45 76 2004 Sep 99 79
1234 45 76 2004 Oct 76 53
1234 45 76 2004 Nov 56 23
1234 45 76 2004 Dec 4 65
I have tried various things such as stack, melt, and unlist:
data_reshape <- reshape(df,direction="long", varying=list(c("JR", "FR", "MR", "AR", "MYR", "JR", "JLR", "AGR", "SR", "OR", "NR", "DR", "JA", "FA","MA", "AA", "MYA", "JA", "JLA","AGA", "SA", "OA","NA", "DA")), v.names="Precipitation", timevar="Month")
data_stacked <- stack(data, select = c("JR", "FR", "MR", "AR", "MYR", "JR", "JLR", "AGR", "SR", "OR", "NR", "DR", "JA", "FA","MA", "AA", "MYA", "JA", "JLA","AGA", "SA", "OA","NA", "DA"))
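But the results are not what I expected: they give the Jan values for all years, then the Feb values for all years, then the Mar values for all years, and so on. I want the rows arranged month by month within each year, for every ID present in the dataset.
How can I achieve this in R?
Here is a possible solution using the devel version of data.table: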
library(data.table) ## v >= 1.9.5
res <- melt(setDT(df),
            id.vars = 1:4, ## id variables
            measure.vars = list(5:16, 17:ncol(df)), ## a list of two groups of measure variables
            variable.name = "Month", ## the name of the additional variable
            value.name = c("R", "A")) ## the names of the grouped value columns
setorder(res, ID, -L1, L2, Year) ## reorder the data to match the desired output
res[, Month := month.abb[Month]] ## optional, as Month already holds the month numbers
# ID L1 L2 Year Month R A
# 1: 1234 89 65 2003 Jan 11 76
# 2: 1234 89 65 2003 Feb 34 86
# 3: 1234 89 65 2003 Mar 6 30
# 4: 1234 89 65 2003 Apr 7 76
# 5: 1234 89 65 2003 May 8 43
# 6: 1234 89 65 2003 Jun 90 67
# 7: 1234 89 65 2003 Jul 65 13
# 8: 1234 89 65 2003 Aug 54 98
# 9: 1234 89 65 2003 Sep 3 67
# 10: 1234 89 65 2003 Oct 22 0
# 11: 1234 89 65 2003 Nov 55 127
# 12: 1234 89 65 2003 Dec 66 74
# 13: 1234 45 76 2004 Jan 67 3
# 14: 1234 45 76 2004 Feb 87 2
# 15: 1234 45 76 2004 Mar 98 65
# 16: 1234 45 76 2004 Apr 5 78
# 17: 1234 45 76 2004 May 4 44
# 18: 1234 45 76 2004 Jun 3 53
# 19: 1234 45 76 2004 Jul 77 67
# 20: 1234 45 76 2004 Aug 8 98
# 21: 1234 45 76 2004 Sep 99 79
# 22: 1234 45 76 2004 Oct 76 53
# 23: 1234 45 76 2004 Nov 56 23
# 24: 1234 45 76 2004 Dec 4 65
Installation instructions:
library(devtools)
install_github("Rdatatable/data.table", build_vignettes = FALSE)
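As a self-contained variation on the melt call above, the two groups of measure columns can also be selected by name rather than by position, using data.table's patterns() helper. This is only a sketch under stated assumptions: the column names Jan_R ... Dec_R and Jan_A ... Dec_A are hypothetical replacements for the originals (JR and JA each appear twice in the question's header, once for Jan and once for Jun), and the rows are sorted by ID and Year rather than by the L1/L2 trick used above.

```r
library(data.table)

# The two sample rows from the question, read without a header.
df <- read.table(text = "
1234 89 65 2003 11 34 6 7 8 90 65 54 3 22 55 66 76 86 30 76 43 67 13 98 67 0 127 74
1234 45 76 2004 67 87 98 5 4 3 77 8 99 76 56 4 3 2 65 78 44 53 67 98 79 53 23 65
", header = FALSE)

# Hypothetical unique names (assumption, not the original headers).
names(df) <- c("ID", "L1", "L2", "Year",
               paste0(month.abb, "_R"), paste0(month.abb, "_A"))

# Melt both groups at once: one value column per pattern.
long <- melt(as.data.table(df),
             id.vars = c("ID", "L1", "L2", "Year"),
             measure.vars = patterns("_R$", "_A$"),
             variable.name = "Month",
             value.name = c("R", "A"))

# Month comes back as a factor indexing the position within each group.
long[, MonthNum := as.integer(Month)]
setorder(long, ID, Year, MonthNum)       # month by month within each year
long[, Month := month.abb[MonthNum]]     # 1..12 -> "Jan".."Dec"
long[, MonthNum := NULL]

print(long)
```

Selecting by pattern avoids hard-coding 5:16 and 17:ncol(df), which matters if the column order of the real data differs from the sample.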
Here is a base R reshape approach:
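# Columns 5:16 hold the Rested values and 17:28 the Active values
res <- reshape(mydf, direction = "long", varying = list(5:16, 17:28), v.names = c("R", "A"), times = month.abb, timevar = "Month")
# Sort back into per-year blocks and drop the auto-generated id column (column 8)
res[with(res, order(ID, -L1, L2, Year)), -8]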
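The base reshape approach can be checked end to end on the two sample rows from the question. The following is a minimal sketch, not the answer's exact code: it assumes hypothetical unique column names (Jan_R ... Dec_R, Jan_A ... Dec_A) in place of the original headers, where JR and JA are each used for both Jan and Jun, it sorts by ID, Year, and month, and it drops the helper id column by name instead of by position.

```r
# The two sample rows from the question, read without a header.
wide <- read.table(text = "
1234 89 65 2003 11 34 6 7 8 90 65 54 3 22 55 66 76 86 30 76 43 67 13 98 67 0 127 74
1234 45 76 2004 67 87 98 5 4 3 77 8 99 76 56 4 3 2 65 78 44 53 67 98 79 53 23 65
", header = FALSE)

# Hypothetical unique names (assumption, not the original headers).
names(wide) <- c("ID", "L1", "L2", "Year",
                 paste0(month.abb, "_R"), paste0(month.abb, "_A"))

# Two groups of varying columns become the two value columns R and A.
long <- reshape(wide, direction = "long",
                varying = list(5:16, 17:28),
                v.names = c("R", "A"),
                timevar = "Month", times = month.abb)

# reshape() stacks all Jan rows, then all Feb rows, and so on;
# reorder so that each year appears as one block of 12 months.
long <- long[order(long$ID, long$Year, match(long$Month, month.abb)), ]
long$id <- NULL          # helper column added by reshape()
rownames(long) <- NULL

print(long)
```

The explicit order() call is what addresses the complaint in the question: the stacked result is correct, it is only grouped by month first instead of by year.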