Aggregating and restructuring hourly time series data in R

avg*_*avg 6 r time-series

I have a year's worth of hourly data in a data frame in R:

> str(df.MHwind_load)   # compactly displays structure of data frame
'data.frame':   8760 obs. of  6 variables:
 $ Date         : Factor w/ 365 levels "2010-04-01","2010-04-02",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Time..HRs.   : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Hour.of.Year : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Wind.MW      : int  375 492 483 476 486 512 421 396 456 453 ...
 $ MSEDCL.Demand: int  13293 13140 12806 12891 13113 13802 14186 14104 14117 14462 ...
 $ Net.Load     : int  12918 12648 12323 12415 12627 13290 13765 13708 13661 14009 ...

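While preserving the hourly structure, I would like to know how to extract:

  1. a particular month or group of months
  2. the first day, first week, etc. of each month
  3. all Mondays, all Tuesdays, etc. of the year

I tried using "cut" without success and, after looking online, I think "lubridate" might be able to do this, but I have not found suitable examples. I would greatly appreciate help with this.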

Edit: a sample of the data in the data frame is shown below:

  Date Hour.of.Year  Wind.MW  datetime
1  2010-04-01  1  375  2010-04-01  00:00:00
2  2010-04-01  2  492  2010-04-01  01:00:00
3  2010-04-01  3  483  2010-04-01  02:00:00
4  2010-04-01  4  476  2010-04-01  03:00:00
5  2010-04-01  5  486  2010-04-01  04:00:00
6  2010-04-01  6  512  2010-04-01  05:00:00
7  2010-04-01  7  421  2010-04-01  06:00:00
8  2010-04-01  8  396  2010-04-01  07:00:00
9  2010-04-01  9  456  2010-04-01  08:00:00
10  2010-04-01  10  453  2010-04-01  09:00:00
..  ..  ...  ..........  ........
8758  2011-03-31  8758  302  2011-03-31  21:00:00
8759  2011-03-31  8759  378  2011-03-31  22:00:00
8760  2011-03-31  8760  356  2011-03-31  23:00:00

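Edit: additional time-based operations I would like to perform on the same dataset:

  1. Hourly averages over all data points, i.e. the mean of all values in the first hour of each day of the year, and so on for each hour. The output would be an "hourly profile" of the entire year (24 time points).
  2. The same for each week and each month, i.e. 52 and 12 hourly profiles respectively.
  3. Seasonal averages, for example June to September.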

mpi*_*tas 6

Convert the time to a format that lubridate understands, then use the functions month, mday and wday respectively.
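As a minimal sketch of that conversion step, the snippet below builds a proper date-time column from a factor `Date` column and an hour counter. The two-row data frame is a hypothetical stand-in for the question's data, and the assumption that hour 1 corresponds to 00:00 is taken from the sample rows in the question.

```r
library(lubridate)

# Hypothetical stand-in for the question's data frame (Date stored as a factor)
df <- data.frame(Date = factor(c("2010-04-01", "2010-04-01")),
                 Time..HRs. = c(1L, 2L))

# Factor -> character -> POSIXct at midnight UTC, then add the hour offset.
# Assumption: hour 1 means 00:00, as in the question's sample datetime column.
df$datetime <- ymd(as.character(df$Date), tz = "UTC") + hours(df$Time..HRs. - 1L)

# The lubridate accessors now work on the new column
month(df$datetime)  # month number
hour(df$datetime)   # hour of day, 0..23
```

Once a column like `datetime` exists, the accessors can be used inside `subset()` without touching the hourly rows themselves.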

Assuming you have a data.frame with the time stored in a column named Date, the answers to your questions would be:

 ###dummy data.frame
 library(lubridate)
 df <- data.frame(Date=as.Date(c("2012-01-01","2012-02-15","2012-03-01","2012-04-01")),a=1:4) 
 ##1. Select rows for particular month
 subset(df,month(Date)==1)

 ##2a. Select the first day of each month
 subset(df,mday(Date)==1)

 ##2b. Select the first week of each month
 ##get the week numbers which have the first day of the month
 wkd <- subset(week(df$Date),mday(df$Date)==1)
 ##select the weeks with particular numbers
 subset(df,week(Date) %in% wkd)     

 ##3. Select all mondays (wday() numbers Sunday as 1 by default, so Monday is 2)
 subset(df,wday(Date)==2)
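The same accessors can also serve as grouping keys for the hourly, monthly and seasonal profiles mentioned in the question's second edit. What follows is only a sketch under stated assumptions: the data frame is a synthetic stand-in (one year of hourly timestamps with made-up `Wind.MW` values), and base R's `aggregate()` is one possible way to compute the means, not necessarily the approach the answer's author had in mind.

```r
library(lubridate)

# Synthetic stand-in: 8760 hourly timestamps from 2010-04-01, made-up wind values
datetime <- seq(ymd_hms("2010-04-01 00:00:00"), by = "hour", length.out = 8760)
df <- data.frame(datetime = datetime, Wind.MW = 100 + hour(datetime))

df$hour  <- hour(df$datetime)    # hour of day, 0..23
df$month <- month(df$datetime)   # month number, 1..12

# 1. Hourly profile of the whole year: one mean per hour of day (24 rows)
hourly.profile <- aggregate(Wind.MW ~ hour, data = df, FUN = mean)

# 2. One hourly profile per month (12 x 24 rows)
monthly.profile <- aggregate(Wind.MW ~ hour + month, data = df, FUN = mean)

# 3. Seasonal hourly profile, here June to September
seasonal.profile <- aggregate(Wind.MW ~ hour,
                              data = subset(df, month %in% 6:9),
                              FUN = mean)
```

A per-week profile could be built the same way with `week(df$datetime)` as an extra grouping column; note that lubridate's `week()` can return 53 for the last day or two of a year.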


con*_*ior 6

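  1. First switch to a Date representation: as.Date(df.MHwind_load$Date)
  2. Then call weekdays on the date vector to get a new character vector labelled with the day of the week
  3. Then call months on the date vector to get a new character vector labelled with the month name
  4. (Optional) Create a years variable (see below).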

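Now subset the data frame using the relevant combinations of these. Step 2 gives you the answer to your task 3. Steps 3 and 4 get you to task 1. Task 2 might require a line or two of R. Alternatively, select the rows corresponding to, say, all the Mondays in a month and call unique, or its counterpart duplicated, on the results.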

To get you going...

newdf <- df.MHwind_load ## build an augmented data set
newdf$d <- as.Date(newdf$Date)
newdf$month <- months(newdf$d)
newdf$day <- weekdays(newdf$d)

## for some reason R has no years function.  Here's one
years <- function(x){ format(as.Date(x), format = "%Y") }

newdf$year <- years(newdf$d)

# get observations from January to March of every year
subset(newdf, month %in% c('January', 'February', 'March'))

# get all Monday observations
subset(newdf, day == 'Monday')

# get all Mondays in 2010 (the sample data runs from 2010-04-01 to 2011-03-31)
subset(newdf, day == 'Monday' & year == '2010')

# slightly fancier: _first_ Monday of each month
# flag the first row of every month/weekday combination
# (with hourly data this keeps only the first hour of that day)
first.week.of.month <- !duplicated(cbind(newdf$month, newdf$day)) 
# now pull out the mondays
subset(newdf, first.week.of.month & day=='Monday')
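For task 2 with hourly data, where all 24 rows of a selected day should be kept, a day-of-month column may be simpler than duplicated. The sketch below rests on several assumptions: the data frame is a hypothetical two-month stand-in, "first week" is taken to mean days 1 to 7 of the month, and the weekday is read from as.POSIXlt (0 = Sunday) so that the result does not depend on the locale's day names.

```r
# Hypothetical stand-in: hourly rows for April and May 2010
days <- seq(as.Date("2010-04-01"), as.Date("2010-05-31"), by = "day")
newdf <- data.frame(d = rep(days, each = 24),
                    hour = rep(1:24, times = length(days)))

newdf$mday <- as.integer(format(newdf$d, "%d"))  # day of month, 1..31
newdf$wnum <- as.POSIXlt(newdf$d)$wday           # 0 = Sunday, 1 = Monday, ...

# First day of each month: all 24 hourly rows of day 1
first.day <- subset(newdf, mday == 1)

# First week of each month, assumed here to mean days 1..7
first.week <- subset(newdf, mday <= 7)

# First Monday of each month: the only Monday that falls within days 1..7
first.monday <- subset(newdf, mday <= 7 & wnum == 1)
```

Because the filter works on day-level columns, every hourly row of a matching day is returned, which preserves the hourly structure the question asks for.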