Determining the season from a date using lubridate in R

Rat*_*nil 9 r posixct lubridate

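I have a very large dataset with a column DateTime that contains POSIXct values. I need to determine the season (winter or summer) from the DateTime column. I wrote a function that works fine on a small dataset but crashes when I use it on the large one. Can anyone spot my mistake?

I created 4 functions:

  • 3 helper functions, so that I can use the *apply functions for logical comparison and selection
  • 1 function that determines the season

Here are the functions: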

require(lubridate)

# function for logical comparison (to be used in *apply)
greaterOrEqual <- function(x,y){
  ifelse(x >= y,T,F)
}

# function for logical comparison (to be used in *apply)
less <- function(x,y){
  ifelse(x < y,T,F)
}

# function for logical comparison (to be used in *apply)
selFromLogic <- function(VecLogic,VecValue){
  VecValue[VecLogic]
}

# Main Function to determine the season
getTwoSeasons <- function(input.date) {
  Winter1Start <- as.POSIXct("2000-01-01 00:00:00", tz = "UTC")
  Winter1End <- as.POSIXct("2000-04-15 23:59:59", tz = "UTC")

  SummerStart <- Winter1End + 1
  SummerEnd <- as.POSIXct("2000-10-15 23:59:59", tz = "UTC")

  Winter2Start <- SummerEnd + 1
  Winter2End <- as.POSIXct("2000-12-31 00:00:00", tz = "UTC")

  year(input.date) <- year(Winter1Start)
  attr(input.date, "tzone") <- attr(Winter1Start, "tzone")

  SeasonStart <- c(Winter1Start,SummerStart,Winter2Start)
  SeasonsEnd <- c(Winter1End,SummerEnd,Winter2End)
  Season_names <- as.factor(c("WinterHalfYear","SummerHalfYear","WinterHalfYear"))

  Season_select <- sapply(SeasonStart, greaterOrEqual, x = input.date) & sapply(SeasonsEnd, less, x = input.date)
  Season_return <- apply(Season_select,MARGIN = 1,selFromLogic,VecValue = Season_names)

  return(Season_return)
}

Here is how to test the function:

dates <- Sys.time() + seq(0,10000,10)
getTwoSeasons(dates)

I would appreciate any help, this is driving me crazy!

Jim*_* G. 10

If you would rather get four seasons back, use the following code:

library(lubridate)
getSeason <- function(input.date){
  # encode each date as the number MMDD, e.g. March 19 becomes 319
  numeric.date <- 100*month(input.date)+day(input.date)
  ## upper limit of each season in MMDD form, passed via the "breaks =" argument;
  ## cut() builds right-closed intervals, so 319 still counts as Winter
  cuts <- base::cut(numeric.date, breaks = c(0,319,620,921,1220,1231))
  # rename the resulting groups (could have been done within cut(..., labels = ) if "Winter" did not appear twice)
  levels(cuts) <- c("Winter","Spring","Summer","Fall","Winter")
  return(cuts)
}

Unit test:

getSeason(as.POSIXct("2016-01-01 12:00:00")+(0:365)*(60*60*24))
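To make the interval boundaries easier to inspect, here is a minimal sketch of the same MMDD trick in base R only. The name get_season_base and the use of format(x, "%m%d") in place of lubridate's month() and day() are assumptions made for illustration, not part of the answer above. It prints the season for the dates on either side of each break.

```r
# Hypothetical base R variant of the MMDD approach (no lubridate needed).
get_season_base <- function(input.date) {
  # "0319" -> 319, "1231" -> 1231
  mmdd <- as.integer(format(input.date, "%m%d"))
  # right-closed intervals: (0,319], (319,620], (620,921], (921,1220], (1220,1231]
  cuts <- cut(mmdd, breaks = c(0, 319, 620, 921, 1220, 1231))
  # duplicated level names are merged, so both Winter intervals share one level
  levels(cuts) <- c("Winter", "Spring", "Summer", "Fall", "Winter")
  cuts
}

# Dates on either side of each break, plus the last day of the year
boundary <- as.Date(c("2016-03-19", "2016-03-20", "2016-06-20", "2016-06-21",
                      "2016-09-21", "2016-09-22", "2016-12-20", "2016-12-21",
                      "2016-12-31"))
print(data.frame(date = boundary, season = get_season_base(boundary)))
```

Since the breaks are upper limits, March 19 is still Winter and March 20 is the first day of Spring; adjust the break values if your season definition differs.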


Oma*_*sow 5

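For completeness, it is worth noting that lubridate now has a quarter function (and a semester function). quarter splits the year into four parts and semester into two: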

library(lubridate)

quarter(x, with_year = FALSE, fiscal_start = 1)
semester(x, with_year = FALSE)

For more information, see: https://www.rdocumentation.org/packages/lubridate/versions/1.7.4/topics/quarter
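As a rough illustration of what these two functions return, here is a small sketch on a few made-up dates (the dates vector is an assumption chosen for the example). It calls both functions with their default arguments only.

```r
library(lubridate)

# Hypothetical sample dates, one in each calendar quarter
dates <- as.Date(c("2016-01-15", "2016-04-16", "2016-07-01", "2016-10-16"))

quarter(dates)   # calendar quarters: 1 2 3 4
semester(dates)  # calendar halves:   1 1 2 2
```

Note that semester splits the year at the end of June, so it will not reproduce a winter/summer half-year that switches on April 16 and October 16 as in the question.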


Rat*_*nil 3

I wrapped @Lars Arne Jordanger's more elegant approach in a function:

library(lubridate)

getTwoSeasons <- function(input.date){
  # encode each date as the number MMDD, e.g. April 15 becomes 415
  numeric.date <- 100*month(input.date)+day(input.date)
  ## upper limit of each season in MMDD form, passed via the "breaks =" argument:
  cuts <- base::cut(numeric.date, breaks = c(0,415,1015,1231))
  # rename the resulting groups (could have been done within cut(..., labels = ) if "Winter" did not appear twice)
  levels(cuts) <- c("Winter", "Summer","Winter")
  return(cuts)
}

Testing it on some sample data seems to work well:

getTwoSeasons(as.POSIXct("2016-01-01 12:00:00")+(0:365)*(60*60*24))
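Since the original problem was a DateTime column in a data frame, here is a minimal sketch of applying the function to such a column. The data frame df and its timestamps are made up for illustration; they sit right at the season boundaries, so the result can be compared against the limits used in the question.

```r
library(lubridate)

getTwoSeasons <- function(input.date) {
  numeric.date <- 100 * month(input.date) + day(input.date)
  cuts <- cut(numeric.date, breaks = c(0, 415, 1015, 1231))
  levels(cuts) <- c("Winter", "Summer", "Winter")
  cuts
}

# Hypothetical data frame with timestamps at the season boundaries
df <- data.frame(
  DateTime = as.POSIXct(c("2016-04-15 23:59:59", "2016-04-16 00:00:00",
                          "2016-10-15 23:59:59", "2016-10-16 00:00:00",
                          "2016-12-31 12:00:00"), tz = "UTC")
)

# The function is vectorised, so the whole column is classified in one call
df$Season <- getTwoSeasons(df$DateTime)
print(df)
```

The function works on whole vectors, so no *apply call is needed, and every day of the year including December 31 falls into one of the intervals.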