I have a "dates" vector containing dates in mm/dd/yyyy format:
head(Entered_Date,5)
[1] 1/5/1998 1/5/1998 1/5/1998 1/5/1998 1/5/1998
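I am trying to plot a frequency variable against the date, but I want to group the dates by month or year. Right now there is one frequency per day, but I want to plot the frequency by month or year. So instead of a frequency of 1 for 1/5/1998, 1 for 1/7/1998, and 3 for 1/8/1998, I would like to display it as 5 for 1/1998. This is a relatively large data set, with dates from 1998 to the present, and I would like to find an automated way to accomplish this.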
> dput(head(Entered_Date))
structure(c(260L, 260L, 260L, 260L, 260L, 260L), .Label = c("1/1/1998",
"1/1/1999", "1/1/2001", "1/1/2002", "1/10/2000", "1/10/2001",
"1/10/2002", "1/10/2003", "1/10/2005", "1/10/2006", "1/10/2007",
"1/10/2008", "1/10/2011", "1/10/2012", "1/10/2013", "1/11/1999",
"1/11/2000", "1/11/2001", "1/11/2002", "1/11/2005", "1/11/2006",
"1/11/2008", "1/11/2010", "1/11/2011", "1/11/2012", "1/11/2013",
"1/12/1998", "1/12/1999", "1/12/2001", "1/12/2004", "1/12/2005", ...
cde*_*man 24
Here is an example using dplyr. You simply use the corresponding date format string, %m for month or %Y for year, in the format statement.
set.seed(123)
df <- data.frame(date = seq.Date(from =as.Date("01/01/1998", "%d/%m/%Y"),
to=as.Date("01/01/2000", "%d/%m/%Y"), by="day"),
value = sample(seq(5), 731, replace = TRUE))
head(df)
date value
1 1998-01-01 2
2 1998-01-02 4
3 1998-01-03 3
4 1998-01-04 5
5 1998-01-05 5
6 1998-01-06 1
library(dplyr)
df %>%
mutate(month = format(date, "%m"), year = format(date, "%Y")) %>%
group_by(month, year) %>%
summarise(total = sum(value))
Source: local data frame [25 x 3]
Groups: month [?]
month year total
(chr) (chr) (int)
1 01 1998 105
2 01 1999 91
3 01 2000 3
4 02 1998 74
5 02 1999 77
6 03 1998 96
7 03 1999 86
8 04 1998 91
9 04 1999 95
10 05 1998 93
.. ... ... ...
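For the question's own data, which is a factor of mm/dd/yyyy strings rather than a Date column, the same format() idea can be sketched in base R. The sample values below are made up for illustration, and the names entered, dates, and monthly are not from the original post.

```r
# Made-up sample mirroring the question: a factor of mm/dd/yyyy strings
entered <- factor(c("1/5/1998", "1/7/1998", "1/8/1998", "1/8/1998",
                    "1/8/1998", "2/3/1998"))

# Convert the factor to character, then parse it with the mm/dd/yyyy format
dates <- as.Date(as.character(entered), format = "%m/%d/%Y")

# Count entries per year-month; use "%Y" instead to count per year
monthly <- table(format(dates, "%Y-%m"))
monthly
```

Because the labels are formatted as year-month, they sort chronologically, and the resulting table can be passed to barplot() or converted with as.data.frame() for plotting.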
小智 12
lubridate's floor_date does this nicely.
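library(dplyr)
library(lubridate)
# `data` is assumed to have a Date column `date` and a numeric column `value`
data %>%
  group_by(month = floor_date(date, "month")) %>%
  summarize(summary_variable = sum(value))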
Credit to Roman Cheplyaka:
https://ro-che.info/articles/2017-02-22-group_by_month_r
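As a rough illustration of why floor_date is convenient here, the sketch below counts entries per month and per year on a small made-up data set. The names entries, by_month, by_year, and n are assumptions for the example, not part of the original answer.

```r
library(dplyr)
library(lubridate)

# Hypothetical daily data: one row per entry, as in the question
entries <- data.frame(
  date = as.Date(c("1998-01-05", "1998-01-07", "1998-01-08",
                   "1998-01-08", "1998-01-08", "1998-02-03"))
)

# floor_date() maps each date to the first day of its month (still a Date),
# so the result sorts chronologically and can go straight onto a date axis
by_month <- entries %>%
  group_by(month = floor_date(date, "month")) %>%
  summarise(n = n())

# Use "year" as the unit to group by year instead
by_year <- entries %>%
  group_by(year = floor_date(date, "year")) %>%
  summarise(n = n())
```

Since the grouping key stays a Date, no separate year column is needed to keep January 1998 apart from January 1999.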
Just to add to @cdeterman's answer, you can use lubridate with dplyr to make this even easier:
df <- data.frame(date = seq.Date(from =as.Date("01/01/1998", "%d/%m/%Y"),
to=as.Date("01/01/2000", "%d/%m/%Y"), by="day"),
value = sample(seq(5), 731, replace = TRUE))
library(dplyr)
library(lubridate)
df %>%
mutate(month = month(date), year = year(date)) %>%
group_by(month, year) %>%
summarise(total = sum(value))
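One thing to watch with group_by(month, year) is that the summary comes back sorted by month first, so every January appears before any February. A minimal sketch, assuming dplyr 1.0 or later for the .groups argument, that groups by year first so the rows come out in chronological order (the name monthly is illustrative):

```r
library(dplyr)
library(lubridate)

set.seed(123)
df <- data.frame(date = seq.Date(as.Date("1998-01-01"),
                                 as.Date("2000-01-01"), by = "day"))
df$value <- sample(seq(5), nrow(df), replace = TRUE)

# Group by year, then month, so rows are ordered 1998-01, 1998-02, ...
monthly <- df %>%
  group_by(year = year(date), month = month(date)) %>%
  summarise(total = sum(value), .groups = "drop")

head(monthly)
```

The sampled values, and therefore the totals, depend on the R version's random number generator, so they may not match the output shown in the earlier answer.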