I have a data frame dat about a used-car dealer's sales (Buy=0 in the data frame) and purchases (Buy=1 in the data frame).
Date Buy Price
29-06-2015 1 5000
29-06-2015 0 8000
29-06-2015 1 10000
30-06-2015 0 3500
30-06-2015 0 12000
... ... ...
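What I need is a new, aggregated data.frame that gives me the number of buys and the number of sells per day, along with the total price of all buys and of all sells on that day: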
Date Buys Sells Price_Buys Price_Sells
29-06-2015 2 1 15000 8000
30-06-2015 0 2 0 15500
... ... ...
I tried using aggregate(dat$Buy, by=list(Date=dat$Date, FUN=sum)). However, I am still struggling with how to aggregate the sells.
This can be done very cleanly with dplyr: group by date with group_by, then compute the per-day values with summarize:
library(dplyr)
(out <- dat %>%
   group_by(Date) %>%
   summarize(Buys = sum(Buy == 1), Sells = sum(Buy == 0),
             Price_Buys = sum(Price[Buy == 1]), Price_Sells = sum(Price[Buy == 0])))
# Date Buys Sells Price_Buys Price_Sells
# (fctr) (int) (int) (int) (int)
# 1 29-06-2015 2 1 15000 8000
# 2 30-06-2015 0 2 0 15500
You can now manipulate this object just as you would a normal data frame, for example:
out$newvar <- with(out, Sells*Price_Sells - Buys*Price_Buys)
out
# Source: local data frame [2 x 6]
# Date Buys Sells Price_Buys Price_Sells newvar
# (fctr) (int) (int) (int) (int) (int)
# 1 29-06-2015 2 1 15000 8000 -22000
# 2 30-06-2015 0 2 0 15500 31000
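Since the question started from aggregate, here is a minimal base R sketch of the same summary, in case dplyr is not available. It rebuilds dat from the five sample rows shown in the question; the names flags and out_base are made up for this illustration. The idea is to first split each row into buy-side and sell-side columns, so that a single per-date sum yields both the counts and the price totals.

```r
# Sample rows copied from the question (the real dat has more rows).
dat <- data.frame(
  Date  = c("29-06-2015", "29-06-2015", "29-06-2015", "30-06-2015", "30-06-2015"),
  Buy   = c(1, 0, 1, 0, 0),
  Price = c(5000, 8000, 10000, 3500, 12000)
)

# Hypothetical helper frame: one 0/1 indicator per side, and the price
# counted only on the matching side (0 on the other side).
flags <- data.frame(
  Date        = dat$Date,
  Buys        = as.integer(dat$Buy == 1),
  Sells       = as.integer(dat$Buy == 0),
  Price_Buys  = dat$Price * (dat$Buy == 1),
  Price_Sells = dat$Price * (dat$Buy == 0)
)

# Formula interface of aggregate(): sum every remaining column within each Date.
out_base <- aggregate(. ~ Date, data = flags, FUN = sum)
out_base
#         Date Buys Sells Price_Buys Price_Sells
# 1 29-06-2015    2     1      15000        8000
# 2 30-06-2015    0     2          0       15500
```

Note that Date is grouped and ordered as text here, so rows from different months would not necessarily appear in chronological order; converting with as.Date(dat$Date, format = "%d-%m-%Y") first would be one way to address that.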