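I have an order database for an online shopping platform.
The table I am working with looks like the one below, where each row corresponds to one customer/item/date combination.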
OrderHistory <- data.frame(date=c("2015-02-01", "2015-03-01", "2015-04-01", "2015-03-01", "2015-04-01", "2015-05-01", "2015-05-01"),
customer=c("A","A","A","B","B","B","B"),
item=c("Candy", "Coffee", "Coffee", "Candy", "Candy", "Candy", "Coffee" ))
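What I want is a running count of how many times each customer has ordered a particular item, so that I can analyze which items are ordered repeatedly by the same customer and which are ordered once and never again.
The output would look like this: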
out <- data.frame(date=c("2015-02-01", "2015-03-01", "2015-04-01", "2015-03-01", "2015-04-01", "2015-05-01", "2015-05-01"),
customer=c("A","A","A","B","B","B","B"),
item=c("Candy", "Coffee", "Coffee", "Candy", "Candy", "Candy", "Coffee" ),
count=c(1,1,2,1,2,3,1))
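I would prefer a dplyr solution, but I am open to any suggestions! The exact items on the platform change constantly, so the solution has to be dynamic rather than rely on a hard-coded list of items.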
cde*_*man 14
I believe this should give you what you want:
library(dplyr)

OrderHistory %>%
  group_by(customer, item) %>%   # one group per customer/item pair
  mutate(count = seq_len(n()))   # number the rows 1..n within each group, in row order
Source: local data frame [7 x 4]
Groups: customer, item

        date customer   item count
1 2015-02-01        A  Candy     1
2 2015-03-01        A Coffee     1
3 2015-04-01        A Coffee     2
4 2015-03-01        B  Candy     1
5 2015-04-01        B  Candy     2
6 2015-05-01        B  Candy     3
7 2015-05-01        B Coffee     1
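Note that the count above follows the order of the rows as they appear in the data frame. The sample data happens to be sorted by date within each customer, but that may not hold for a real export. As a minimal sketch, assuming the running count should follow chronological order, the rows can be sorted before counting. The name `result` and the explicit sort are illustrative assumptions, not part of the original answer; `row_number()` is dplyr's built-in equivalent of `seq_len(n())` here.

```r
library(dplyr)

# Sample data from the question
OrderHistory <- data.frame(
  date     = c("2015-02-01", "2015-03-01", "2015-04-01", "2015-03-01",
               "2015-04-01", "2015-05-01", "2015-05-01"),
  customer = c("A", "A", "A", "B", "B", "B", "B"),
  item     = c("Candy", "Coffee", "Coffee", "Candy", "Candy", "Candy", "Coffee")
)

result <- OrderHistory %>%
  mutate(date = as.Date(date)) %>%      # parse the ISO date strings
  arrange(customer, date, item) %>%     # assumption: count in chronological order
  group_by(customer, item) %>%          # groups come from the data, so new items need no code change
  mutate(count = row_number()) %>%      # 1, 2, 3, ... within each customer/item pair
  ungroup()

print(result)
```

Because the groups are derived from whatever values appear in `customer` and `item`, this should keep working as the set of items on the platform changes.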