I have a data frame like this:
product_id view_count purchase_count
1 11 1
2 20 3
3 5 2
...
I would like to turn this into a table that groups the rows by ranges of view_count and sums purchase_count within each range:
view_count_range total_purchase_count
0-10 45
10-20 65
The view_count ranges will have a fixed size. I would appreciate any suggestions on how to group by ranges like this.
cut is a handy tool for this. Here is one way:
# First, make some data to work with.
# I suggest doing this in future questions, as it makes it
# easier for others to help you.
set.seed(10)
dat <- data.frame(product_id = 1:15,
                  view_count = sample(1:20, 15, replace = TRUE),
                  purchase_count = sample(1:8, 15, replace = TRUE))
dat  # look at the data
# Bin view_count with cut(), then aggregate purchase_count by the new bin variable
dat$view_count_range <- with(dat, cut(view_count, c(0, 10, 20)))
aggregate(purchase_count ~ view_count_range, dat, sum)
Which yields:
view_count_range purchase_count
1 (0,10] 39
2 (10,20] 31
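
Since the question asks for fixed-size ranges labelled like 0-10, the same idea can be extended by building the breaks with seq() instead of typing them out. The following is only a sketch under some assumptions: the sample values, the bin width of 10, and the names width, breaks, labels and result are made up for illustration and do not come from the original post.

```r
# Hypothetical sample data; the values are invented for illustration only.
dat <- data.frame(product_id = 1:6,
                  view_count = c(11, 20, 5, 34, 27, 10),
                  purchase_count = c(1, 3, 2, 4, 5, 6))

width <- 10  # assumed fixed bin size

# Breaks run from 0 to the first multiple of `width` at or above the maximum
breaks <- seq(0, ceiling(max(dat$view_count) / width) * width, by = width)

# Labels in the "0-10" style used in the question
labels <- paste(head(breaks, -1), tail(breaks, -1), sep = "-")

# include.lowest = TRUE puts a view_count of exactly 0 into the first bin
dat$view_count_range <- cut(dat$view_count, breaks = breaks,
                            labels = labels, include.lowest = TRUE)

result <- aggregate(purchase_count ~ view_count_range, data = dat, FUN = sum)
names(result)[2] <- "total_purchase_count"
result
```

Note that cut() closes intervals on the right by default, so a view_count of 10 lands in 0-10 rather than 10-20; passing right = FALSE would flip that. Also, aggregate() with a formula only returns ranges that actually contain rows.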