Efficient use of R data.table and unique()

Question


Is there a more efficient query than the following?

DT[, list(length(unique(OrderNo)) ),customerID]


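I have a detailed LONG-format table with customer ID, order number, and product line item. This means that if a customer bought several items in one transaction, there are duplicate rows with the same order ID.

I'm trying to work out the number of unique purchases per customer. length() on its own counts all order IDs per customer ID, including duplicates; I only want the count of unique order numbers.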

Edit from here:

Here is some dummy code. Ideally, what I'm looking for is the output of the first query, the one that uses unique().

df <- data.frame(
             customerID=as.factor(c(rep("A",3),rep("B",4))),
             product=as.factor(c(rep("widget",2),rep("otherstuff",5))),
             orderID=as.factor(c("xyz","xyz","abd","qwe","rty","yui","poi")),
             OrderDate=as.Date(c("2013-07-01","2013-07-01","2013-07-03","2013-06-01","2013-06-02","2013-06-03","2013-07-01"))
             )

library(data.table)
DT.eg <- as.data.table(df)
#Gives unique order counts
DT.eg[, list(orderlength = length(unique(orderID)) ),customerID]
#Gives counts of all orders by customer
DT.eg[,.SD, keyby=list(orderID, customerID)][, .N, by=customerID]

         ^
         |
  This should be .N, not .SD  ~ R.S.
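
To make the note above concrete, here is a minimal sketch that runs both variants on a cut-down copy of the dummy data (the OrderDate column is left out, and the result names all_rows and unique_orders are mine, not from the original post). With .SD the first step keeps every row of each group, so the duplicated "xyz" row is still counted; with .N each (orderID, customerID) pair collapses to a single row before counting.

```r
library(data.table)

# Cut-down copy of the dummy data from the question (assumption: plain
# character columns instead of factors, which does not change the counts).
DT.eg <- data.table(
  customerID = c(rep("A", 3), rep("B", 4)),
  product    = c(rep("widget", 2), rep("otherstuff", 5)),
  orderID    = c("xyz", "xyz", "abd", "qwe", "rty", "yui", "poi")
)

# .SD returns all rows of each (orderID, customerID) group, so the second
# step counts rows, duplicates included.
all_rows <- DT.eg[, .SD, keyby = list(orderID, customerID)][, .N, by = customerID]

# .N collapses each (orderID, customerID) group to one row, so the second
# step counts distinct orders per customer.
unique_orders <- DT.eg[, .N, keyby = list(orderID, customerID)][, .N, by = customerID]

print(all_rows)       # A: 3, B: 4
print(unique_orders)  # A: 2, B: 4
```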


Answer 1

Ric*_*rta 12

If you want to count the number of unique purchases per customer, use

 DT[, .N, keyby=list(customerID, OrderNo)][, .N, by=customerID]
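
For comparison, here is a rough sketch of two other ways to get the same per-customer counts. It is not part of the original answer. It assumes a reasonably recent data.table, where uniqueN() is available, and uses made-up sample data and result names (by_length, by_uniqueN, by_dedup). Which variant is fastest depends on the data, so it is worth benchmarking on the real table rather than taking any of them as the definitive choice.

```r
library(data.table)

# Hypothetical sample data using the column names from the question.
DT <- data.table(
  customerID = c("A", "A", "A", "B", "B", "B", "B"),
  OrderNo    = c("xyz", "xyz", "abd", "qwe", "rty", "yui", "poi")
)

# The query from the question, for reference.
by_length <- DT[, list(n_orders = length(unique(OrderNo))), by = customerID]

# uniqueN() counts distinct values without materialising unique(OrderNo).
by_uniqueN <- DT[, list(n_orders = uniqueN(OrderNo)), by = customerID]

# Drop duplicate (customerID, OrderNo) pairs first, then count rows.
by_dedup <- unique(DT, by = c("customerID", "OrderNo"))[
  , list(n_orders = .N), by = customerID]

print(by_uniqueN)  # A: 2, B: 4
```

All three produce one row per customer with the number of distinct order numbers.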

