Expand a data frame to get monthly revenue sums for every unique categorical column value in R

Ali*_*Zia 5 r data.table dcast

I have a df with the following data:

sub = c("X001","X002", "X001","X003","X002","X001","X001","X003","X002","X003","X003","X002") 
month = c("201506", "201507", "201506","201507","201507","201508", "201508","201507","201508","201508", "201508", "201508") 
tech = c("mobile", "tablet", "PC","mobile","mobile","tablet", "PC","tablet","PC","PC", "mobile", "tablet") 
brand = c("apple", "samsung", "dell","apple","samsung","apple", "samsung","dell","samsung","dell", "dell", "dell")

revenue = c(20, 15, 10,25,20,20, 17,9,14,12, 9, 11)

df = data.frame(sub, month, brand, tech, revenue)

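I want to use sub and month as keys and get one row per subscriber per month, showing the sum of revenue for each unique value of tech and brand for that subscriber in that month. This example is simplified and has fewer columns; because my real dataset is huge, I decided to try doing this with data.table.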

I have managed to do this for a single categorical column, either tech or brand, using this:

df1 <- dcast(df, sub + month ~ tech,  fun=sum, value.var = "revenue")

But I want to do this for two or more categorical columns. So far I have tried this:

df2 <- dcast(df, sub + month ~ tech+brand,  fun=sum, value.var = "revenue")

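This just concatenates the unique values of the categorical columns and sums revenue over each combination, which is not what I want. I want a separate column for each unique value of every categorical column.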

I am new to R and would really appreciate any help.

Dav*_*urg 5

(I'll assume that in your real data df is a data.table rather than a data.frame as in your example.)

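One possible solution is to first melt the data while keeping sub, month and revenue as keys. This way, brand and tech are converted into a single variable whose values correspond to each existing combination of the keys. We can then easily dcast it back, because we are operating against a single column, just as in your first example: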

dcast(melt(df, c(1:2, 5)), sub + month ~ value, sum, value.var = "revenue")
#     sub  month PC apple dell mobile samsung tablet
# 1: X001 201506 10    20   10     20       0      0
# 2: X001 201508 17    20    0      0      17     20
# 3: X002 201507  0     0    0     20      35     15
# 4: X002 201508 14     0   11      0      14     11
# 5: X003 201507  0    25    9     25       0      9
# 6: X003 201508 12     0   21      9       0      0
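To make the intermediate step visible, here is a minimal sketch that splits the one-liner above into two stages and selects the id columns by name instead of by position. The object names dt, long and wide are illustrative, and the data is rebuilt directly as a data.table (an assumption, since the question creates a data.frame).

```r
library(data.table)

# Rebuild the question's example data directly as a data.table
# (assumption: the question itself builds a data.frame)
dt <- data.table(
  sub     = c("X001","X002","X001","X003","X002","X001",
              "X001","X003","X002","X003","X003","X002"),
  month   = c("201506","201507","201506","201507","201507","201508",
              "201508","201507","201508","201508","201508","201508"),
  brand   = c("apple","samsung","dell","apple","samsung","apple",
              "samsung","dell","samsung","dell","dell","dell"),
  tech    = c("mobile","tablet","PC","mobile","mobile","tablet",
              "PC","tablet","PC","PC","mobile","tablet"),
  revenue = c(20, 15, 10, 25, 20, 20, 17, 9, 14, 12, 9, 11)
)

# Step 1: melt. Each original row becomes two rows, one for brand and
# one for tech; the column name goes to `variable`, its level to `value`.
long <- melt(dt,
             id.vars      = c("sub", "month", "revenue"),
             measure.vars = c("brand", "tech"))

# Step 2: cast back to wide, summing revenue per sub/month for each level
wide <- dcast(long, sub + month ~ value,
              fun.aggregate = sum, value.var = "revenue")
```

Printing long before casting shows why this works: every revenue figure appears once against its brand and once against its tech, so a single dcast on value covers both columns.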

Per the OP's comment, you can easily add a prefix by also adding variable to the formula. This way, the columns will also be ordered properly:

dcast(melt(df, c(1:2, 5)), sub + month ~ variable + value, sum, value.var = "revenue")
#     sub  month brand_apple brand_dell brand_samsung tech_PC tech_mobile tech_tablet
# 1: X001 201506          20         10             0      10          20           0
# 2: X001 201508          20          0            17      17           0          20
# 3: X002 201507           0          0            35       0          20          15
# 4: X002 201508           0         11            14      14           0          11
# 5: X003 201507          25          9             0       0          25           9
# 6: X003 201508           0         21             0      12           9           0
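Since the question mentions having more than two categorical columns, the same melt and dcast pattern could be wrapped in a small helper. This is only a sketch: revenue_by_level and its arguments id_cols and value_col are hypothetical names, and it assumes every column that is neither an id nor the value column is categorical and of the same type.

```r
library(data.table)

# Hypothetical helper: treat every column other than the ids and the
# value column as categorical, then melt and cast with prefixed names.
revenue_by_level <- function(dt, id_cols, value_col) {
  cat_cols <- setdiff(names(dt), c(id_cols, value_col))
  long <- melt(dt, id.vars = c(id_cols, value_col), measure.vars = cat_cols)
  # Build e.g. `sub + month ~ variable + value` from the id column names
  f <- as.formula(paste(paste(id_cols, collapse = " + "), "~ variable + value"))
  dcast(long, f, fun.aggregate = sum, value.var = value_col)
}

# The question's example data, built as a data.table (assumption)
dt <- data.table(
  sub     = c("X001","X002","X001","X003","X002","X001",
              "X001","X003","X002","X003","X003","X002"),
  month   = c("201506","201507","201506","201507","201507","201508",
              "201508","201507","201508","201508","201508","201508"),
  brand   = c("apple","samsung","dell","apple","samsung","apple",
              "samsung","dell","samsung","dell","dell","dell"),
  tech    = c("mobile","tablet","PC","mobile","mobile","tablet",
              "PC","tablet","PC","PC","mobile","tablet"),
  revenue = c(20, 15, 10, 25, 20, 20, 17, 9, 14, 12, 9, 11)
)

res <- revenue_by_level(dt, id_cols = c("sub", "month"), value_col = "revenue")
```

Adding another categorical column to dt would then produce extra prefixed columns without changing the call.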