使用 data.table 计算每个子组的比例

Tom*_*Tom 3 r mean data.table

对于以下简单数据集;

   row  country year
     1  NLD     2005
     2  NLD     2005       
     3  BLG     2006
     4  BLG     2005
     5  GER     2005
     6  NLD     2007
     7  NLD     2005
     8  NLD     2008
Run Code Online (Sandbox Code Playgroud)

下面的代码:

df[, .N, by = list(country, year)][,prop := N/sum(N)]
Run Code Online (Sandbox Code Playgroud)

给出观测值占观测值总数的比例。然而我想要的是衡量每个国家的比例。我应该如何调整这段代码才能给出正确的比例?

期望的输出:

   row  country year  prop
     1  NLD     2005   0.6
     2  NLD     2005   0.6    
     3  BLG     2006   0.5
     4  BLG     2005   0.5
     5  GER     2005   1
     6  NLD     2007   0.2
     7  NLD     2005   0.6  
     8  NLD     2008   0.2
Run Code Online (Sandbox Code Playgroud)

sam*_*dhi 5

使用data.table

df <- read.table(header = T, text = "row  country year
     1  NLD     2005
                 2  NLD     2005       
                 3  BLG     2006
                 4  BLG     2005
                 5  GER     2005
                 6  NLD     2007
                 7  NLD     2005
                 8  NLD     2008")

setDT(df)[, sum := .N, by = country][, prop := .N, by = c("country", "year")][, prop := prop/sum][, sum := NULL]


    row country year prop
1:   1     NLD 2005  0.6
2:   2     NLD 2005  0.6
3:   3     BLG 2006  0.5
4:   4     BLG 2005  0.5
5:   5     GER 2005  1.0
6:   6     NLD 2007  0.2
7:   7     NLD 2005  0.6
8:   8     NLD 2008  0.2
Run Code Online (Sandbox Code Playgroud)