Frequency tables for weighted data in R

use*_*648 14 r frequency-distribution weighted

I need to compute frequencies of individuals by age and marital status, so normally I would use:

    table(age, marital_status)

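However, each individual has a different weight after the data were sampled. How can I incorporate the weights into my frequency table?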

Vic*_*orp 15

You can use the function svytable from the survey package, or wtd.table from rgrs.
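For the svytable route, a minimal sketch might look like the following. It assumes the survey package is installed and that the data come from a simple design with no clusters or strata (ids = ~1), which may not match your actual sampling scheme; the df used here is the same toy data as in the examples in this answer.

```r
library(survey)

# Toy data: one categorical variable and one sampling weight per row
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))

# Assumption: no clustering or stratification, only sampling weights
des <- svydesign(ids = ~1, weights = ~wt, data = df)

# Weighted counts per category (here A = 40, B = 60)
tab <- svytable(~var, design = des)
tab
```

For a two-way table, the formula takes both variables, e.g. svytable(~age + marital_status, design = des), provided those columns exist in the data passed to svydesign.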

Edit: rgrs is now called questionr:

df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))

library(questionr)
wtd.table(x = df$var, weights = df$wt)
#  A  B 
# 40 60

This is also possible with dplyr:

library(dplyr)
count(x = df, var, wt = wt)
# # A tibble: 2 x 2
#        var     n
#     <fctr> <dbl>
#   1      A    40
#   2      B    60


Sic*_*abí 6

Just for completeness, using base R:

df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))

aggregate(x = list("wt" = df$wt), by = list("var" = df$var), FUN = sum)

  var wt
1   A 40
2   B 60

Or, using the less cumbersome formula notation:

aggregate(wt ~ var, data = df, FUN = sum)

  var wt
1   A 40
2   B 60
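
Since the question asks for a two-way table (age by marital status), base R's xtabs may be the closest weighted analogue of table(): putting the weight on the left-hand side of the formula sums it within each cell. The sketch below uses a made-up people data frame; the column names and values are hypothetical and only serve as illustration.

```r
# Hypothetical data: column names and values are invented for this example
people <- data.frame(
  age            = c("18-34", "18-34", "35-64", "35-64", "35-64"),
  marital_status = c("single", "married", "single", "married", "married"),
  wt             = c(1.5, 2.0, 0.5, 1.0, 3.0)
)

# Weighted two-way frequency table: sums wt within each age x status cell
tab <- xtabs(wt ~ age + marital_status, data = people)
tab

# Optionally add row and column totals
addmargins(tab)
```

The result is a regular table object, so functions such as prop.table() should work on it as they do on the output of table().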