use*_*648 14 r frequency-distribution weighted
I need to compute frequencies of individuals by age and marital status, so normally I would use:
table(age, marital_status)
However, each individual carries a different weight after sampling. How can I incorporate these weights into my frequency table?
Vic*_*orp 15
You can use the function svytable from the survey package, or wtd.table from rgrs.
Edit: rgrs is now called questionr:
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
library(questionr)
wtd.table(x = df$var, weights = df$wt)
# A B
# 40 60
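For the svytable route mentioned above, a minimal sketch on the same toy data might look like the following. It assumes the survey package is installed and that the sample can be described as a simple design with no clustering (ids = ~1); a real survey would need its actual design passed to svydesign.

```r
library(survey)

# Same toy data as in the questionr example
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))

# Assumption: no clusters or strata, only sampling weights
des <- svydesign(ids = ~1, weights = ~wt, data = df)

# Weighted frequency table: each cell is the sum of the weights in it
tab <- svytable(~var, design = des)
tab
```

Adding more variables to the formula, e.g. ~age + marital_status, should give the corresponding cross-tabulation.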
This is also possible with dplyr:
library(dplyr)
count(x = df, var, wt = wt)
# # A tibble: 2 x 2
# var n
# <fctr> <dbl>
# 1 A 40
# 2 B 60
Just for completeness, using base R:
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
aggregate(x = list("wt" = df$wt), by = list("var" = df$var), FUN = sum)
var wt
1 A 40
2 B 60
Or use the less cumbersome formula notation:
aggregate(wt ~ var, data = df, FUN = sum)
var wt
1 A 40
2 B 60
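The question asks for a two-way table (age by marital status), which the one-variable examples above do not show directly. One base R option, sketched here, is xtabs with the weight on the left-hand side of the formula. The data frame people and its values are made up for illustration; only the column names age and marital_status come from the question.

```r
# Hypothetical sample: one row per person, wt is the sampling weight
people <- data.frame(
  age            = c("18-34", "18-34", "18-34", "35+", "35+"),
  marital_status = c("single", "single", "married", "single", "married"),
  wt             = c(30, 10, 20, 15, 25)
)

# Unweighted counts, as in the question
unweighted <- table(people$age, people$marital_status)

# Weighted counts: xtabs sums wt within each age x marital_status cell
weighted <- xtabs(wt ~ age + marital_status, data = people)

weighted["18-34", "single"]   # 40 (30 + 10)
weighted["35+", "married"]    # 25
```

The result is an ordinary table object, so functions such as prop.table() or addmargins() can be applied to it as with the output of table().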