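I want to create a dummy variable that takes the value 1 if an individual is observed in two or more different age groups, and 0 otherwise.
Can someone show me how to do this and explain it?
A small example might be: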
set.seed(123)
df <- data.frame(id = sample(1:10, 30, replace = TRUE),
agegroup = sample(c("5054", "5559", "6065"), 30, replace = TRUE))
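One possible way to build such a dummy in base R, offered as a minimal sketch rather than a definitive answer: count the distinct age groups per id with ave() and flag the ids that have two or more. The toy data frame and the names toy and n_groups are hypothetical, chosen only so the result is easy to check by hand; the same two lines should apply to df above.

```r
# Hypothetical toy data: id 1 appears in two age groups, ids 2 and 3 in one each
toy <- data.frame(id = c(1, 1, 2, 2, 3),
                  agegroup = c("5054", "5559", "6065", "6065", "5054"))

# ave() applies FUN within each id and returns a vector aligned with the rows.
# The age groups are converted to integer codes so the result stays numeric.
n_groups <- ave(as.integer(factor(toy$agegroup)), toy$id,
                FUN = function(x) length(unique(x)))

# 1 if the individual is observed in two or more distinct age groups, else 0
toy$dummy <- as.integer(n_groups >= 2)

print(toy)
```

Because ave() keeps the original row order, the dummy can be attached directly as a new column without merging an aggregated table back in.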
And the expected output:
id agegroup dummy
3 6065 1
8 6065 1
5 6065 1
9 6065 1
10 5054 1
1 5559 0
6 6065 1
9 5054 1
6 5054 1
5 5054 1
10 5054 1
5 5559 1
7 5559 1
6 5559 1
2 5054 1
9 5054 1
3 5054 1
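1 5559 0 …

I am running coarsened exact matching (CEM) through the MatchIt package as a preprocessing step and want to use the matched data in further analysis. As a test, I also ran CEM with the cem package and noticed that its imbalance measures differ from those obtained through MatchIt. For example, using the LaLonde dataset: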
library(MatchIt)
library(cem)
data(LL)
re74cut <- seq(0, 40000, 5000)
re75cut <- seq(0, max(LL$re75)+1000, by=1000)
agecut <- c(20.5, 25.5, 30.5,35.5,40.5)
my.cutpoints <- list(re75=re75cut, re74=re74cut, age=agecut)
matchit.match <- matchit(treated ~ age + education + black + married + nodegree +
re74 + re75 + hispanic + u74 + u75,
data = LL,
method = "cem",
cutpoints = my.cutpoints)
matchit.data <- match.data(matchit.match)
matchit.imb <- imbalance(group=matchit.data$treated,
data=matchit.data,
drop=c("treated","re78","distance",
"weights","subclass"))
cem.match <- cem(treatment = …