我正在尝试重现glmR中二项式的结果
考虑来自http://www.ats.ucla.edu/stat/r/dae/logit.htm的数据
mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
Run Code Online (Sandbox Code Playgroud)
我可以使用以下模型拟合模型:
model <- glm(admit ~ gre + gpa + rank, data = mydata, family = "binomial")
Run Code Online (Sandbox Code Playgroud)
并且,仅使用对象重现模型:
model_r <- glm(as.numeric(model$y)~0+model.matrix(model), family = binomial)
cbind(coef(model), coef(model_r))
## [,1] [,2]
## (Intercept) -3.44954840 -3.44954840
## gre 0.00229396 0.00229396
## gpa 0.77701357 0.77701357
## rank -0.56003139 -0.56003139
Run Code Online (Sandbox Code Playgroud)
现在假设该列admit是该列中许多文章的成功数n:
mydata$n <- 1 + rbinom(n = 400, size = 2, prob = 0.5)
Run Code Online (Sandbox Code Playgroud)
现在我必须使用以下模型:
model <- glm(cbind(admit, n-admit) ~ gre + gpa + rank, data = mydata,
family = "binomial")
Run Code Online (Sandbox Code Playgroud)
如何使用模型对象重现此模型?我问这个因为R只保留了成功率model$y.
您可以通过以下方式重现模型:
model_r <- glm(formula(model), model$data, family = family(model))
Run Code Online (Sandbox Code Playgroud)
相比:
cbind(coef(model), coef(model_r))
# [,1] [,2]
# (Intercept) -3.693688931 -3.693688931
# gre 0.001855502 0.001855502
# gpa 0.584915067 0.584915067
# rank -0.450051862 -0.450051862
Run Code Online (Sandbox Code Playgroud)
或者(类似于您的方法):
model_r2 <- glm(model.frame(model)[[1]] ~ 0 + model.matrix(model),
family = family(model))
cbind(coef(model), coef(model_r2))
# [,1] [,2]
# (Intercept) -3.693688931 -3.693688931
# gre 0.001855502 0.001855502
# gpa 0.584915067 0.584915067
# rank -0.450051862 -0.450051862
Run Code Online (Sandbox Code Playgroud)