Looping over regressions in R

bvo*_*owe 1 performance r memory-efficient

# Example data: mtcars plus an artificial grouping variable with 4 levels
data <- mtcars
data$group <- rep(1:4, 8)


model1 <- glm(vs ~ mpg + cyl + disp + hp, data = subset(data, group == 1), family = "binomial")
model2 <- glm(vs ~ mpg + cyl + disp + hp, data = subset(data, group == 2), family = "binomial")
model3 <- glm(vs ~ mpg + cyl + disp + hp, data = subset(data, group == 3), family = "binomial")
model4 <- glm(vs ~ mpg + cyl + disp + hp, data = subset(data, group == 4), family = "binomial")

model5 <- glm(am ~ mpg + cyl + disp + hp, data = subset(data, group == 1), family = "binomial")
model6 <- glm(am ~ mpg + cyl + disp + hp, data = subset(data, group == 2), family = "binomial")
model7 <- glm(am ~ mpg + cyl + disp + hp, data = subset(data, group == 3), family = "binomial")
model8 <- glm(am ~ mpg + cyl + disp + hp, data = subset(data, group == 4), family = "binomial")

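Suppose you want to estimate a set of stratified models that are identical in every way except for the stratification group (models 1-4), and you want to repeat this series of models for a different outcome (models 5-8).

That is what my code above does. However, is there a more efficient way to run this that does not take so many lines of code? For example, by specifying the covariates, outcomes, and groups, and then looping over them?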

tal*_*lat 6

You could, for example, use data.table to fit the models by group:

library(data.table)
dt = as.data.table(data)

models = dt[, .(fit_vs = list(glm(vs ~ mpg + cyl + disp + hp, family = "binomial")),
                fit_am = list(glm(am ~ mpg + cyl + disp + hp, family = "binomial"))), 
            by = .(group)]

The result is:

print(models)
#    group fit_vs fit_am
# 1:     2  <glm>  <glm>
# 2:     1  <glm>  <glm>
# 3:     3  <glm>  <glm>
# 4:     4  <glm>  <glm>
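As an aside, the data.table call still spells out each outcome by hand. A minimal base R sketch of the "specify covariates, outcomes and groups, then loop" idea from the question might look like the following. The names `outcomes`, `covariates`, `fit_one`, `by_group` and `fits` are illustrative and not part of the original answer; `reformulate()` from the stats package builds each formula from strings.

```r
# Same setup as in the question
data <- mtcars
data$group <- rep(1:4, 8)

outcomes   <- c("vs", "am")                  # responses to loop over
covariates <- c("mpg", "cyl", "disp", "hp")  # shared right-hand side

# Hypothetical helper: one logistic regression for one outcome on one subset
fit_one <- function(d, outcome) {
  glm(reformulate(covariates, response = outcome), data = d, family = "binomial")
}

# Split once by group, then fit every outcome within every group.
# With only 8 rows per group, glm() may warn about fitted probabilities of 0 or 1.
by_group <- split(data, data$group)
fits <- lapply(setNames(outcomes, outcomes), function(y) {
  lapply(by_group, fit_one, outcome = y)
})

# fits[[outcome]][[group]] holds one model, e.g. vs in group 3
fits$vs[["3"]]
```

Adding an outcome or a covariate then only means extending the corresponding character vector.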

You can access the fit for vs in group 3 with:

models[group == 3, fit_vs]
# [[1]]
# 
# Call:  glm(formula = vs ~ mpg + cyl + disp + hp, family = "binomial")
# 
# Coefficients:
#   (Intercept)          mpg          cyl         disp           hp  
# 180.970664    -0.384760   -24.366394    -0.008435    -0.010799  
# 
# Degrees of Freedom: 9 Total (i.e. Null);  5 Residual
# Null Deviance:        13.46 
# Residual Deviance: 3.967e-10  AIC: 10
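The residual deviance of essentially zero in the output above suggests that, with so few rows per group, the model fits its group almost perfectly, so the coefficients should be read with caution. One way to check all fits at once is to collect the coefficients and the `converged` flag of every model in a single table. This is only a base R sketch using the data from the question; `grid`, `fit_row` and `results` are illustrative names.

```r
data <- mtcars
data$group <- rep(1:4, 8)
covariates <- c("mpg", "cyl", "disp", "hp")

# Every outcome/group combination, one row each
grid <- expand.grid(outcome = c("vs", "am"), group = 1:4,
                    stringsAsFactors = FALSE)

# Fit one model and return its coefficients plus the convergence flag
fit_row <- function(outcome, group) {
  fit <- suppressWarnings(  # separation warnings are expected with 8 rows per group
    glm(reformulate(covariates, response = outcome),
        data = data[data$group == group, ], family = "binomial")
  )
  data.frame(outcome = outcome, group = group, converged = fit$converged,
             t(coef(fit)), check.names = FALSE)
}

# One row of results per outcome/group combination
results <- do.call(rbind, lapply(seq_len(nrow(grid)), function(i) {
  fit_row(grid$outcome[i], grid$group[i])
}))
rownames(results) <- NULL
results
```

Rows with `converged == FALSE` or with very large coefficients point to groups where the logistic model is not well identified.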