In stats::glm(), why does the subset argument give different results from subsetting the data argument myself?

Question

Consider the following code:

library(ISLR)

row_list <- structure(list(`1` = 1:40, `2` = 41:79, `3` = 80:118, `4` = 119:157, 
               `5` = 158:196, `6` = 197:235, `7` = 236:274, `8` = 275:313, 
               `9` = 314:352, `10` = 353:392), 
          .Names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))
test <- row_list[[1]]
train <- setdiff(unlist(row_list), row_list[[1]])

Output 1:

> glm(mpg ~ poly(horsepower, 1), data = Auto, subset = train)

Call:  glm(formula = mpg ~ poly(horsepower, 1), data = Auto, subset = train)

Coefficients:
        (Intercept)  poly(horsepower, 1)  
              23.37              -133.05  

Degrees of Freedom: 351 Total (i.e. Null);  350 Residual
Null Deviance:      21460 
Residual Deviance: 8421     AIC: 2122

Output 2:

> glm(mpg ~ poly(horsepower, 1), data = Auto[train,])

Call:  glm(formula = mpg ~ poly(horsepower, 1), data = Auto[train, ])

Coefficients:
        (Intercept)  poly(horsepower, 1)  
              24.05              -114.19  

Degrees of Freedom: 351 Total (i.e. Null);  350 Residual
Null Deviance:      21460 
Residual Deviance: 8421     AIC: 2122

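As shown above, the (Intercept) and poly(horsepower, 1) values differ between the two outputs. Why is that?

At least for lm(), An Introduction to Statistical Learning (see p. 191) suggests that row indices can be passed via the subset argument. Is that not the case for glm(), or am I just using subset incorrectly?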

Answer 1

Jam*_*mes 7

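This has to do with how poly constructs orthogonal polynomials.

In the first example, they are constructed before subsetting, whereas in the second example the subsetting happens first (because you pass already-subsetted data to glm).

Using raw polynomials gives the same result in both cases: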

> coef(glm(mpg~poly(hp,1),data=mtcars,subset=10:32))
(Intercept) poly(hp, 1) 
   20.63307   -28.66876 
> coef(glm(mpg~poly(hp,1),data=mtcars[10:32,]))
(Intercept) poly(hp, 1) 
   19.93043   -25.43935 
> coef(glm(mpg~poly(hp,1,raw=TRUE),data=mtcars,subset=10:32))
            (Intercept) poly(hp, 1, raw = TRUE) 
            31.64927851             -0.07509986 
> coef(glm(mpg~poly(hp,1,raw=TRUE),data=mtcars[10:32,]))
            (Intercept) poly(hp, 1, raw = TRUE) 
            31.64927851             -0.07509986
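
The order of operations described above can be checked directly. The following is a minimal sketch on the same mtcars rows; the names idx, fit_subset, fit_presub, basis_full and basis_sub are introduced here for illustration only. It compares the basis column poly() produces from all rows against the one it produces from the retained rows alone, then compares the two fits.

```r
# Same rows as in the answer's mtcars example.
idx <- 10:32

# subset = : poly() is evaluated on all 32 rows, then rows 10:32 are kept.
fit_subset <- glm(mpg ~ poly(hp, 1), data = mtcars, subset = idx)

# data = mtcars[idx, ]: poly() only ever sees the 23 retained rows.
fit_presub <- glm(mpg ~ poly(hp, 1), data = mtcars[idx, ])

# Orthogonal basis built from all rows (then subsetted) vs. from the subset.
basis_full <- poly(mtcars$hp, 1)[idx, 1]
basis_sub  <- poly(mtcars$hp[idx], 1)[, 1]

# The two bases are centred and scaled on different rows, so they differ.
isTRUE(all.equal(unname(basis_full), unname(basis_sub)))

# The design column used under subset = matches the full-data basis.
isTRUE(all.equal(unname(model.matrix(fit_subset)[, 2]), unname(basis_full)))

# Both bases are linear in hp, so the fits themselves coincide.
isTRUE(all.equal(unname(fitted(fit_subset)), unname(fitted(fit_presub))))
isTRUE(all.equal(deviance(fit_subset), deviance(fit_presub)))
```

The first comparison should come back FALSE and the remaining ones TRUE. This is consistent with the question's two outputs, which report identical deviance and AIC but different coefficients: only the parameterisation of the predictor changes, not the fitted model.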
