Let me start with some example data.
set.seed(1)
x1=rnorm(10)
y=as.factor(sample(c(1,0),10,replace=TRUE))
x2=sample(c('Young','Middle','Old'),10,replace=TRUE)
model1 <- glm(y~as.factor(x1>=0)+as.factor(x2),binomial)
When I run summary(model1), I get
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.1835 1.0926 -0.168 0.867
as.factor(x1 >= 0)TRUE 0.7470 1.7287 0.432 0.666
as.factor(x2)Old 0.7470 1.7287 0.432 0.666
as.factor(x2)Young 18.0026 4612.2023 0.004 0.997
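Please ignore the model estimates for now, since the data are fake.
Is there a way in R to change the names of the estimates shown in the leftmost column so that they look cleaner? For example, removing as.factor and putting a _ before the factor level. The output should look like this: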
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.1835 1.0926 -0.168 0.867
(x1 >= 0)_TRUE 0.7470 1.7287 0.432 0.666
(x2)_Old 0.7470 1.7287 0.432 0.666
(x2)_Young 18.0026 4612.2023 0.004 0.997
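In addition to the comments above, the other part is to put all your data in a data frame and name the variables accordingly. Then the variable names are not taken from an ugly expression stuffed into your formula: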
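library(car)  # provides contr.Treatment and the decorate.contr.Treatment option
# Build the dichotomised x1 and the factor x2 once, with readable labels
dat <- data.frame(y = y,
                  x1 = cut(x1, breaks = c(-Inf, 0, Inf), labels = c("x1 < 0", "x1 >= 0"), right = FALSE),
                  x2 = as.factor(x2))
# To illustrate Brian's suggestion above:
# drop the default "T." prefix that contr.Treatment puts before each level
options(decorate.contr.Treatment = "")
model1 <- glm(y ~ x1 + x2, family = binomial, data = dat,
              contrasts = list(x1 = "contr.Treatment", x2 = "contr.Treatment"))
summary(model1)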
Call:
glm(formula = y ~ x1 + x2, family = binomial, data = dat, contrasts = list(x1 = "contr.Treatment",
x2 = "contr.Treatment"))
Deviance Residuals:
Min 1Q Median 3Q Max
-1.7602 -0.8254 0.3456 0.8848 1.2563
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.1835 1.0926 -0.168 0.867
x1[x1 >= 0] 0.7470 1.7287 0.432 0.666
x2[Old] 0.7470 1.7287 0.432 0.666
x2[Young] 18.0026 4612.2023 0.004 0.997
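If refitting with a renamed data frame is not an option, the labels can also be rewritten after the fact on the coefficient table itself. The following is only a minimal sketch: `clean_names` is a hypothetical helper introduced here, not part of base R or car, and it assumes every term has the form as.factor(<expr>)<level> with no closing parenthesis inside the level name. It changes only what is printed; the fitted model object keeps its original coefficient names.

```r
# Rebuild the model from the question (estimates may differ across R versions,
# because the default behaviour of sample() changed in R 3.6.0)
set.seed(1)
x1 <- rnorm(10)
y  <- as.factor(sample(c(1, 0), 10, replace = TRUE))
x2 <- sample(c("Young", "Middle", "Old"), 10, replace = TRUE)
model1 <- glm(y ~ as.factor(x1 >= 0) + as.factor(x2), family = binomial)

# Hypothetical helper: turn "as.factor(<expr>)<level>" into "(<expr>)_<level>".
# Names that do not match the pattern, such as "(Intercept)", are left alone.
clean_names <- function(nms) {
  sub("^as\\.factor\\((.*)\\)(.+)$", "(\\1)_\\2", nms)
}

# The coefficient table is a plain matrix, so its row names can be replaced
coefs <- coef(summary(model1))
rownames(coefs) <- clean_names(rownames(coefs))

# Print in the same layout that summary() uses for the coefficients
printCoefmat(coefs)
```

The data-frame approach above is usually cleaner, since the readable names then carry through to predict(), anova(), and plots; the regex route is mainly useful when the model has already been fitted and only the printed table needs tidying.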