php*_*ash 4 r linear-regression
I would like to know whether there is a way to include an error term in a linear regression model such as:
r = lm(y ~ x1+x2)
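The code r = lm(y ~ x1+x2) means that we model y as a linear function of x1 and x2. Since the model will not be perfect, there will be a residual term (i.e. the left-over that the model fails to fit).
In maths, as Rob Hyndman noted in the comments, y = a + b1*x1 + b2*x2 + e, where a, b1 and b2 are constants and e is your residual (which is assumed to be normally distributed).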
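To make the role of e tangible, here is a minimal sketch that simulates data from this equation and then fits it with lm. The constants (1, 2, -3), the sample size and the seed are arbitrary choices made for illustration, not values from the question.

```r
# Simulate y = a + b1*x1 + b2*x2 + e with known constants.
# All numbers here are illustrative assumptions.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
e  <- rnorm(n, mean = 0, sd = 1)   # the error term
y  <- 1 + 2 * x1 - 3 * x2 + e

# The formula has no explicit error term; lm leaves the
# unexplained part of y in the residuals.
fit <- lm(y ~ x1 + x2)

coef(fit)              # estimates of a, b1 and b2
head(residuals(fit))   # one residual per observation, estimating e
```

The point of the sketch is that the error term is never written in the formula: lm estimates the constants, and whatever is left over comes back from residuals().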
To look at a concrete example, consider the iris data that comes with R.
model1 <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data=iris)
Now we can extract the constants in the model (the equivalents of a, b1 and b2, plus b3 in this case).
> coefficients(model1)
 (Intercept)  Sepal.Width Petal.Length  Petal.Width
   1.8559975    0.6508372    0.7091320   -0.5564827
The residuals have been calculated for each row of data that was used in the model.
> residuals(model1)
            1             2             3             4             5
 0.0845842387  0.2100028184 -0.0492514176 -0.2259940935 -0.0804994772
# etc. There are 150 residuals and 150 rows in the iris dataset.
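As a quick check of what these residuals are, the sketch below confirms that fitted values plus residuals give back the observed response, and that the residual standard error reported by summary() is just a scaled sum of squared residuals. It refits the same model so it can be run on its own; the names res and rse are only illustrative.

```r
model1 <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)
res <- residuals(model1)

# Observed value = fitted value + residual, row by row.
# unname() drops the row labels so only the numbers are compared.
all.equal(unname(fitted(model1) + res), iris$Sepal.Length)

# Residual standard error: sqrt(residual sum of squares / residual df).
rse <- sqrt(sum(res^2) / df.residual(model1))
all.equal(rse, summary(model1)$sigma)
```

Both comparisons should return TRUE, which is another way of saying that the error term of the fitted model is already available through residuals().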
(Edit: cut the summary information as not relevant.)
EDIT:
The Error value you mention in the comments is explained on the help page for aov.
If the formula contains a single ‘Error’ term, this is used to
specify error strata, and appropriate models are fitted within
each error stratum.
Compare the following (adapted from the ?aov page).
> utils::data(npk, package="MASS")
> aov(yield ~ N*P*K, npk)
Call:
aov(formula = yield ~ N * P * K, data = npk)
Terms:
                       N        P        K      N:P      N:K      P:K    N:P:K Residuals
Sum of Squares  189.2817   8.4017  95.2017  21.2817  33.1350   0.4817  37.0017  491.5800
Deg. of Freedom        1        1        1        1        1        1        1        16
Residual standard error: 5.542901
Estimated effects may be unbalanced
> aov(yield ~ N*P*K + Error(block), npk)
Call:
aov(formula = yield ~ N * P * K + Error(block), data = npk)
Grand Mean: 54.875
Stratum 1: block
Terms:
                    N:P:K Residuals
Sum of Squares   37.00167 306.29333
Deg. of Freedom         1         4
Residual standard error: 8.750619
Estimated effects are balanced
Stratum 2: Within
Terms:
                        N         P         K       N:P       N:K       P:K Residuals
Sum of Squares  189.28167   8.40167  95.20167  21.28167  33.13500   0.48167 185.28667
Deg. of Freedom         1         1         1         1         1         1        12
Residual standard error: 3.929447
1 out of 7 effects not estimable
Estimated effects may be unbalanced
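One way to read the two outputs: the Error(block) term splits the single residual line of the first fit into a block stratum and a Within stratum (306.29333 + 185.28667 = 491.58). The sketch below reaches the same Within residual by entering block as an ordinary term instead. This is an illustrative comparison, not something taken from the help page text quoted above. It uses the copy of npk in R's datasets package, and the names fit_plain and fit_block are hypothetical.

```r
# npk is also available from the datasets package, which is attached by default.
fit_plain <- aov(yield ~ N*P*K, data = npk)
fit_block <- aov(yield ~ block + N*P*K, data = npk)  # block as an ordinary term

# Residual sum of squares of each fit (aov objects inherit from lm).
deviance(fit_plain)   # matches the Residuals column of the first output
deviance(fit_block)   # matches the Residuals column of the Within stratum

# Part of the original residual that is attributed to differences between blocks.
deviance(fit_plain) - deviance(fit_block)

df.residual(fit_block)
```

In fit_block the N:P:K interaction cannot be estimated separately from the blocks, which corresponds to the "1 out of 7 effects not estimable" note in the Within stratum.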