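I have been trying to get standard errors and p-values from scikit-learn's LinearRegression, without success.
I eventually found this article, but the standard errors and p-values do not match those from the statsmodels.api OLS method.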
import numpy as np
from sklearn import datasets
from sklearn import linear_model
import regressor
import statsmodels.api as sm
boston = datasets.load_boston()
which_betas = np.ones(13, dtype=bool)
which_betas[3] = False
X = boston.data[:,which_betas]
y = boston.target
# scikit-learn + regressor stats
ols = linear_model.LinearRegression()
ols.fit(X,y)
xlables = boston.feature_names[which_betas]
regressor.summary(ols, X, y, xlables)
# statsmodels
x2 = sm.add_constant(X)
models = sm.OLS(y,x2)
result = models.fit()
print(result.summary())
The output is as follows:
Residuals:
Min 1Q Median 3Q Max
-26.3743 -1.9207 0.6648 2.8112 13.3794
Coefficients: …
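For reference, here is a minimal sketch of how the standard errors and p-values could be computed by hand from a fitted LinearRegression, using the textbook OLS formulas. The helper name `ols_inference` and the synthetic data are made up for illustration (they are not part of scikit-learn or the regressor module), and the sketch assumes the model was fitted with an intercept and that the design matrix has full column rank. If a helper's numbers differ from statsmodels, it may be worth checking whether it includes the intercept column and uses n - p - 1 degrees of freedom.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression


def ols_inference(model, X, y):
    """Hypothetical helper: classical OLS standard errors, t statistics and
    two-sided p-values for a LinearRegression fitted with an intercept."""
    n, p = X.shape
    # Design matrix with an explicit constant column, like sm.add_constant(X).
    design = np.column_stack([np.ones(n), X])
    params = np.concatenate([[model.intercept_], model.coef_])
    residuals = y - model.predict(X)
    dof = n - p - 1  # observations minus estimated parameters
    sigma2 = residuals @ residuals / dof  # unbiased residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)
    std_err = np.sqrt(np.diag(cov))
    t_stats = params / std_err
    p_values = 2 * stats.t.sf(np.abs(t_stats), dof)
    return params, std_err, t_stats, p_values


# Synthetic data, used only so the example is self-contained.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = 1.5 + X_demo @ np.array([2.0, 0.0, -1.0]) + rng.normal(scale=0.5, size=200)

model = LinearRegression().fit(X_demo, y_demo)
params, std_err, t_stats, p_values = ols_inference(model, X_demo, y_demo)
for name, b, se, t, pv in zip(["const", "x1", "x2", "x3"], params, std_err, t_stats, p_values):
    print(f"{name:>5}  coef={b: .4f}  se={se:.4f}  t={t: .3f}  p={pv:.4g}")
```

The first entry of each returned array corresponds to the intercept, so the ordering lines up with a statsmodels fit on `sm.add_constant(X)`.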