Getting uncertainty values for a linear regression with Python

Question

Eri*_* Na 5 python regression google-visualization

I have some data, for example:

arr = [
    [30.0, 0.0257],
    [30.0, 0.0261],
    [30.0, 0.0261],
    [30.0, 0.026],
    [30.0, 0.026],
    [35.0, 0.0387],
    [35.0, 0.0388],
    [35.0, 0.0387],
    [35.0, 0.0388],
    [35.0, 0.0388],
    [40.0, 0.0502],
    [40.0, 0.0503],
    [40.0, 0.0502],
    [40.0, 0.0498],
    [40.0, 0.0502],
    [45.0, 0.0582],
    [45.0, 0.0574],
    [45.0, 0.058],
    [45.0, 0.058],
    [45.0, 0.058],
    [50.0, 0.0702],
    [50.0, 0.0702],
    [50.0, 0.0698],
    [50.0, 0.0704],
    [50.0, 0.0703],
    [55.0, 0.0796],
    [55.0, 0.0808],
    [55.0, 0.0803],
    [55.0, 0.0805],
    [55.0, 0.0806],
]

This is what it looks like plotted with the Google Charts API: [figure not preserved]

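I am trying to do a linear regression on this, i.e. to find the slope and (y-)intercept of the trend line, as well as the uncertainty in the slope and the uncertainty in the intercept.

The Google Charts API already gives me the slope and intercept values when I draw the trend line, but I am not sure how to find the uncertainties.

I have been doing this with the LINEST function in Excel, but I find that very cumbersome, since all my data is in Python.

So my question is: how can I find, in Python, the two uncertainty values that I get from LINEST?

I apologize for asking such a basic question.

I am comfortable with Python and JavaScript, but I am weak at regression analysis, so when I tried to look this up in the documentation I got very confused by the terminology.

I would like to use some well-known Python library, although it would be ideal if I could do this within the Google Charts API.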

Answer 1

phi*_*ler 3

This can be done with statsmodels, as follows:

import numpy as np
import statsmodels.api as sm

# arr is the list of [x, y] pairs from the question
data = np.asarray(arr, dtype=float)
x = data[:, 0]
y = data[:, 1]

# OLS does not include an intercept by default, so add a constant column
X = sm.add_constant(x)

model = sm.OLS(y, X)
results = model.fit()

You can then access the values you need as follows. The intercept and slope (in that order) are given by:

results.params # linear coefficients
# array([-0.036924 ,  0.0021368])

I assume that by uncertainty you mean the standard errors, which can be accessed like this:

results.bse # standard errors of the parameter estimates
# array([  1.03372221e-03,   2.38463106e-05])
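
For readers unsure what these standard errors represent, here is a minimal NumPy-only sketch of the textbook closed-form expressions for simple linear regression. The function name `linear_fit_with_errors` and the grouped `y_by_x` layout are illustrative choices, not part of statsmodels; on the question's data the results should agree with `results.params` and `results.bse` above.

```python
import numpy as np

def linear_fit_with_errors(x, y):
    """Fit y = intercept + slope * x by least squares; return standard errors too."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_mean, y_mean = x.mean(), y.mean()
    sxx = np.sum((x - x_mean) ** 2)              # spread of x around its mean
    slope = np.sum((x - x_mean) * (y - y_mean)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = y - (intercept + slope * x)
    s2 = np.sum(residuals ** 2) / (n - 2)        # residual variance, n - 2 degrees of freedom
    slope_se = np.sqrt(s2 / sxx)
    intercept_se = np.sqrt(s2 * (1.0 / n + x_mean ** 2 / sxx))
    return slope, intercept, slope_se, intercept_se

# The data from the question, grouped by x value to keep the listing short.
y_by_x = {
    30.0: [0.0257, 0.0261, 0.0261, 0.026, 0.026],
    35.0: [0.0387, 0.0388, 0.0387, 0.0388, 0.0388],
    40.0: [0.0502, 0.0503, 0.0502, 0.0498, 0.0502],
    45.0: [0.0582, 0.0574, 0.058, 0.058, 0.058],
    50.0: [0.0702, 0.0702, 0.0698, 0.0704, 0.0703],
    55.0: [0.0796, 0.0808, 0.0803, 0.0805, 0.0806],
}
x = [xv for xv, ys in y_by_x.items() for _ in ys]
y = [yv for ys in y_by_x.values() for yv in ys]

slope, intercept, slope_se, intercept_se = linear_fit_with_errors(x, y)
print("slope     = %.7f +/- %.2e" % (slope, slope_se))
print("intercept = %.6f +/- %.2e" % (intercept, intercept_se))
```

These formulas only cover a single predictor with an intercept; for anything more general, the statsmodels route above is the safer choice.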

An overview can be obtained by running:

>>> print(results.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.997
Model:                            OLS   Adj. R-squared:                  0.996
Method:                 Least Squares   F-statistic:                     8029.
Date:                Fri, 26 Sep 2014   Prob (F-statistic):           5.61e-36
Time:                        05:47:08   Log-Likelihood:                 162.43
No. Observations:                  30   AIC:                            -320.9
Df Residuals:                      28   BIC:                            -318.0
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const         -0.0369      0.001    -35.719      0.000        -0.039    -0.035
x1             0.0021   2.38e-05     89.607      0.000         0.002     0.002
==============================================================================
Omnibus:                        7.378   Durbin-Watson:                   0.569
Prob(Omnibus):                  0.025   Jarque-Bera (JB):                2.079
Skew:                           0.048   Prob(JB):                        0.354
Kurtosis:                       1.714   Cond. No.                         220.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
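
The confidence-interval columns in the summary are rounded to three decimals, which makes the slope's interval appear as 0.002 to 0.002. As a worked check, assuming the usual two-sided t interval with 28 residual degrees of freedom (critical value roughly 2.048), the intervals can be recomputed from the coefficients and standard errors quoted above:

```latex
\begin{aligned}
\text{interval: } & \hat\beta \pm t_{0.975,\,28}\,\mathrm{SE}(\hat\beta) \\
\text{slope: } & 0.0021368 \pm 2.048 \times 2.3846\times10^{-5} \approx [\,0.002088,\ 0.002186\,] \\
\text{intercept: } & -0.036924 \pm 2.048 \times 1.0337\times10^{-3} \approx [\,-0.03904,\ -0.03481\,]
\end{aligned}
```

The intercept interval rounds to the -0.039 and -0.035 shown in the table.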

The statsmodels summary of the properties of the resulting model may also be of interest.

I have not compared this against LINEST in Excel. I also do not know whether this is possible using only the Google Charts API.
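
As an independent cross-check that does not need statsmodels, here is a sketch using `numpy.polyfit` with `cov=True`. It assumes a recent NumPy in which the returned covariance matrix is scaled by the residual variance with n - 2 degrees of freedom for a straight-line fit. The helper name `polyfit_with_errors` and the four sample pairs are hypothetical. With statistics enabled, LINEST reports standard errors for the slope and intercept, so these are the values one would compare, though that comparison has not been made here.

```python
import numpy as np

def polyfit_with_errors(pairs):
    """Straight-line fit of [x, y] pairs; standard errors from the covariance matrix."""
    data = np.asarray(pairs, dtype=float)
    x, y = data[:, 0], data[:, 1]
    # Coefficients come back highest degree first: [slope, intercept]
    coeffs, cov = np.polyfit(x, y, 1, cov=True)
    # Standard errors are the square roots of the covariance diagonal
    slope_se, intercept_se = np.sqrt(np.diag(cov))
    slope, intercept = coeffs
    return slope, intercept, slope_se, intercept_se

# Hypothetical illustrative data, not the data from the question
sample_pairs = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 4.0]]
slope, intercept, slope_se, intercept_se = polyfit_with_errors(sample_pairs)
print("slope     = %.4f +/- %.4f" % (slope, slope_se))
print("intercept = %.4f +/- %.4f" % (intercept, intercept_se))
```

Passing the question's `arr` list to the same function should give values comparable to the statsmodels results.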
