Python linear regression with NaN

use*_*760 1 python linear-regression

from scipy import stats

values = ([0, 2, 1, 'NaN', 6], [4, 4, 7, 6, 7], [9, 7, 8, 9, 10])
time = [0, 1, 2, 3, 4]
slope_1 = stats.linregress(time, values[1])  # This works
slope_0 = stats.linregress(time, values[0])  # This doesn't work (contains 'NaN')

Is there a way to ignore the NaN and do the linear regression on the remaining values?

Thanks a lot.

-gv

Jef*_*eff 5

Yes, you can do this using statsmodels:

import statsmodels.api as sm
from numpy import nan  # the `NaN` alias was removed in NumPy 2.0

x = [0, 2, nan, 4, 5, 6, 7, 8]
y = [1, 3, 4,   5, 6, 7, 8, 9]

model = sm.OLS(y, x, missing='drop')
results = model.fit()

In [2]: results.params
Out[2]: array([ 1.16494845])

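That slope of 1.16494845 from `missing='drop'` is the same result you get by dropping the rows with missing data yourself: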

x = [0, 2, 4, 5, 6, 7, 8]
y = [1, 3, 5, 6, 7, 8, 9]

model = sm.OLS(y, x)
results = model.fit()

In [4]: results.params
Out[4]: array([ 1.16494845])

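The only difference is that `missing='drop'` handles the missing data automatically. You can also pass values other than `'drop'` for `missing` if you need to: http://statsmodels.sourceforge.net/devel/generated/statsmodels.regression.linear_model.OLS.html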
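
If you would rather stay with `scipy.stats.linregress` as in the question, one option is to mask out the NaN entries yourself before fitting. The following is a minimal sketch, assuming the missing value is stored as a real float NaN (`np.nan`) rather than the string `'NaN'`; the helper name `linregress_ignore_nan` is hypothetical and not part of SciPy:

```python
import numpy as np
from scipy import stats


def linregress_ignore_nan(x, y):
    """Fit y = slope * x + intercept using only pairs where both values are finite.

    Hypothetical helper for illustration; not part of SciPy.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = np.isfinite(x) & np.isfinite(y)  # True where neither value is NaN or inf
    return stats.linregress(x[mask], y[mask])


time = [0, 1, 2, 3, 4]
values = [0, 2, 1, np.nan, 6]  # first series from the question, with a float NaN

fit = linregress_ignore_nan(time, values)
print(fit.slope, fit.intercept)
```

Here the pair at `time == 3` is dropped and the line is fitted to the remaining four points. Unlike the `sm.OLS` calls above, `linregress` always fits an intercept, so the two slopes are not directly comparable.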