How to fit a 2D function if some data points are NaN?

Yoo*_*ach 3 python numpy missing-data model-fitting astropy

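I'm trying to fit a 2D surface to data. More specifically, I want to find a function that maps pixel coordinates to wavelength coordinates, just as FITCOORDS does in IRAF.

As an example, I want to find the fit to the array test in the following code snippet: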

import numpy as np
from astropy.modeling.models import Chebyshev2D
from astropy.modeling.fitting import LevMarLSQFitter
#%%
test = np.array([[7473, 7040, 6613, 6183, 5753, 5321, 4888],
   [7474, 7042, 6616, 6186, np.nan, 5325, 4893],
   [7476, 7044, 6619, 6189, 5759, 5328, 4897],
   [7479, 7047, np.nan, 6192, 5762, 5331, 4900]])
grid_pix, grid_wave = np.mgrid[:4, :7]
fitter = LevMarLSQFitter()
c2_init = Chebyshev2D(x_degree=3, y_degree=3)
c2_fit = fitter(c2_init, grid_wave, grid_pix, test)
print(c2_fit)

The result on Python 3.6 with astropy 2.0.2 and numpy 1.13.3 is as follows:

Model: Chebyshev2D
Inputs: ('x', 'y')
Outputs: ('z',)
Model set size: 1
X-Degree: 3
Y-Degree: 3
Parameters:
    c0_0 c1_0 c2_0 c3_0 c0_1 c1_1 c2_1 c3_1 c0_2 c1_2 c2_2 c3_2 c0_3 c1_3 c2_3 c3_3
    ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
     0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
WARNING: Model is linear in parameters; consider using linear fitting methods. [astropy.modeling.fitting]

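Clearly, the fit never worked: every coefficient is still 0.0.

If I change the np.nan entries to some value, the fit works as expected (for example, manually changing the np.nans to 0, 1, etc.).

How can I get a reasonable result? Can I make the fitter ignore np.nan values?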

MSe*_*ert 5

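You can simply remove the nans from your data together with the corresponding "indices" of the grid. For example, with boolean indexing: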

notnans = np.isfinite(test)  # array containing True for finite values and False for nans/infs
c2_fit = fitter(c2_init, grid_wave[notnans], grid_pix[notnans], test[notnans])
print(c2_fit)

It still prints the warning about the model being linear in its parameters, but the values are definitely non-zero:

Model: Chebyshev2D
Inputs: ('x', 'y')
Outputs: ('z',)
Model set size: 1
X-Degree: 3
Y-Degree: 3
Parameters:
         c0_0          c1_0           c2_0      ...       c2_3            c3_3      
    ------------- -------------- -------------- ... --------------- ----------------
    7473.01546325 -431.633443323 0.471726190475 ... 0.0229037267082 -0.0012077294686

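The trick here is that x, y and your data don't have to be 2D arrays; they can be 1D arrays (as returned by boolean indexing), as long as they "represent" a 2D grid.

In case you have "large areas" containing NaNs, this approach may not be good enough, because the fitter is unconstrained there and could fit anything. If that is the case, you can interpolate over these regions using astropy.convolution.convolve and then replace the NaNs in your data with the result of convolve: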
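To make the point about 1D arrays concrete, here is a minimal NumPy-only sketch of what the boolean index does to the arrays from the question. The names x, y and z are introduced only for this illustration:

```python
import numpy as np

test = np.array([[7473, 7040, 6613, 6183, 5753, 5321, 4888],
                 [7474, 7042, 6616, 6186, np.nan, 5325, 4893],
                 [7476, 7044, 6619, 6189, 5759, 5328, 4897],
                 [7479, 7047, np.nan, 6192, 5762, 5331, 4900]])
grid_pix, grid_wave = np.mgrid[:4, :7]

notnans = np.isfinite(test)

# Boolean indexing flattens each 2D array to 1D and keeps only the finite points.
x = grid_wave[notnans]   # hypothetical name: column coordinate of each kept point
y = grid_pix[notnans]    # hypothetical name: row coordinate of each kept point
z = test[notnans]        # hypothetical name: data value of each kept point

print(test.shape, z.shape)
# (4, 7) (26,)

# The same mask is applied to all three arrays, so the i-th entries of
# x, y and z still describe the same grid point.
print(np.all(test[y, x] == z))
# True

# (row, column) positions that were dropped because they held NaN.
print(np.argwhere(~notnans).tolist())
# [[1, 4], [3, 2]]
```

Because the coordinates travel with the values, the fitter never needs to know the points once formed a rectangular grid.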

from astropy.convolution import convolve
# Just for illustration I used a 5x5 mean filter here, the size must be adjusted
# depending on the size of all-nan-regions in your data.
mean_convolved = convolve(test, np.ones((5, 5)), boundary='extend')
# Replacing NaNs in test with the mean_convolved values:
nan_mask = np.isnan(test)
test[nan_mask] = mean_convolved[nan_mask]

# Now pass test to the fitter:
c2_fit = fitter(c2_init, grid_wave, grid_pix, test)

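However, for a few sparsely distributed NaNs the convolution isn't needed. You may want to compare the two approaches and see which one gives the more "trustworthy" result. Missing values can be a real problem for fitting.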
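Regarding the warning about the model being linear in its parameters: astropy.modeling.fitting also provides a LinearLSQFitter, which is presumably what the warning has in mind. As a rough illustration of why a linear method is enough here, the sketch below does the same masked fit with plain NumPy (numpy.polynomial.chebyshev plus np.linalg.lstsq). It is not astropy's implementation. The helper to_unit and the rescaling of the coordinates to [-1, 1] are assumptions made for this example, so the coefficients need not match the Chebyshev2D parameters printed above.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb

test = np.array([[7473, 7040, 6613, 6183, 5753, 5321, 4888],
                 [7474, 7042, 6616, 6186, np.nan, 5325, 4893],
                 [7476, 7044, 6619, 6189, 5759, 5328, 4897],
                 [7479, 7047, np.nan, 6192, 5762, 5331, 4900]])
grid_pix, grid_wave = np.mgrid[:4, :7]
notnans = np.isfinite(test)


def to_unit(v, lo, hi):
    """Hypothetical helper: map values from [lo, hi] to [-1, 1]."""
    return 2.0 * (v - lo) / (hi - lo) - 1.0


# Assumption of this sketch: rescale both axes to the Chebyshev domain [-1, 1].
x = to_unit(grid_wave, 0, 6)
y = to_unit(grid_pix, 0, 3)

# Design matrix: one row per finite data point, one column per
# product T_i(x) * T_j(y) with i, j in 0..3 (16 columns in total).
design = cheb.chebvander2d(x[notnans], y[notnans], [3, 3])

# Ordinary linear least squares; the NaN points simply contribute no rows.
coeffs, *_ = np.linalg.lstsq(design, test[notnans], rcond=None)
coeffs = coeffs.reshape(4, 4)  # coeffs[i, j] multiplies T_i(x) * T_j(y)

# Evaluate the fitted surface on the full grid, including the NaN positions.
surface = cheb.chebval2d(x, y, coeffs)
residuals = surface[notnans] - test[notnans]
print("largest absolute residual:", np.abs(residuals).max())
print("fitted values at the NaN positions:", surface[~notnans])
```

The values of surface at the two NaN positions are predictions from the fit, so they are one more thing to sanity-check when deciding which approach to trust.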