Pandas rolling mean: keep the original numbers instead of NaN in a DataFrame

can*_*did 3 moving-average dataframe python-3.x pandas

I am working with a pandas DataFrame that looks like this:

(NB: offset is set as the DataFrame's index)

offset         X         Y         Z
  0   -0.140137   -1.924316   -0.426758
 10   -2.789123   -1.111212   -0.416016
 20   -0.133789   -1.923828   -4.408691
 30   -0.101112   -1.457891   -0.425781
 40   -0.126465   -1.926758   -0.414062
 50   -0.137207   -1.916992   -0.404297
 60   -0.130371   -3.784591   -0.987654
 70   -0.125000   -1.918457   -0.403809
 80   -0.123456   -1.917480   -0.413574
 90   -0.126465   -1.926758   -0.333554

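I applied a rolling mean with a window size of 5 to the DataFrame using the code below. I need to keep the window size at 5, and I need a value at every offset in the DataFrame (no NaN).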

df = df.rolling(center=False, window=5).mean()

This gives me:

offset         X         Y         Z
 0.0       NaN       NaN       NaN
10.0       NaN       NaN       NaN
20.0       NaN       NaN       NaN
30.0       NaN       NaN       NaN
40.0 -0.658125 -1.668801 -1.218262
50.0 -0.657539 -1.667336 -1.213769
60.0 -0.125789 -2.202012 -1.328097
70.0 -0.124031 -2.200938 -0.527121
80.0 -0.128500 -2.292856 -0.524679
90.0 -0.128500 -2.292856 -0.508578

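I would like the DataFrame to keep the original values in the leading rows that come out as NaN, with the remaining rows holding the rolling mean. Is there a simple way to do this? Thanks. The desired result: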

offset         X         Y         Z
 0.0  -0.140137  -1.924316  -0.426758
10.0  -2.789123  -1.111212  -0.416016
20.0  -0.133789  -1.923828  -4.408691
30.0  -0.101112  -1.457891  -0.425781
40.0  -0.658125  -1.668801  -1.218262
50.0  -0.657539  -1.667336  -1.213769
60.0  -0.125789  -2.202012  -1.328097
70.0  -0.124031  -2.200938  -0.527121
80.0  -0.128500  -2.292856  -0.524679
90.0  -0.128500  -2.292856  -0.508578

ayh*_*han 5

You can fill the NaN values from the original df:

df.rolling(center=False, window=5).mean().fillna(df)
Out: 
               X         Y         Z
offset                              
0      -0.140137 -1.924316 -0.426758
10     -2.789123 -1.111212 -0.416016
20     -0.133789 -1.923828 -4.408691
30     -0.101112 -1.457891 -0.425781
40     -0.658125 -1.668801 -1.218262
50     -0.657539 -1.667336 -1.213769
60     -0.125789 -2.202012 -1.328097
70     -0.124031 -2.200938 -0.527121
80     -0.128500 -2.292856 -0.524679
90     -0.128500 -2.292856 -0.508578
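For anyone who wants to reproduce this, here is a minimal self-contained sketch of the `fillna` approach. The DataFrame is rebuilt from the values in the question; the variable names `rolled` and `filled` are only illustrative.

```python
import pandas as pd

# Rebuild the DataFrame from the question, with offset as the index.
df = pd.DataFrame(
    {
        "X": [-0.140137, -2.789123, -0.133789, -0.101112, -0.126465,
              -0.137207, -0.130371, -0.125000, -0.123456, -0.126465],
        "Y": [-1.924316, -1.111212, -1.923828, -1.457891, -1.926758,
              -1.916992, -3.784591, -1.918457, -1.917480, -1.926758],
        "Z": [-0.426758, -0.416016, -4.408691, -0.425781, -0.414062,
              -0.404297, -0.987654, -0.403809, -0.413574, -0.333554],
    },
    index=pd.Index(range(0, 100, 10), name="offset"),
)

# The first window - 1 = 4 rows are NaN because the window is not yet full.
rolled = df.rolling(window=5).mean()

# fillna with a DataFrame aligns on index and columns, so each NaN cell
# is replaced by the original value at the same offset and column.
filled = rolled.fillna(df)
print(filled)
```

Note that `fillna(df)` fills every NaN in the result, not only the leading rows. If the original data contained NaN inside a later window, those cells would also be filled from `df` wherever `df` has a value there.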

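There is also a min_periods parameter you can use. If you pass min_periods=1, the first value is kept as is, the second value becomes the mean of the first two values, and so on. In some cases that may make more sense.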

df.rolling(center=False, window=5, min_periods=1).mean()
Out: 
               X         Y         Z
offset                              
0      -0.140137 -1.924316 -0.426758
10     -1.464630 -1.517764 -0.421387
20     -1.021016 -1.653119 -1.750488
30     -0.791040 -1.604312 -1.419311
40     -0.658125 -1.668801 -1.218262
50     -0.657539 -1.667336 -1.213769
60     -0.125789 -2.202012 -1.328097
70     -0.124031 -2.200938 -0.527121
80     -0.128500 -2.292856 -0.524679
90     -0.128500 -2.292856 -0.508578
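As a rough check of how `min_periods=1` behaves, the sketch below compares it with the default rolling mean on the same data from the question. The names `full` and `partial` are only illustrative.

```python
import pandas as pd

# Same data as in the question, with offset as the index.
df = pd.DataFrame(
    {
        "X": [-0.140137, -2.789123, -0.133789, -0.101112, -0.126465,
              -0.137207, -0.130371, -0.125000, -0.123456, -0.126465],
        "Y": [-1.924316, -1.111212, -1.923828, -1.457891, -1.926758,
              -1.916992, -3.784591, -1.918457, -1.917480, -1.926758],
        "Z": [-0.426758, -0.416016, -4.408691, -0.425781, -0.414062,
              -0.404297, -0.987654, -0.403809, -0.413574, -0.333554],
    },
    index=pd.Index(range(0, 100, 10), name="offset"),
)

# Default: min_periods equals the window size, so the first 4 rows are NaN.
full = df.rolling(window=5).mean()

# min_periods=1: a result is produced as soon as one observation is
# available, so the leading rows average over however many rows exist so far.
partial = df.rolling(window=5, min_periods=1).mean()

# Row at offset 10 is the mean of the first two rows only.
print(partial.loc[10, "X"], df["X"].iloc[:2].mean())

# From offset 40 onward the window is full and both results agree.
print((partial.loc[40:] - full.loc[40:]).abs().max().max())
```

The difference from the `fillna` approach is in rows 10 to 30: there, `min_periods=1` returns partial-window averages rather than the original values, so only the very first row matches the input exactly.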