Pandas fillna based on the previous row's value

Question

I have a column like this:

valueCount
0.0
nan
2.0
1.0
1.0
1.0
nan
nan
nan
4.0


I want to fill the NaN rows based on the next available value (adding) or the previous value (subtracting). So the result should be:

valueCount
0.0
**1.0**
2.0
1.0
1.0
1.0
**1.0**
**2.0**
**3.0**
4.0


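I know this is very conditional: if my previous value is 0, I can add +1 to the NaN row; otherwise I should start adding from 0, 1, 2, and so on.

I can implement this algorithm over a plain Python list, but is there a simple way to do it in pandas?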
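
For reference, one reading of this rule that reproduces the expected output above is: each NaN becomes the next available value minus the number of rows up to that value. The following is only a minimal plain-Python sketch under that assumption; the function name `fill_counting_back` is hypothetical and does not come from the question.

```python
import math


def fill_counting_back(values):
    """Replace each NaN with (next non-NaN value) - (rows until that value).

    Assumed reading of the rule; trailing NaNs with no later value stay NaN.
    """
    filled = list(values)
    next_valid = math.nan  # no later value known yet when scanning from the end
    distance = 0           # rows between the current NaN and next_valid
    for i in range(len(filled) - 1, -1, -1):
        if math.isnan(filled[i]):
            distance += 1
            filled[i] = next_valid - distance  # NaN - n is still NaN
        else:
            next_valid = filled[i]
            distance = 0
    return filled


values = [0.0, math.nan, 2.0, 1.0, 1.0, 1.0, math.nan, math.nan, math.nan, 4.0]
print(fill_counting_back(values))
```

Scanning from the end keeps the loop to a single pass, because the next available value is always known by the time a NaN is reached.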

Answer 1

jez*_*ael 5

You can use:

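import numpy as np
import pandas as pd

df = pd.DataFrame({'valueCount': [0, np.nan, 2, 1, 1, 1, np.nan, np.nan, np.nan, 4]})

a = df['valueCount'].isnull()   # True where a value is missing
b = a.cumsum()                  # running count of missing values
c = df['valueCount'].bfill()    # next available value for each row
d = c + (b - b.mask(a).bfill().fillna(0).astype(int)).sub(1)
df['valueCount'] = df['valueCount'].fillna(d)
print(df)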

   valueCount
0         0.0
1         1.0
2         2.0
3         1.0
4         1.0
5         1.0
6         1.0
7         2.0
8         3.0
9         4.0


Details and explanation:

# df here is the original frame, before any filling
# back-fill NaNs: each row gets the next available value
x = df['valueCount'].bfill()
# boolean mask, True where the original value is NaN
a = df['valueCount'].isnull()
# cumulative sum of the mask: running count of NaNs seen so far
b = a.cumsum()
# replace the count with NaN wherever the mask is True
c = b.mask(a)
# back-fill those NaNs with the count at the next non-NaN row
d = b.mask(a).bfill()
# fill any remaining NaNs with 0 and cast to integers
e = b.mask(a).bfill().fillna(0).astype(int)
# next value + (NaNs so far) - (NaNs up to the next non-NaN row) - 1
f = x + b - e - 1
# replace NaNs in the original column with the values from f
g = df['valueCount'].fillna(f)
df = pd.concat([df['valueCount'], x, a, b, c, d, e, f, g], axis=1,
               keys=('orig','x','a','b','c','d','e', 'f', 'g'))
print(df)
   orig    x      a  b    c    d  e    f    g
0   0.0  0.0  False  0  0.0  0.0  0 -1.0  0.0
1   NaN  2.0   True  1  NaN  1.0  1  1.0  1.0
2   2.0  2.0  False  1  1.0  1.0  1  1.0  2.0
3   1.0  1.0  False  1  1.0  1.0  1  0.0  1.0
4   1.0  1.0  False  1  1.0  1.0  1  0.0  1.0
5   1.0  1.0  False  1  1.0  1.0  1  0.0  1.0
6   NaN  4.0   True  2  NaN  4.0  4  1.0  1.0
7   NaN  4.0   True  3  NaN  4.0  4  2.0  2.0
8   NaN  4.0   True  4  NaN  4.0  4  3.0  3.0
9   4.0  4.0  False  4  4.0  4.0  4  3.0  4.0

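
The countdown that columns `b` and `e` produce together can also be written with a group-wise counter. The following is an alternative formulation, not the code from this answer, sketched on the same sample column. The names `run_id` and `steps_to_next` are illustrative, and the sketch assumes the intended rule is "next available value minus the number of rows up to it".

```python
import numpy as np
import pandas as pd

s = pd.Series([0, np.nan, 2, 1, 1, 1, np.nan, np.nan, np.nan, 4],
              name="valueCount")

# Give each valid value, together with the NaNs directly above it, one id:
# counting valid values from the bottom up does exactly that.
run_id = s.notna()[::-1].cumsum()[::-1]

# Inside each run, count down to 0 at the valid value
# (rows 6-9 of the sample get 3, 2, 1, 0).
steps_to_next = s.groupby(run_id).cumcount(ascending=False)

# Next available value minus the number of rows left until it.
filled = s.bfill() - steps_to_next
print(filled.tolist())
```

As with the original approach, trailing NaNs that have no later value would stay NaN here, because `bfill` has nothing to copy into them.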
