Forward fill specific columns in a pandas DataFrame

Question

If I have a DataFrame df with multiple columns ['x', 'y', 'z'], how can I forward fill only one column 'x', or a group of columns ['x', 'y']?

I only know how to do it by axis.

Answer 1

The for loop suggested by @Woody Pride is not necessary. You can reduce it to:

cols = ['X', 'Y']
df.loc[:, cols] = df.loc[:, cols].ffill()

I have also added a self-contained example:

>>> import pandas as pd
>>> import numpy as np
>>> 
>>> #%% create dataframe
... ts1 = [0, 1, np.nan, np.nan, np.nan, np.nan]
>>> ts2 = [0, 2, np.nan, 3, np.nan, np.nan]
>>> d =  {'X': ts1, 'Y': ts2, 'Z': ts2}
>>> df = pd.DataFrame(data=d)
>>> print(df.head())
    X   Y   Z
0   0   0   0
1   1   2   2
2 NaN NaN NaN
3 NaN   3   3
4 NaN NaN NaN
>>> 
>>> #%% apply forward fill
... col = ['X', 'Y']
>>> df.loc[:,col] = df.loc[:,col].ffill()
>>> print(df.head())
   X  Y   Z
0  0  0   0
1  1  2   2
2  1  2 NaN
3  1  3   3
4  1  3 NaN

(Normally I would comment on @Woody Pride's answer, but I don't have enough reputation.)

See the description of the preferred way of indexing here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html Changing to a .loc statement should fix the problem, and I have updated the answer accordingly.

Answer 2

Uwe*_*yer 11

Alternatively, use the inplace parameter:

df['X'].ffill(inplace=True)
df['Y'].ffill(inplace=True)

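No, you cannot do df[['X','Y']].ffill(inplace=True), because the column selection first creates a slice, so the in-place forward fill would trigger a SettingWithCopyWarning. Of course, if you have a list of columns, you can do this in a loop: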

for col in ['X', 'Y']:
    df[col].ffill(inplace=True)

The point of using inplace is to avoid copying the column.
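A hedged aside: on recent pandas versions, calling an in-place method on a column selected with df[col] may emit a chained-assignment warning, and with Copy-on-Write enabled it may leave df unchanged. A minimal sketch of an assignment-based alternative follows. The sample data is illustrative and only reuses the column names X, Y and Z from the example in Answer 1.

```python
import numpy as np
import pandas as pd

# Illustrative data; not taken from the original answer.
df = pd.DataFrame({
    "X": [0, 1, np.nan, np.nan],
    "Y": [0, 2, np.nan, 3],
    "Z": [0, 2, np.nan, 3],
})

# Assign each filled column back instead of relying on inplace=True.
for col in ["X", "Y"]:
    df[col] = df[col].ffill()

# X and Y are now filled; Z still has its missing value.
print(df)
```

This trades the possible memory saving of inplace for behavior that does not depend on whether the column selection is a view or a copy.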

Answer 3

Woo*_*ide 9

for col in ['X', 'Y']:
    df[col] = df[col].ffill()

Answer 4

小智 5

Both columns can be forward filled at the same time with ffill(), like this:

df1 = df[['X','Y']].ffill()
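Note that df1 above holds only the columns X and Y. If the remaining columns are needed alongside the filled ones, the result can be assigned back to a frame that has all columns. A minimal sketch, assuming illustrative data and the hypothetical names raw and result:

```python
import numpy as np
import pandas as pd

# Illustrative data; the names raw and result are hypothetical.
raw = pd.DataFrame({
    "X": [0, 1, np.nan, np.nan, np.nan],
    "Y": [0, 2, np.nan, 3, np.nan],
    "Z": [0, 2, np.nan, 3, np.nan],
})
cols = ["X", "Y"]

# Work on a copy so the source frame keeps its missing values.
result = raw.copy()

# raw[cols].ffill() returns a new frame with only X and Y;
# assigning it back overwrites just those two columns in result.
result[cols] = raw[cols].ffill()

print(result)
```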
