Def*_*_Os 8 python pandas pandas-groupby
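I have a DataFrame with a column that contains some bad data in the form of various negative values. I would like to replace the values < 0 with the mean of the group they belong to.
For missing values stored as NA, I would do: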
data = df.groupby(['GroupID']).column
data.transform(lambda x: x.fillna(x.mean()))
But how can I do this operation on a condition like x < 0?
Thanks!
Using @AndyHayden's example, you could use groupby/transform with a replace function:
import pandas as pd

df = pd.DataFrame([[1,1],[1,-1],[2,1],[2,2]], columns=list('ab'))
print(df)
#    a  b
# 0  1  1
# 1  1 -1
# 2  2  1
# 3  2  2
data = df.groupby(['a'])
def replace(group):
    mask = group<0
    # Select those values where it is < 0, and replace
    # them with the mean of the values which are not < 0.
    group[mask] = group[~mask].mean()
    return group
print(data.transform(replace))
#    b
# 0  1
# 1  1
# 2  1
# 3  2
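For comparison, here is a minimal sketch of the same idea that does not modify the group in place. It is not part of the original answer: the helper name replace_negatives_with_group_mean and its parameters are hypothetical, and it assumes a single value column. It first turns negatives into NaN with Series.where, takes the per-group mean (which skips NaN by default), and then fills only the negative positions with Series.mask.

```python
import pandas as pd

df = pd.DataFrame([[1, 1], [1, -1], [2, 1], [2, 2]], columns=list("ab"))


def replace_negatives_with_group_mean(frame, group_col, value_col):
    # Hypothetical helper, not a pandas API.
    values = frame[value_col]
    # Keep non-negative values; negatives become NaN so the mean ignores them.
    valid = values.where(values >= 0)
    # Per-row mean of the valid values in that row's group.
    group_mean = valid.groupby(frame[group_col]).transform("mean")
    # Replace only the negative entries; everything else is left as it was.
    return values.mask(values < 0, group_mean)


# assign returns a new DataFrame, so df itself is left untouched.
result = df.assign(b=replace_negatives_with_group_mean(df, "a", "b"))
print(result)
```

Here the -1 in group 1 is replaced by 1, the mean of the only non-negative value in that group. If every value in a group is negative, the group mean is NaN and those entries become NaN under this approach.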