Nig*_*ker 7 python dataframe pandas
How can I write the following function in a more pandas-idiomatic way:
def calculate_df_columns_mean(self, df):
    means = {}
    for column in df.columns.tolist():
        cleaned_data = self.remove_outliers(df[column].tolist())
        means[column] = np.mean(cleaned_data)
    return means
Thanks for the help.
Use DataFrame.apply(func, axis=0):
# axis=0 means apply to columns; axis=1 to rows
df.apply(numpy.sum, axis=0) # equiv to df.sum(0)
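As a small illustration of the call above, the sketch below applies a reducing function to each column of a made-up DataFrame. The column names and values are assumptions chosen for demonstration, not data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})

# axis=0 passes each column to the function as a Series;
# a function that returns a scalar yields a Series indexed by column name
col_sums = df.apply(np.sum, axis=0)
col_means = df.apply(lambda s: s.mean(), axis=0)

# Convert to a dict if the {column: mean} shape from the question is needed
means = col_means.to_dict()
```

The result of apply here is a Series, so .to_dict() is only needed when the caller expects a plain dictionary.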
In my opinion, iterating over the columns is unnecessary:
def calculate_df_columns_mean(self, df):
    # Pass the whole DataFrame; assumes remove_outliers returns a DataFrame
    cleaned_data = self.remove_outliers(df)
    return cleaned_data.mean()
Assuming remove_outliers still returns a DataFrame, the above should be enough.
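The question does not show remove_outliers, so the following is only a sketch of what a DataFrame-in, DataFrame-out version might look like. The name remove_outliers_df, the z-score rule, and the threshold k are all assumptions made for illustration.

```python
import pandas as pd

def remove_outliers_df(df: pd.DataFrame, k: float = 3.0) -> pd.DataFrame:
    """Hypothetical stand-in: mask values more than k standard
    deviations away from their column mean."""
    z = (df - df.mean()) / df.std()
    # Values failing the test become NaN; a constant column (std == 0)
    # gives NaN z-scores and would be masked entirely
    return df.where(z.abs() <= k)

def calculate_df_columns_mean(df: pd.DataFrame, k: float = 3.0) -> pd.Series:
    # DataFrame.mean skips NaN by default, so masked outliers are ignored
    return remove_outliers_df(df, k).mean()
```

Masking with NaN instead of dropping rows keeps every column the same length, so each column can lose a different set of values and the whole computation stays vectorized.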
Edit
I think the following should work:
def calculate_df_columns_mean(self, df):
    # np.mean works whether remove_outliers returns a list, array, or Series
    return df.apply(lambda x: np.mean(self.remove_outliers(x.tolist())))
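To make the apply-based approach concrete, here is a self-contained sketch written as plain functions, not methods. The remove_outliers helper below is hypothetical (an interquartile-range filter chosen for illustration); the asker's actual helper is not shown and may use a different rule.

```python
import numpy as np
import pandas as pd

def remove_outliers(values, k=1.5):
    """Hypothetical helper: keep values within k * IQR of the quartiles."""
    arr = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(arr, [25, 75])
    iqr = q3 - q1
    return arr[(arr >= q1 - k * iqr) & (arr <= q3 + k * iqr)]

def calculate_df_columns_mean(df):
    # One cleaned mean per column, returned as a Series indexed by column name
    return df.apply(lambda col: np.mean(remove_outliers(col.tolist())))

# Made-up data for demonstration
df = pd.DataFrame({"x": [1, 2, 3, 4, 100], "y": [10, 20, 30, 40, 50]})
means = calculate_df_columns_mean(df).to_dict()
```

Because the helper works on one column at a time, each column can lose a different number of values without affecting the others.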