Apply several functions to DataFrame columns at once

Nig*_*ker 2 python dataframe pandas

How can I apply several functions to a DataFrame?

I would like to do something like this:

features_df[features_columns].apply(lambda x: np.mean(x), lambda x: np.std(x), lambda x: np.skew(x))

Thanks

jez*_*ael 5

I think you need DataFrame.aggregate (pandas 0.20.0+) or DataFrame.apply:

features_df[features_columns].agg(lambda x: pd.Series([np.mean(x),np.std(x)]))

features_df[features_columns].apply(lambda x: pd.Series([np.mean(x),np.std(x)]))

Or pass a list of function names:

df = features_df[features_columns].agg(['mean', 'std', 'skew'])

df = features_df[features_columns].apply(['mean', 'std', 'skew'])

Sample:

import numpy as np
import pandas as pd

features_df = pd.DataFrame({'A':list('abcdef'),
                           'B':[4,5,4,5,5,4],
                           'C':[7,8,9,4,2,3],
                           'D':[1,3,5,7,1,0],
                           'E':[5,3,6,9,2,4],
                           'F':list('aaabbb')})

print (features_df)
   A  B  C  D  E  F
0  a  4  7  1  5  a
1  b  5  8  3  3  a
2  c  4  9  5  6  a
3  d  5  4  7  9  b
4  e  5  2  1  2  b
5  f  4  3  0  4  b

features_columns = ['B','C']

print (features_df[features_columns].agg(lambda x: pd.Series([np.mean(x),np.std(x)])))
     B         C
0  4.5  5.500000
1  0.5  2.629956

print (features_df[features_columns].apply(lambda x: pd.Series([np.mean(x),np.std(x)])))
     B         C
0  4.5  5.500000
1  0.5  2.629956

print (features_df[features_columns].agg(['mean', 'std', 'skew']))
             B         C
mean  4.500000  5.500000
std   0.547723  2.880972
skew  0.000000  0.000000

print (features_df[features_columns].apply(['mean', 'std', 'skew']))
             B         C
mean  4.500000  5.500000
std   0.547723  2.880972
skew  0.000000  0.000000

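The std function has a different default ddof in numpy (ddof=0, population standard deviation) and in pandas (ddof=1, sample standard deviation), so the outputs differ.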
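To make the ddof difference concrete, here is a minimal sketch using column B from the sample. The variable names are only illustrative; the point is that passing ddof explicitly makes the two libraries agree.

```python
import numpy as np
import pandas as pd

# Column B from the sample data above.
b = pd.Series([4, 5, 4, 5, 5, 4])

# Library defaults: NumPy divides by N (ddof=0), pandas by N - 1 (ddof=1).
numpy_default = np.std(b.to_numpy())
pandas_default = b.std()

# Setting ddof explicitly removes the discrepancy in either direction.
numpy_sample = np.std(b.to_numpy(), ddof=1)
pandas_population = b.std(ddof=0)

print(numpy_default, pandas_default)
print(numpy_sample, pandas_population)
```

Which convention is appropriate depends on whether the rows are treated as a full population or as a sample.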

np.skew raises:

AttributeError: module 'numpy' has no attribute 'skew'
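Since NumPy has no skew function, one option is to stay within pandas and call the reductions directly (scipy.stats.skew is another possibility, though its default is not bias-corrected the way pandas' skew is). The sketch below is an alternative to agg, not part of the original answer; it assumes you want NumPy-style ddof=0 for the standard deviation, and the name `summary` is only illustrative.

```python
import pandas as pd

features_df = pd.DataFrame({'B': [4, 5, 4, 5, 5, 4],
                            'C': [7, 8, 9, 4, 2, 3]})
features_columns = ['B', 'C']
cols = features_df[features_columns]

# Each reduction returns a Series indexed by column name; collecting them
# in a dict and transposing gives one row per statistic, like agg([...]).
summary = pd.DataFrame({
    'mean': cols.mean(),
    'std': cols.std(ddof=0),   # assumption: population std, as np.std would give
    'skew': cols.skew(),       # pandas' bias-corrected sample skewness
}).T

print(summary)
```

Calling the methods directly also lets each statistic take its own keyword arguments, which a list of string names in agg does not allow.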