Geo*_*nce 5 python conditional-statements dataframe pandas pandas-groupby
   Survived  SibSp  Parch
0         0      1      0
1         1      1      0
2         1      0      0
3         1      1      0
4         0      0      1
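Given the DataFrame above, is there an elegant way to groupby with a condition? I want to split the data into two groups based on the following conditions: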
(df['SibSp'] > 0) | (df['Parch'] > 0) = New Group - "Has Family"
(df['SibSp'] == 0) & (df['Parch'] == 0) = New Group - "No Family"
Then take the mean of each group, so that the final output looks like this:
            SurvivedMean
Has Family  Mean
No Family   Mean
Can this be done with groupby, or do I have to append a new column using the conditional statements above?
Thanks!
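A simple way to group is to use the sum of the two columns. Since both are non-negative counts, the sum is at least 1 whenever either of them is positive. groupby also accepts an arbitrary array, as long as its length matches the length of the DataFrame, so you don't need to add a new column.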
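import numpy as np
# Label each row by the sum of the two columns; the array is passed straight to groupby
family = np.where((df['SibSp'] + df['Parch']) >= 1, 'Has Family', 'No Family')
df.groupby(family)['Survived'].mean()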
Out:
Has Family 0.5
No Family 1.0
Name: Survived, dtype: float64
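For reference, here is a self-contained sketch of the same idea that uses the question's own condition as a boolean mask instead of the sum. The DataFrame is rebuilt from the sample rows in the question; the names `has_family`, `labels`, and `result` are only illustrative and do not come from the original posts.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 1, 0],
    "SibSp":    [1, 1, 0, 1, 0],
    "Parch":    [0, 0, 0, 0, 1],
})

# Boolean mask taken directly from the question's first condition.
has_family = (df["SibSp"] > 0) | (df["Parch"] > 0)

# Turn the mask into readable group labels (a plain NumPy array).
labels = np.where(has_family, "Has Family", "No Family")

# Group by the label array; df itself gets no new column.
result = df.groupby(labels)["Survived"].mean()
print(result)
```

Using the mask rather than the sum should also behave sensibly if a column ever held unexpected values, since it encodes the stated condition literally.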