ken*_*ait 19 python dataframe pandas
In pandas, given a DataFrame D:
+-----+--------+--------+--------+
| | 1 | 2 | 3 |
+-----+--------+--------+--------+
| 0 | apple | banana | banana |
| 1 | orange | orange | orange |
| 2 | banana | apple | orange |
| 3 | NaN | NaN | NaN |
| 4 | apple | apple | apple |
+-----+--------+--------+--------+
When there are three or more columns, how can I return only the rows whose columns all hold the same value, so that the result is:
+-----+--------+--------+--------+
| | 1 | 2 | 3 |
+-----+--------+--------+--------+
| 1 | orange | orange | orange |
| 4 | apple | apple | apple |
+-----+--------+--------+--------+
Note that it skips the row where all values are NaN.
If there were only two columns, I would normally do D[D[1]==D[2]], but I don't know how to generalize this to DataFrames with more than 2 columns.
DSM*_*DSM 13
My entry:
>>> df
0 1 2
0 apple banana banana
1 orange orange orange
2 banana apple orange
3 NaN NaN NaN
4 apple apple apple
[5 rows x 3 columns]
>>> df[df.apply(pd.Series.nunique, axis=1) == 1]
0 1 2
1 orange orange orange
4 apple apple apple
[2 rows x 3 columns]
This works because calling pd.Series.nunique on each row gives the number of distinct values in that row (NaN is ignored by default, so the all-NaN row counts 0):
>>> df.apply(pd.Series.nunique, axis=1)
0 2
1 1
2 3
3 0
4 1
dtype: int64
Note, however, that this keeps rows that look like [nan, nan, apple] or [nan, apple, apple]. That is usually what I want, but it may be the wrong answer for your use case.
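If rows containing any NaN should be excluded instead, one option is to combine the distinct-value count with a completeness check. The following is a minimal sketch, not part of the answer above: it rebuilds the sample frame by hand, adds one partially missing row purely for illustration, and uses `DataFrame.nunique(axis=1)` as a stand-in for the `apply` call.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question; the last row is a hypothetical
# addition to show how partially missing rows are treated.
df = pd.DataFrame(
    [
        ["apple", "banana", "banana"],
        ["orange", "orange", "orange"],
        ["banana", "apple", "orange"],
        [np.nan, np.nan, np.nan],
        ["apple", "apple", "apple"],
        [np.nan, "apple", "apple"],
    ]
)

# Number of distinct non-NaN values per row (NaN is ignored by default).
distinct = df.nunique(axis=1)

# Lenient: exactly one distinct value, NaN tolerated.
lenient = df[distinct == 1]

# Strict: exactly one distinct value and no NaN anywhere in the row.
strict = df[(distinct == 1) & df.notna().all(axis=1)]
```

Here `lenient` keeps the `[nan, apple, apple]` row while `strict` drops it; which one is right depends on how missing values should be interpreted.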
low*_*ech 12
Similar to Andy Hayden's answer, check whether min equals max (in which case all elements of the row are identical):
df[df.apply(lambda x: min(x) == max(x), axis=1)]
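One caveat: the min/max comparison needs every value in a row to be mutually comparable. On Python 3, a row that mixes strings with NaN (a float) is expected to raise a TypeError inside `min`, and an all-NaN row drops out only because NaN is not equal to itself. A hedged sketch of a variant that removes NaN from each row before comparing; the helper name `all_equal` and the extra last row are hypothetical, added for illustration:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question, plus one partially missing row
# (hypothetical) that would break a plain min(x) == max(x) on Python 3.
df = pd.DataFrame(
    [
        ["apple", "banana", "banana"],
        ["orange", "orange", "orange"],
        ["banana", "apple", "orange"],
        [np.nan, np.nan, np.nan],
        ["apple", "apple", "apple"],
        [np.nan, "apple", "apple"],
    ]
)


def all_equal(row):
    """Return True if the row has at least one non-NaN value and all of them match."""
    values = row.dropna()
    # An all-NaN row has nothing left to compare, so it is treated as no match.
    return len(values) > 0 and min(values) == max(values)


result = df[df.apply(all_equal, axis=1)]
```

Like the nunique approach, this variant tolerates NaN inside an otherwise uniform row.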
I would check whether each row is equal to its first element:
In [11]: df.eq(df[1], axis='index') # Note: funky broadcasting with df == df[1]
Out[11]:
1 2 3
0 True False False
1 True True True
2 True False False
3 True True True
4 True True True
[5 rows x 3 columns]
If everything in a row is True, then all elements of that row are the same:
In [12]: df.eq(df[1], axis='index').all(1)
Out[12]:
0 False
1 True
2 False
3 True
4 True
dtype: bool
Restrict the DataFrame to those rows, and optionally apply dropna:
In [13]: df[df.eq(df[1], axis='index').all(1)]
Out[13]:
1 2 3
1 orange orange orange
3 NaN NaN NaN
4 apple apple apple
[3 rows x 3 columns]
In [14]: df[df.eq(df[1], axis='index').all(1)].dropna()
Out[14]:
1 2 3
1 orange orange orange
4 apple apple apple
[2 rows x 3 columns]
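The session above shows the all-NaN row comparing equal to itself, which probably reflects the pandas version in use at the time. In recent pandas releases NaN is expected to compare unequal to NaN in `eq`, so row 3 would already be filtered out before `dropna`. A small sketch that gives the same final result under either behavior; selecting the first column with `iloc[:, 0]` is an assumption made here so the code does not depend on the column labels:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question, with column labels 1, 2, 3.
df = pd.DataFrame(
    [
        ["apple", "banana", "banana"],
        ["orange", "orange", "orange"],
        ["banana", "apple", "orange"],
        [np.nan, np.nan, np.nan],
        ["apple", "apple", "apple"],
    ],
    columns=[1, 2, 3],
)

# Compare every column against the first one, aligning on the row index.
first = df.iloc[:, 0]
same = df.eq(first, axis="index").all(axis=1)

# dropna() removes the all-NaN row in case the comparison kept it.
result = df[same].dropna()
```

Keeping the `dropna()` call is harmless when the comparison has already excluded the all-NaN row.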