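I have a pandas DataFrame that I want to filter on multiple conditions. The conditions are passed in as a list of variable length.
Here is the basic code for filtering the DataFrame: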
>>> import pandas as pd
>>> conditions = ["a", "b"]
>>> df1 = pd.DataFrame(["a", "a", "b", "b", "c"])
>>> df1
   0
0  a
1  a
2  b
3  b
4  c
>>> df2 = df1[(df1[0] == conditions[0]) | (df1[0] == conditions[1])]
>>> df2
   0
0  a
1  a
2  b
3  b
If I don't know in advance how many conditions will be passed in, is there a simple way to check all of them?
Thanks
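NumPy ufuncs such as np.logical_or have a reduce method, which applies the ufunc across a whole sequence, so it can OR together any number of boolean masks: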
import numpy as np
mask = np.logical_or.reduce([(df1[0] == cond) for cond in conditions])
df2 = df1[mask]
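For completeness, here is a minimal self-contained sketch of the same idea wrapped in a function. The helper name `filter_any` and its parameters are hypothetical, made up for illustration. It also guards against an empty list: `np.logical_or.reduce([])` returns a scalar `False` rather than a mask, which cannot be used to index the DataFrame. Since every condition here is a plain equality test on one column, pandas' `Series.isin` should give the same rows, so it is shown alongside as an alternative and not as a replacement for the `reduce` approach.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(["a", "a", "b", "b", "c"])


def filter_any(df, column, conditions):
    """Keep rows where `column` equals any value in `conditions`.

    Hypothetical helper, not part of pandas or NumPy.
    """
    if not conditions:
        # Assumption: with no conditions, no row matches, so return an
        # empty frame with the same columns.
        return df.iloc[0:0]
    # Build one boolean mask per condition, then OR them all together.
    mask = np.logical_or.reduce([df[column] == cond for cond in conditions])
    return df[mask]


result = filter_any(df1, 0, ["a", "b"])

# Alternative for pure equality checks: membership test with isin.
alt = df1[df1[0].isin(["a", "b"])]
```

The `reduce` version is the more general of the two: each entry in the list can be any boolean mask (for example `df[0] > 3` or a test on a different column), whereas `isin` only covers "equals one of these values" on a single column.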