Python Pandas Dataframe: take the max value smaller than a given value

6 python dataframe pandas

In Python, I have a pandas DataFrame. I want to filter on one value of column A.

I am looking for the row, where column A is the highest value that is smaller than '5' (so if column A does have values '1', '2', '4', '7', it should be '4'). Another condition exists, too.

The following statement does not work.

How should I change the maximum condition so that it works?

df_new = df[(df['some_other_column'] < XYZ) & max(df['A'] <= '5')]

cs9*_*s95 5

Use np.searchsorted:

df

   x
0  1
1  2
2  4
3  7

df.iloc[(np.searchsorted(df.x.values, 5) - 1).clip(0)]

   x
2  4
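One caveat worth spelling out: np.searchsorted only gives a meaningful position when the column is already sorted in ascending order, and .clip(0) returns the first row even when no value is below the threshold. The following is a minimal sketch of a wrapper that makes both points explicit. The helper name last_row_below and its inclusive flag are made up for illustration and are not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 4, 7]})


def last_row_below(frame, column, threshold, inclusive=False):
    """Return the row with the largest value below `threshold`, or None.

    Hypothetical helper. Assumes frame[column] is sorted ascending,
    which np.searchsorted requires.
    """
    values = frame[column].to_numpy()
    # side="left" finds values strictly below the threshold;
    # side="right" also accepts values equal to it.
    side = "right" if inclusive else "left"
    pos = np.searchsorted(values, threshold, side=side) - 1
    if pos < 0:
        # Nothing in the column is below the threshold.
        return None
    # The double brackets keep the result a one-row DataFrame.
    return frame.iloc[[pos]]


result = last_row_below(df, "x", 5)
print(result)
```

If the column is not sorted, sorting it first with df.sort_values is one option, though the sort adds its own cost.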

Timings

df = pd.DataFrame({'x' : np.arange(100000)})

%%timeit
x = df.x
g = x[x <= 12345].max()
df[x == g]

1000 loops, best of 3: 1.27 ms per loop

%timeit df.iloc[(np.searchsorted(df.x.values, 12345) - 1).clip(0)]
10000 loops, best of 3: 139 µs per loop

There is no comparison: searchsorted is much faster.
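When the column is not sorted, or when the second condition from the question has to be applied as well, the mask-based approach from the timings still works, just more slowly. Below is a minimal sketch, assuming column A is numeric rather than a string, that "smaller than 5" is meant strictly, and using xyz as a hypothetical stand-in for the XYZ value in the question. The sample data is invented for illustration.

```python
import pandas as pd

# Invented sample data; column A is deliberately unsorted.
df = pd.DataFrame({
    "A": [7, 1, 4, 2],
    "some_other_column": [10, 20, 30, 99],
})
xyz = 50  # hypothetical threshold standing in for XYZ

# First keep only rows that satisfy both conditions...
candidates = df[(df["some_other_column"] < xyz) & (df["A"] < 5)]

# ...then keep the row(s) where A equals the maximum of what is left.
# If no row qualifies, max() is NaN and the result is an empty DataFrame.
df_new = candidates[candidates["A"] == candidates["A"].max()]
print(df_new)
```

Taking the maximum after filtering matters here: the row with A equal to 2 is excluded by the other condition, and the row with A equal to 7 by the threshold, so the row with A equal to 4 is selected.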