Python Pandas DataFrame: take the max value smaller than a threshold

Question

In Python, I have a pandas DataFrame. I want to filter it on a value of column A.

I am looking for the row where column A holds the highest value that is smaller than '5' (so if column A contains the values '1', '2', '4', '7', the result should be '4'). There is also a second condition on another column.

The following statement does not work. How do I have to change the maximum condition so that it works?

df_new = df[(df['some_other_column'] < XYZ) & max(df['A'] <= '5')]

Answer 1

cs9*_*s95 5

Use np.searchsorted (this requires column x to be sorted in ascending order):

df

   x
0  1
1  2
2  4
3  7

df.iloc[[(np.searchsorted(df.x.values, 5) - 1).clip(0)]]

   x
2  4
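Two details of the one-liner above are easy to miss: np.searchsorted only gives a meaningful position on data sorted in ascending order, and clip(0) falls back to the first row even when no value is below the threshold. The following is a minimal sketch that makes both explicit. The helper name last_row_below is hypothetical (it is not part of the original answer), and returning an empty frame for the "nothing qualifies" case is an assumption about the desired behaviour.

```python
import numpy as np
import pandas as pd


def last_row_below(df, column, threshold):
    """Return a one-row DataFrame holding the largest `column` value that is
    strictly below `threshold`, or an empty DataFrame if there is none.

    Assumes df[column] is already sorted in ascending order.
    """
    values = df[column].to_numpy()
    # side="left" yields the position of the first element >= threshold,
    # so the element just before it is the largest one strictly below.
    pos = np.searchsorted(values, threshold, side="left") - 1
    if pos < 0:
        # No value is below the threshold: return an empty slice instead of
        # silently falling back to row 0.
        return df.iloc[0:0]
    return df.iloc[[pos]]


df = pd.DataFrame({"x": [1, 2, 4, 7]})
print(last_row_below(df, "x", 5))
print(last_row_below(df, "x", 1))
```

If the column is not sorted, sorting it first with df.sort_values costs more than the lookup itself, so a boolean mask may be the simpler choice in that situation.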

Timings

df = pd.DataFrame({'x': np.arange(100000)})

%%timeit
x = df.x
g = x[x <= 12345].max()
df[x == g]

1000 loops, best of 3: 1.27 ms per loop

%timeit df.iloc[(np.searchsorted(df.x.values, 12345) - 1).clip(0)]

10000 loops, best of 3: 139 µs per loop

There is no comparison: searchsorted is much faster (about 139 µs versus 1.27 ms per loop here).
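The question also asks for a second condition on another column, and its data may not be sorted. In the original statement, max(df['A'] <= '5') collapses the boolean Series into a single True/False, so it cannot serve as a row filter. A minimal sketch of one way to combine both conditions is shown below. It assumes column A is numeric (the question compares against the string '5'), and the sample data and the value of xyz are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    "A": [7, 1, 4, 2, 4],
    "some_other_column": [10, 20, 30, 40, 99],
})
xyz = 50  # hypothetical stand-in for XYZ in the question

# Keep rows that satisfy the other condition and have A strictly below 5.
mask = (df["some_other_column"] < xyz) & (df["A"] < 5)
candidates = df[mask]

if candidates.empty:
    # Nothing qualifies: keep the (empty) frame rather than raising.
    df_new = candidates
else:
    # idxmax returns the index label of the first row holding the largest A
    # among the candidates; the double brackets keep the result a DataFrame.
    df_new = df.loc[[candidates["A"].idxmax()]]

print(df_new)
```

This works on unsorted data at the cost of scanning the whole column, which matches the slower boolean-mask variant in the timings above.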

Asked:	7 years, 11 months ago
Viewed:	5107 times
Last active:	1 year, 10 months ago