我有一个pandas DataFrame df:
import pandas as pd
data = {"Name": ["AAAA", "BBBB"],
"C1": [25, 12],
"C2": [2, 1],
"C3": [1, 10]}
df = pd.DataFrame(data)
df.set_index("Name")
Run Code Online (Sandbox Code Playgroud)
打印时看起来像这样(供参考):
C1 C2 C3
Name
AAAA 25 2 1
BBBB 12 1 10
Run Code Online (Sandbox Code Playgroud)
我想选择哪些行C1,C2并且C3在0和之间有值20.
你能建议一种优雅的方式来选择这些行吗?
ken*_*nes 21
我认为下面应该这样做,但它的优雅是有争议的.
new_df = old_df[((old_df['C1'] > 0) & (old_df['C1'] < 20)) & ((old_df['C2'] > 0) & (old_df['C2'] < 20)) & ((old_df['C3'] > 0) & (old_df['C3'] < 20))]
Run Code Online (Sandbox Code Playgroud)
EdC*_*ica 13
更短的版本:
In [65]:
df[(df>=0)&(df<=20)].dropna()
Out[65]:
Name C1 C2 C3
1 BBBB 12 1 10
Run Code Online (Sandbox Code Playgroud)
小智 8
我喜欢使用df.query()来做这些事情
df.query('C1>=0 and C1<=20 and C2>=0 and C2<=20 and C3>=0 and C3<=20')
Run Code Online (Sandbox Code Playgroud)
df.query(
"0 < C1 < 20 and 0 < C2 < 20 and 0 < C3 < 20"
)
Run Code Online (Sandbox Code Playgroud)
或者
df.query("0 < @df < 20").dropna()
Run Code Online (Sandbox Code Playgroud)
使用@fooindf.query是指foo环境中的变量。