Selecting data in pandas using "OR"

Question

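I have a DataFrame of values and I would like to explore the rows that are outliers. I wrote a function below that can be called with groupby().apply(). It works for either the high values or the low values, but when I try to combine the two conditions I get an error. I am somehow messing up the boolean OR selection, but I can only find examples of selection criteria that use &. Any suggestions would be appreciated.

Zach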

from numpy import mean, std
from pandas import DataFrame

df = DataFrame({'a': [1, 1, 1, 2, 2, 2, 2, 2, 2, 2], 'b': [5, 5, 6, 9, 9, 9, 9, 9, 9, 20]})

#this works fine
def get_outliers(group):
    x = mean(group.b)
    y = std(group.b)
    top_cutoff =    x + 2*y
    bottom_cutoff = x - 2*y
    cutoffs = group[group.b > top_cutoff]
    return cutoffs

#this will trigger an error
def get_all_outliers(group):
    x = mean(group.b)
    y = std(group.b)
    top_cutoff =    x + 2*y
    bottom_cutoff = x -2*y
    cutoffs = group[(group.b > top_cutoff) or (group.b < bottom_cutoff)]
    return cutoffs

#works fine    
grouped1 = df.groupby(['a']).apply(get_outliers)
#triggers error
grouped2 = df.groupby(['a']).apply(get_all_outliers)

Answer 1

Bre*_*arn 7

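You need to use | instead of or. The and and or operators are special in Python: they always reduce each operand to a single True or False, so they do not interact well with libraries like numpy and pandas, which want to apply the operation elementwise across a collection. For these contexts, those libraries have redefined the "bitwise" operators & and | to mean "and" and "or".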
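
As a minimal sketch of that fix, assuming the data from the question: the function below combines the two conditions with |, and the try block shows that or raises a ValueError on a Series. The n_std parameter and the or_failed flag are hypothetical names added for illustration. It also uses the pandas Series methods mean() and std() rather than the numpy functions in the question; note that the pandas std() defaults to the sample standard deviation, so the cutoffs differ slightly from the numpy version. The groups are iterated and concatenated instead of going through groupby().apply(), purely to keep the sketch independent of how apply treats the grouping column.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2, 2, 2, 2, 2],
                   'b': [5, 5, 6, 9, 9, 9, 9, 9, 9, 20]})

def get_all_outliers(group, n_std=2):
    # n_std is a hypothetical parameter added for illustration.
    b = group['b']
    center, spread = b.mean(), b.std()
    top_cutoff = center + n_std * spread
    bottom_cutoff = center - n_std * spread
    # Each comparison yields a boolean Series; | combines them elementwise.
    # The parentheses are required because | binds tighter than > and <.
    return group[(b > top_cutoff) | (b < bottom_cutoff)]

# Run the function on each group and stitch the pieces back together.
outliers = pd.concat([get_all_outliers(g) for _, g in df.groupby('a')])

# Using `or` instead asks Python to reduce a whole Series to one bool,
# which pandas refuses to do.
try:
    (df['b'] > 10) or (df['b'] < 6)
    or_failed = False
except ValueError:
    or_failed = True
```

The same pattern applies to "and": write (cond1) & (cond2), keeping the parentheses around each comparison.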
