Pandas slicing/selecting with multiple conditions using an "or" statement

jto*_*rca 4 python python-3.x pandas

When I chain the conditions of a selection with "AND", the selection works fine. When I chain them with "OR", the selection raises an error.

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame([[1,4,3],[2,3,5],[4,5,6],[3,2,5]], 
...     columns=['a', 'b', 'c'])
>>> df
   a  b  c
0  1  4  3
1  2  3  5
2  4  5  6
3  3  2  5
>>> df.loc[(df.a != 1) & (df.b < 5)]
   a  b  c
1  2  3  5
3  3  2  5
>>> df.loc[(df.a != 1) or (df.b < 5)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pandas/core/generic.py", line 731, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I would expect it to return the whole DataFrame, since every row satisfies this condition.

Ste*_*nes 13

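The important thing to note is that `&` is not the same as `and`; they are different things, so the "or" counterpart of `&` is `|`, not `or`.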

Normally both `&` and `|` are bitwise logical operators, not the Python "logical" operators `and` and `or`.

In pandas these operators are overloaded to work element-wise on `Series`.

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: df = pd.DataFrame([[1,4,3],[2,3,5],[4,5,6],[3,2,5]], columns=['a', 'b',
   ...:  'c'])

In [4]: df
Out[4]:
   a  b  c
0  1  4  3
1  2  3  5
2  4  5  6
3  3  2  5

In [5]: df.loc[(df.a != 1) & (df.b < 5)]
Out[5]:
   a  b  c
1  2  3  5
3  3  2  5

In [6]: df.loc[(df.a != 1) | (df.b < 5)]
Out[6]:
   a  b  c
0  1  4  3
1  2  3  5
2  4  5  6
3  3  2  5
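To make the difference concrete, here is a minimal sketch that reuses the DataFrame from the question. The names `left`, `right`, `or_raised` and `mask` are only illustrative. It shows that `or` asks the left-hand `Series` for a single truth value, which pandas refuses to give, while `|` combines the two boolean masks row by row.

```python
import pandas as pd

df = pd.DataFrame([[1, 4, 3], [2, 3, 5], [4, 5, 6], [3, 2, 5]],
                  columns=['a', 'b', 'c'])

left = df.a != 1   # boolean Series, one value per row
right = df.b < 5   # boolean Series, one value per row

# `or` has to decide whether `left` as a whole is truthy,
# which pandas rejects with a ValueError.
try:
    left or right
    or_raised = False
except ValueError:
    or_raised = True

# `|` is overloaded by Series and combines the masks element-wise.
mask = left | right

print(or_raised)
print(mask.tolist())
print(df.loc[mask])
```

Each row satisfies at least one of the two conditions, so the mask keeps all four rows, which is the result the question expected.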

  • `loc` can be omitted here; this is plain [boolean indexing](http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing). Use `loc` when you also need to select columns, e.g. `df.loc[(df.a != 1) | (df.b < 5), 'a']` or `df.loc[(df.a != 1) | (df.b < 5), ['a', 'b']]` (5 upvotes)
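
As a small sketch of the point made in this comment, again using the DataFrame from the question (the variable names are illustrative), the same mask can be used with or without `loc`, and `loc` additionally accepts a column selection:

```python
import pandas as pd

df = pd.DataFrame([[1, 4, 3], [2, 3, 5], [4, 5, 6], [3, 2, 5]],
                  columns=['a', 'b', 'c'])

mask = (df.a != 1) | (df.b < 5)

rows_plain = df[mask]                # plain boolean indexing, no loc
rows_loc = df.loc[mask]              # same rows via loc
col_a = df.loc[mask, 'a']            # one column, returned as a Series
cols_ab = df.loc[mask, ['a', 'b']]   # several columns, returned as a DataFrame

print(rows_plain.equals(rows_loc))
print(col_a.tolist())
print(list(cols_ab.columns))
```

Passing a single label gives back a `Series`, while passing a list of labels gives back a `DataFrame`, even when the list has only one entry.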