ValueError: The truth value of a Series is ambiguous

Ank*_*wal 2 dataframe python-2.7 pandas

>>> df.head()
                         № Summer  Gold  Silver  Bronze  Total  № Winter  \
Afghanistan (AFG)              13     0       0       2      2         0
Algeria (ALG)                  12     5       2       8     15         3
Argentina (ARG)                23    18      24      28     70        18
Armenia (ARM)                   5     1       2       9     12         6
Australasia (ANZ) [ANZ]         2     3       4       5     12         0

                         Gold.1  Silver.1  Bronze.1  Total.1  № Games  Gold.2  \
Afghanistan (AFG)             0         0         0        0       13       0
Algeria (ALG)                 0         0         0        0       15       5
Argentina (ARG)               0         0         0        0       41      18
Armenia (ARM)                 0         0         0        0       11       1
Australasia (ANZ) [ANZ]       0         0         0        0        2       3

                         Silver.2  Bronze.2  Combined total
Afghanistan (AFG)               0         2               2
Algeria (ALG)                   2         8              15
Argentina (ARG)                24        28              70
Armenia (ARM)                   2         9              12
Australasia (ANZ) [ANZ]         4         5              12

I don't understand why I'm seeing this error:

>>> df['Gold'] > 0  | df['Gold.1'] > 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ankuragarwal/data_insight/env/lib/python2.7/site-packages/pandas/core/generic.py", line 917, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What is ambiguous here?


But this works:

>>> (df['Gold'] > 0)  | (df['Gold.1'] > 0)

Max*_*axU 5

Suppose we have the following DataFrame:

In [35]: df
Out[35]:
   a  b  c
0  9  0  1
1  7  7  4
2  1  8  9
3  6  7  5
4  1  4  6

The following command:

df.a > 5 | df.b > 5

Because | has higher precedence than > (as specified in the operator precedence table), the expression is parsed as:

df.a > (5 | df.b) > 5

As a chained comparison, this is in turn evaluated as:

df.a > (5 | df.b) and (5 | df.b) > 5
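The parse described above can be checked without any DataFrame by inspecting the syntax tree. The following is a minimal sketch using the standard-library ast module; it assumes Python 3.8 or later (the question itself used Python 2.7), and the variable names are only illustrative.

```python
import ast

# Parse the unparenthesized expression without evaluating it,
# so no DataFrame named df has to exist.
tree = ast.parse("df.a > 5 | df.b > 5", mode="eval")
node = tree.body

# The top-level node is a single chained comparison with two '>' operators.
is_chained = isinstance(node, ast.Compare) and len(node.ops) == 2

# The operand in the middle of the chain is the bitwise-or expression 5 | df.b.
middle = node.comparators[0]
middle_is_bitor = isinstance(middle, ast.BinOp) and isinstance(middle.op, ast.BitOr)

print(is_chained, middle_is_bitor)  # True True
```

Because the chain has two comparisons, Python joins them with an implicit and, which is where the truth-value test on a Series comes from.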

Step by step:

In [36]: x = (5 | df.b)

In [37]: x
Out[37]:
0     5
1     7
2    13
3     7
4     5
Name: b, dtype: int32

In [38]: df.a > x
Out[38]:
0     True
1    False
2    False
3    False
4    False
dtype: bool

In [39]: x > 5
Out[39]:
0    False
1     True
2     True
3     True
4    False
Name: b, dtype: bool

But the last operation fails:

In [40]: (df.a > x) and (x > 5)
---------------------------------------------------------------------------
...
skipped
...
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
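The failure comes from and itself: it needs a single True or False for its left operand, and a Series refuses to provide one. The sketch below reproduces this with the example frame from this answer, assuming Python 3 and a reasonably recent pandas; the names left, right, raised, and elementwise are illustrative only.

```python
import pandas as pd

# Same data as the example DataFrame in this answer.
df = pd.DataFrame({"a": [9, 7, 1, 6, 1],
                   "b": [0, 7, 8, 7, 4],
                   "c": [1, 4, 9, 5, 6]})

left = df.a > (5 | df.b)
right = (5 | df.b) > 5

# `and` asks the left operand for a single truth value, which a Series rejects.
try:
    left and right
    raised = False
except ValueError:
    raised = True

# The element-wise operator & combines the two boolean Series row by row.
elementwise = left & right

print(raised)                # True
print(elementwise.tolist())  # [False, False, False, False, False]
```

Note that the element-wise result is not what the original command was meant to compute; it only shows that the exception is raised by and, not by the comparisons.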

The error message above may lead inexperienced users to try something like this:

In [12]: (df.a > 5).all() | (df.b > 5).all()
Out[12]: False

In [13]: df[(df.a > 5).all() | (df.b > 5).all()]
...
skipped
...
KeyError: False
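The reason .all() does not help here is that it collapses the whole column into one value, so nothing is left to select rows with. A small sketch of the difference, assuming Python 3 and the same example frame (the names scalar and mask are illustrative):

```python
import pandas as pd

# Same data as the example DataFrame in this answer.
df = pd.DataFrame({"a": [9, 7, 1, 6, 1],
                   "b": [0, 7, 8, 7, 4],
                   "c": [1, 4, 9, 5, 6]})

# .all() reduces each comparison to one value, so the result is a single flag.
scalar = (df.a > 5).all() | (df.b > 5).all()

# Without the reduction there is one boolean per row, which is what indexing needs.
mask = (df.a > 5) | (df.b > 5)

print(bool(scalar))   # False
print(len(mask))      # 5
print(len(df[mask]))  # 4
```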

But in this case you only need to set the precedence explicitly with parentheses to get the expected result:

In [10]: (df.a > 5) | (df.b > 5)
Out[10]:
0     True
1     True
2     True
3     True
4    False
dtype: bool

In [11]: df[(df.a > 5) | (df.b > 5)]
Out[11]:
   a  b  c
0  9  0  1
1  7  7  4
2  1  8  9
3  6  7  5
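If the parentheses feel easy to forget, the same mask can also be built with method calls, which sidestep operator precedence altogether. This is only a sketch of an alternative, not part of the original answer; it assumes Python 3 and uses Series.gt and DataFrame.any, with illustrative variable names.

```python
import pandas as pd

# Same data as the example DataFrame in this answer.
df = pd.DataFrame({"a": [9, 7, 1, 6, 1],
                   "b": [0, 7, 8, 7, 4],
                   "c": [1, 4, 9, 5, 6]})

# Method-call comparisons bind before |, so no extra parentheses are needed.
mask_methods = df.a.gt(5) | df.b.gt(5)

# Equivalent: compare both columns at once and keep rows where any value exceeds 5.
mask_any = df[["a", "b"]].gt(5).any(axis=1)

print(mask_methods.tolist() == mask_any.tolist())  # True
print(df[mask_methods].index.tolist())             # [0, 1, 2, 3]
```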