Error when plotting a DataFrame containing NaN with Pandas 0.12.0 and Matplotlib 1.3.1 on Python 3.3.2

Nic*_*s G 6 python plot matplotlib nan pandas

First of all, this question is the same as this one.

The problem I am having is that when I try to plot a DataFrame that contains a numpy NaN in one of its cells, I get an error:

C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>>
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> data = [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]]
>>> df = pd.DataFrame(data, index=dates,
...                       columns=list('AB'))
>>>
>>> print(df.to_string())
                      A   B
2013-12-01 00:00:00   1   2
2013-12-01 01:00:00   4   5
2013-12-01 02:00:00   9 NaN
2013-12-01 03:00:00  16  17
2013-12-01 04:00:00  25  26
>>> df.plot()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1636, in plot_frame
    plot_obj.generate()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 856, in generate
    self._make_plot()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1240, in _make_plot
    self._make_ts_plot(data, **self.kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1321, in _make_ts_plot
    _plot(data[col], i, ax, label, style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1295, in _plot
    style=style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tseries\plotting.py", line 77, in tsplot
    lines = plotf(ax, *args, **kwargs)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 4139, in plot
    for line in self._get_lines(*args, **kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 319, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 297, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 216, in _xy_from_xy
    by = self.axes.yaxis.update_units(y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axis.py", line 1337, in update_units
    converter = munits.registry.get_converter(data)
  File "C:\Python33x86\lib\site-packages\matplotlib\units.py", line 137, in get_converter
    xravel = x.ravel()
  File "C:\Python33x86\lib\site-packages\numpy\ma\core.py", line 3969, in ravel
    r._mask = ndarray.ravel(self._mask).reshape(r.shape)
  File "C:\Python33x86\lib\site-packages\pandas\core\series.py", line 981, in reshape
    return ndarray.reshape(self, newshape, order)
TypeError: an integer is required

If I replace np.nan with a number, e.g. "2.3", the code above works.

Plotting the two columns as separate Series does not work either (it fails when I add the Series containing the NaN to the plot):

C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>>
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> data = [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]]
>>> df = pd.DataFrame(data, index=dates,
...                       columns=list('AB'))
>>>
>>> print(df.to_string())
                      A   B
2013-12-01 00:00:00   1   2
2013-12-01 01:00:00   4   5
2013-12-01 02:00:00   9 NaN
2013-12-01 03:00:00  16  17
2013-12-01 04:00:00  25  26
>>> df['A'].plot(label='This is A', style='k')
<matplotlib.axes.AxesSubplot object at 0x02ACFF90>
>>> df['B'].plot(label='This is B', style='g')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1730, in plot_series
    plot_obj.generate()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 856, in generate
    self._make_plot()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1240, in _make_plot
    self._make_ts_plot(data, **self.kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1311, in _make_ts_plot
    _plot(data, 0, ax, label, self.style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1295, in _plot
    style=style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tseries\plotting.py", line 77, in tsplot
    lines = plotf(ax, *args, **kwargs)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 4139, in plot
    for line in self._get_lines(*args, **kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 319, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 297, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 216, in _xy_from_xy
    by = self.axes.yaxis.update_units(y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axis.py", line 1337, in update_units
    converter = munits.registry.get_converter(data)
  File "C:\Python33x86\lib\site-packages\matplotlib\units.py", line 137, in get_converter
    xravel = x.ravel()
  File "C:\Python33x86\lib\site-packages\numpy\ma\core.py", line 3969, in ravel
    r._mask = ndarray.ravel(self._mask).reshape(r.shape)
  File "C:\Python33x86\lib\site-packages\pandas\core\series.py", line 981, in reshape
    return ndarray.reshape(self, newshape, order)
TypeError: an integer is required

However, if I call Matplotlib's pyplot plot() directly instead of using Pandas' plot() function, it works:

C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> plt.plot(dates, [1, 4, 9, 16, 25], 'k', dates, [2, 5, np.NAN, 17, 26], 'g')
[<matplotlib.lines.Line2D object at 0x03E98650>, <matplotlib.lines.Line2D object at 0x040929B0>]
>>> plt.show()
>>>

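So it seems I have a workaround, but since I plot large DataFrames I would prefer to use Pandas' plot() method, which is more convenient. I tried to follow the stack trace, but after a while it got complicated (I am not familiar with the Pandas, NumPy, and Matplotlib source code). Am I doing something wrong, or is this a possible bug in Pandas' plot()?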
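As a stopgap along the lines of the pyplot call above, the columns of a DataFrame can be handed to Matplotlib one at a time as plain NumPy arrays. This is only a sketch: `plot_columns` is a hypothetical helper name, not part of either library, and it has not been run against the Pandas 0.12.0 / Matplotlib 1.3.1 combination. It assumes that passing `ndarray` values rather than `Series` objects sidesteps the `Series.reshape` call at the bottom of the traceback.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_columns(df, ax=None):
    """Hypothetical helper: draw each column of df with Matplotlib directly."""
    if ax is None:
        _, ax = plt.subplots()
    index = df.index.to_pydatetime()  # plain datetime objects for the x axis
    for col in df.columns:
        # np.asarray(...) hands Matplotlib an ndarray, not a pandas Series
        ax.plot(index, np.asarray(df[col], dtype=float), label=str(col))
    ax.legend(loc="best")
    return ax


# Same data as in the question; a one-hour Timedelta stands in for freq='H'
dates = pd.date_range("2013-12-01", periods=5, freq=pd.Timedelta(hours=1))
data = [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]]
df = pd.DataFrame(data, index=dates, columns=list("AB"))

ax = plot_columns(df)
# Call plt.show() (interactive backend) or ax.figure.savefig(...) to view it
```

With this approach the NaN in column B should appear as a gap in the line rather than raising an exception, which matches what the direct pyplot call does.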

Thanks for your help!

I tried this on both Windows x86 and Linux AMD64 with these versions and got the same result:

  • Python 3.3.2
  • Pandas 0.12.0
  • Matplotlib 1.3.1
  • Numpy 1.7.1
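
To see whether an environment matches the versions listed above, the installed version strings can be compared against the reported pairing. This is a minimal sketch: `parse_version` and `is_reported_combination` are hypothetical helper names, and the check only recognises the one combination reported in this thread (Pandas 0.12.x with Matplotlib 1.3.1). It says nothing about whether other combinations are affected.

```python
import re


def parse_version(text):
    """Return the first three numeric components of a version string as a tuple."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def is_reported_combination(pandas_version, matplotlib_version):
    """Hypothetical check: True only for the pairing reported in this thread."""
    return (parse_version(pandas_version)[:2] == (0, 12)
            and parse_version(matplotlib_version) == (1, 3, 1))


if __name__ == "__main__":
    # The imports live here so the helpers above work without the libraries
    import matplotlib
    import numpy
    import pandas

    print("pandas", pandas.__version__)
    print("matplotlib", matplotlib.__version__)
    print("numpy", numpy.__version__)
    print("reported combination:",
          is_reported_combination(pandas.__version__, matplotlib.__version__))
```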

alk*_*lko 2

This appears to be an integration bug between matplotlib 1.3.1 and pandas 0.12.

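The workaround is to downgrade to matplotlib 1.3.0. (Note, however, that this version of matplotlib contains a bug on systems that have fonts with non-ASCII font names, so you may have to choose which problem you would rather live with!) This downgrade will trigger a downgrade to numpy 1.7.1, so you should upgrade to numpy 1.8.0 (again). The bug should be fixed in the upcoming pandas 0.13. However, pandas 0.13 may break some existing code (because pandas.Series is no longer a subclass of numpy.ndarray), so, at least in the short term, some hard choices may be needed.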
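
The point about pandas.Series no longer subclassing numpy.ndarray can be seen with a quick check. This sketch assumes pandas 0.13 or later. On 0.12 the first `isinstance` call would be expected to report the opposite. Code that relied on the subclass relationship can usually ask for the underlying array explicitly.

```python
import numpy as np
import pandas as pd

s = pd.Series([2.0, 5.0, np.nan, 17.0, 26.0])  # column B from the question

# On pandas 0.13 and later a Series wraps its data instead of being an ndarray
print(isinstance(s, np.ndarray))

# Code that needs a real ndarray can convert explicitly
arr = np.asarray(s)
print(isinstance(arr, np.ndarray), arr.shape)
```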

Just checked: the code works fine with matplotlib 1.3.0.

>>> import matplotlib
>>> matplotlib.__version__
'1.3.0'
>>> df.plot()
<matplotlib.axes.AxesSubplot object at 0x04E8B4F0>
>>> plt.show(_)