Shortening large stack traces when using libraries

Question

Lon*_*Rob 6 python stack-trace python-3.x

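I often work with large libraries such as pandas or matplotlib.

This means exceptions usually produce long stack traces.

Since the error is rarely in the library and usually in my own code, I don't need to see the library's internals in most cases.

A couple of common examples:

Pandas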

>>> import pandas as pd
>>> df = pd.DataFrame(dict(a=[1,2,3]))
>>> df['b'] # Hint: there _is_ no 'b'

Here I try to access an unknown key. This simple error produces a stack trace of 28 lines:

Traceback (most recent call last):
  File "an_arbitrary_python\lib\site-packages\pandas\core\indexes\base.py", line 2393, in get_loc
    return self._engine.get_loc(key)
  File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5239)
  File "pandas\_libs\index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5085)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1207, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20405)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1215, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20359)
KeyError: 'b'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "an_arbitrary_python\lib\site-packages\pandas\core\frame.py", line 2062, in __getitem__
    return self._getitem_column(key)
  File "an_arbitrary_python\lib\site-packages\pandas\core\frame.py", line 2069, in _getitem_column
    return self._get_item_cache(key)
  File "an_arbitrary_python\lib\site-packages\pandas\core\generic.py", line 1534, in _get_item_cache
    values = self._data.get(item)
  File "an_arbitrary_python\lib\site-packages\pandas\core\internals.py", line 3590, in get
    loc = self.items.get_loc(item)
  File "an_arbitrary_python\lib\site-packages\pandas\core\indexes\base.py", line 2395, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5239)
  File "pandas\_libs\index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5085)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1207, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20405)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1215, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20359)
KeyError: 'b'

Knowing that I ended up in hashtable_class_helper.pxi is of almost no help to me. I need to know where in my own code I messed up.

Matplotlib

>>> import matplotlib.pyplot as plt
>>> import matplotlib.cm as cm
>>> def foo():
...     plt.plot([1,2,3], cbap=cm.Blues) # cbap is a typo for cmap
...
>>> def bar():
...     foo()
...
>>> bar()

This time there is a typo in my keyword argument, but I still have to look at a 25-line stack trace:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in bar
  File "<stdin>", line 2, in foo
  File "an_arbitrary_python\lib\site-packages\matplotlib\pyplot.py", line 3317, in plot
    ret = ax.plot(*args, **kwargs)
  File "an_arbitrary_python\lib\site-packages\matplotlib\__init__.py", line 1897, in inner
    return func(ax, *args, **kwargs)
  File "an_arbitrary_python\lib\site-packages\matplotlib\axes\_axes.py", line 1406, in plot
    for line in self._get_lines(*args, **kwargs):
  File "an_arbitrary_python\lib\site-packages\matplotlib\axes\_base.py", line 407, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "an_arbitrary_python\lib\site-packages\matplotlib\axes\_base.py", line 395, in _plot_args
    seg = func(x[:, j % ncx], y[:, j % ncy], kw, kwargs)
  File "an_arbitrary_python\lib\site-packages\matplotlib\axes\_base.py", line 302, in _makeline
    seg = mlines.Line2D(x, y, **kw)
  File "an_arbitrary_python\lib\site-packages\matplotlib\lines.py", line 431, in __init__
    self.update(kwargs)
  File "an_arbitrary_python\lib\site-packages\matplotlib\artist.py", line 885, in update
    for k, v in props.items()]
  File "an_arbitrary_python\lib\site-packages\matplotlib\artist.py", line 885, in <listcomp>
    for k, v in props.items()]
  File "an_arbitrary_python\lib\site-packages\matplotlib\artist.py", line 878, in _update_property
    raise AttributeError('Unknown property %s' % k)
AttributeError: Unknown property cbap

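Here I learn that I ended up at a line in artist.py that raises an AttributeError, and directly below it I see that an AttributeError was indeed raised. In terms of information, that adds very little.

In these trivial interactive examples you might just say "look at the top of the stack trace, not the bottom", but my silly typo usually happens inside a function, so the line I am interested in sits somewhere in the middle of these cluttered stack traces.

Is there any way to make these stack traces less verbose and help me find the source of the problem, which is always in my own code rather than in the library I happen to be using?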

Answer 1

Jon*_*tts 4

You can use the traceback module to get finer control over how exceptions are printed. For example:

import pandas as pd
import traceback

try:
    df = pd.DataFrame(dict(a=[1,2,3]))
    df['b']

except Exception:
    # Print only the outermost frame of the traceback
    traceback.print_exc(limit=1)
    raise SystemExit(1)

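This triggers the standard exception-printing mechanism but shows only the first frame of the stack trace (which is the frame you care about in your example). For me this produces: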

Traceback (most recent call last):
  File "t.py", line 6, in <module>
    df['b']
KeyError: 'b'

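On Python 3 the pandas error in the question is a chained exception, and `traceback.print_exc` prints every exception in the chain by default, so `limit=1` alone may still show more than one block. A minimal sketch of how that could be handled, assuming a stand-in function `library_lookup` (hypothetical, used here in place of a real library call) and the `chain=False` argument of `print_exc`:

```python
import io
import traceback


def library_lookup(key):
    # Hypothetical stand-in for a library call that raises a chained exception.
    try:
        return {}[key]
    except KeyError as err:
        raise KeyError(key) from err


def short_trace():
    """Return the shortened traceback text for a failing library call."""
    buffer = io.StringIO()
    try:
        library_lookup("b")
    except KeyError:
        # limit=1 keeps only the outermost frame (the calling code);
        # chain=False leaves out the chained "direct cause" exception.
        traceback.print_exc(limit=1, chain=False, file=buffer)
    return buffer.getvalue()


print(short_trace())
```

Writing to a buffer is only there so the result can be inspected; dropping `file=buffer` sends the same text to stderr.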

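Obviously you lose context, which matters a lot when debugging your own code. If we want to get fancier, we can try to devise a test for how far the traceback should go. For example: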

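def find_depth(tb, continue_test):
    """Count the leading traceback frames whose filename passes continue_test."""
    depth = 0

    while tb is not None:
        filename = tb.tb_frame.f_code.co_filename

        # Stop at the first frame that fails the test we're given
        if not continue_test(filename):
            return depth

        tb = tb.tb_next
        depth += 1

    # Every frame passed the test, so keep them all
    return depth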

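To see how such a depth test behaves without installing anything, here is a small self-contained sketch. The filenames `fake_lib/core.py` and `my_module/app.py` are made up for illustration, and the snippets are compiled under those names only so that the traceback contains one "library" frame:

```python
import sys
import traceback


def find_depth(tb, continue_test):
    """Count leading traceback frames whose filename passes continue_test."""
    depth = 0
    while tb is not None:
        if not continue_test(tb.tb_frame.f_code.co_filename):
            break
        tb = tb.tb_next
        depth += 1
    return depth


# Hypothetical filenames: compile tiny snippets under made-up paths so the
# traceback contains both "own" frames and a "library" frame.
namespace = {}
exec(compile("def lib_call():\n    raise KeyError('b')\n",
             "fake_lib/core.py", "exec"), namespace)
exec(compile("def my_func():\n    lib_call()\n",
             "my_module/app.py", "exec"), namespace)


def demo():
    """Return (depth, shortened traceback text) for the simulated failure."""
    try:
        namespace["my_func"]()
    except KeyError:
        depth = find_depth(
            sys.exc_info()[2],
            # Keep frames until the first one from the simulated library
            lambda filename: not filename.startswith("fake_lib"),
        )
        return depth, traceback.format_exc(limit=depth)


depth, text = demo()
print(depth)
print(text)
```

Note that `co_filename` holds the path the code was loaded from, which is often absolute, so a prefix test may need adapting to your layout, for example by comparing against your project directory.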

I don't know how you organize and run your code, but maybe you could do something like this:

import pandas as pd
import traceback
import sys

def find_depth(tb, continue_test):
    # Same function as above
    depth = 0
    while tb is not None:
        if not continue_test(tb.tb_frame.f_code.co_filename):
            break
        tb = tb.tb_next
        depth += 1
    return depth

try:
    df = pd.DataFrame(dict(a=[1, 2, 3]))
    df['b']

except Exception:
    traceback.print_exc(limit=find_depth(
        sys.exc_info()[2],
        # The test for which frames we should include
        lambda filename: filename.startswith('my_module')
    ))
    sys.exit(1)

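If wrapping every call in try/except is impractical, the same filtering idea could be applied globally through `sys.excepthook`. The following is only a sketch under one assumption: that library code can be recognized by `site-packages` or `dist-packages` appearing in the file path, as in the tracebacks from the question. The names `LIBRARY_MARKERS`, `is_library_frame`, `format_own_frames` and `short_excepthook` are invented for this example.

```python
import sys
import traceback

# Assumption: library code lives under a path containing one of these parts.
LIBRARY_MARKERS = ("site-packages", "dist-packages")


def is_library_frame(filename):
    """Guess whether a frame's file belongs to an installed library."""
    normalized = filename.replace("\\", "/")
    return any(marker in normalized for marker in LIBRARY_MARKERS)


def format_own_frames(exc_type, exc_value, tb):
    """Format the exception, keeping only frames outside library paths."""
    frames = [
        frame for frame in traceback.extract_tb(tb)
        if not is_library_frame(frame.filename)
    ]
    lines = ["Traceback (most recent call last, library frames hidden):\n"]
    lines += traceback.format_list(frames)
    lines += traceback.format_exception_only(exc_type, exc_value)
    return "".join(lines)


def short_excepthook(exc_type, exc_value, tb):
    """Replacement for sys.excepthook that prints the filtered traceback."""
    sys.stderr.write(format_own_frames(exc_type, exc_value, tb))
```

To try it in a script, assign `sys.excepthook = short_excepthook` near the top. Unlike `limit`, this keeps every frame from your own code, wherever it sits in the stack. Shells such as IPython handle exceptions their own way, so the hook will probably not take effect there.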

Archived:	8 years, 6 months ago
Views:	206
Last activity:	8 years, 6 months ago