Pandas: Why does DataFrame.apply(f, axis=1) call f when the DataFrame is empty?

Question

Why does Pandas's DataFrame.apply method call the function being applied when the DataFrame is empty?

For example:

>>> import pandas as pd
>>> df = pd.DataFrame({"foo": []})
>>> df
Empty DataFrame
Columns: [foo]
Index: []
>>> x = []
>>> df.apply(x.append, axis=1)
Series([], dtype: float64)
>>> x
[Series([], dtype: float64)] # <<< why was the apply callback called with an empty row?

Answer 1

Dav*_*ver 4

Digging into the Pandas source code, this looks like the culprit:

if not all(self.shape):
    # How to determine this better?
    is_reduction = False
    try:
        is_reduction = not isinstance(f(_EMPTY_SERIES), Series)
    except Exception:
        pass

    if is_reduction:
        return Series(NA, index=self._get_agg_axis(axis))
    else:
        return self.copy()

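It looks like Pandas calls the function once with an empty Series (_EMPTY_SERIES) to guess whether the result should be a Series or a DataFrame. The call in the question therefore comes from this probe, not from any actual row.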
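To make the probe concrete, here is a minimal sketch that isolates just the type check from the excerpt above. The function name `probe_is_reduction` is hypothetical, and `pd.Series([], dtype="float64")` is only an assumed stand-in for the internal `_EMPTY_SERIES`; this is not the library's actual code path.

```python
import pandas as pd


def probe_is_reduction(f):
    # Mirrors the excerpt: call f once on an empty Series and inspect the return type.
    # The empty float64 Series is an assumed stand-in for the internal _EMPTY_SERIES.
    try:
        return not isinstance(f(pd.Series([], dtype="float64")), pd.Series)
    except Exception:
        # Any failure inside f is swallowed, so the probe reports "not a reduction".
        return False


seen = []
append_is_reduction = probe_is_reduction(seen.append)    # list.append returns None
identity_is_reduction = probe_is_reduction(lambda s: s)  # returns a Series
print(append_is_reduction, identity_is_reduction, len(seen))
```

Because `list.append` returns None rather than a Series, the probe classifies it as a reduction, and the probe call itself is what leaves one empty Series in `seen`. Any callback with side effects will show the same behavior.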

I guess a patch is in order.

Edit: this issue has since been patched. The behavior is now documented, and a reduce option lets you avoid it: http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.apply.html
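If relying on a keyword argument is not practical (which options `apply` accepts varies between pandas releases), a caller-side guard is one alternative. The sketch below is an assumption, not something from the answer or the pandas docs: the helper name `apply_rows_skip_empty` is hypothetical, and it simply avoids handing the callback to `apply` when the frame is empty.

```python
import pandas as pd


def apply_rows_skip_empty(df, f):
    # Hypothetical helper: skip apply() entirely when either axis has length 0,
    # so f is never called by the empty-frame probe.
    if df.empty:
        return pd.Series(index=df.index, dtype="float64")
    return df.apply(f, axis=1)


x = []
empty_result = apply_rows_skip_empty(pd.DataFrame({"foo": []}), x.append)
doubled = apply_rows_skip_empty(pd.DataFrame({"foo": [1, 2]}), lambda row: row["foo"] * 2)
print(x, len(empty_result), list(doubled))
```

With the guard in place, `x` stays empty for the empty frame, while non-empty frames go through `apply` as usual. The float64 dtype of the placeholder result is a choice made here to match the output shown in the question.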
