Get the mean of the last N same weekdays for a pandas DataFrame

Question

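Suppose my data is daily counts, with a DatetimeIndex as its index. Is it possible to get the mean over the past n occurrences of the same weekday? For example, if the date is Sunday, August 15, I would like the mean of the counts on (Sunday, August 8; Sunday, August 1; ...).

I started using pandas yesterday, so here is my brute-force approach.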

# df is a DataFrame with a daily DatetimeIndex (assumes one row per day, no gaps)
# Brute force: mean over the last n same weekdays (lnwd = last n weekdays)
def lnwd(n=1):
    total = df.shift(7)  # values from the same weekday one week earlier
    tmp = total
    for _ in range(n - 1):
        tmp = tmp.shift(7)  # step back one more week
        total = total + tmp
    return total / n  # average over the n weeks

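There must be a one-liner, right? Is there a way to use apply() (without passing a function that contains a for loop, since n is variable) or groupby? For example, the mean of all the data for each weekday can be found with: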

df.groupby(lambda x: x.dayofweek).mean() # mean of each MTWHFSS

Answer 1

jor*_*ris 5

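I think you are looking for a rolling apply (in this case, a rolling mean)? See the docs: http://pandas.pydata.org/pandas-docs/stable/computation.html#moving-rolling-statistics-moments. But it then has to be applied to each weekday separately, which you can achieve by combining rolling_mean with grouping by weekday using groupby.

This should give something like the following (with a Series s):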

s.groupby(s.index.weekday).transform(lambda x: pd.rolling_mean(x, window=n))
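
For readers on a recent pandas version, where the top-level pd.rolling_mean function no longer exists, the same idea can be sketched with the .rolling() method. This is only a minimal sketch under stated assumptions: the sample series, the helper name same_weekday_rolling_mean, and its include_current flag are hypothetical and not part of the original answer. The flag is there because the one-liner above includes the current day in each window, whereas the brute-force code in the question averages only the earlier weeks.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 36 consecutive daily counts,
# from Sunday 2010-07-11 through Sunday 2010-08-15.
idx = pd.date_range("2010-07-11", periods=36, freq="D")
s = pd.Series(np.arange(36, dtype=float), index=idx)


def same_weekday_rolling_mean(series, n, include_current=True):
    """Rolling mean over n observations that fall on the same weekday."""
    def roll(x):
        if not include_current:
            # Within one weekday group, shift(1) moves each value to the
            # next occurrence of that weekday, so windows end one week earlier.
            x = x.shift(1)
        return x.rolling(window=n).mean()

    # Group by weekday (0 = Monday ... 6 = Sunday) and roll within each group.
    return series.groupby(series.index.weekday).transform(roll)


# Window includes the current day, like the one-liner above.
inclusive = same_weekday_rolling_mean(s, n=2)

# Window covers only earlier weeks, like the question's brute-force lnwd(2).
prior = same_weekday_rolling_mean(s, n=2, include_current=False)

# For Sunday 2010-08-15 this averages the counts of Aug 8 and Aug 1.
print(prior.loc[pd.Timestamp("2010-08-15")])
```

Shifting inside each weekday group, rather than shifting the whole series by 7 rows, should keep working when some dates are missing, although a window then spans the last n available observations of that weekday rather than exactly n calendar weeks.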
