Rolling and cumulative standard deviation in a Python DataFrame

Question

Roy*_*Roy 5 python dataframe standard-deviation pandas

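Is there a vectorized operation for computing the cumulative and rolling standard deviation (SD) of a Python DataFrame?

For example, I want to add a column 'c' that holds the cumulative SD of column 'a': at index 0 it shows NaN because there is only 1 data point, at index 1 it computes the SD from 2 data points, and so on.

The same question applies to the rolling SD. Is there an efficient way to compute it without iterating via df.itertuples()?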

import numpy as np
import pandas as pd

def main():
    np.random.seed(123)
    df = pd.DataFrame(np.random.randn(10, 2), columns=['a', 'b'])
    print(df)

if __name__ == '__main__':
    main()

Answer 1

Sco*_*ton 9

For a cumulative SD based on column 'a', use rolling with a window size equal to the length of the DataFrame and min_periods=2:

df['c'] = df['a'].rolling(len(df), min_periods=2).std()

Output:

          a         b         c
0 -1.085631  0.997345       NaN
1  0.282978 -1.506295  0.967753
2 -0.578600  1.651437  0.691916
3 -2.426679 -0.428913  1.133892
4  1.265936 -0.866740  1.395750
5 -0.678886 -0.094709  1.250335
6  1.491390 -0.638902  1.374933
7 -0.443982 -0.434351  1.274843
8  2.205930  2.186786  1.450563
9  1.004054  0.386186  1.403721
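
As a sanity check, here is a minimal sketch that compares the full-window rolling result against an explicit loop. It assumes the same seed and frame as in the question, and that the sample SD (ddof=1, the pandas default for std) is what is wanted; the names cum_sd and loop_sd are only illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame(np.random.randn(10, 2), columns=['a', 'b'])

# Vectorized cumulative SD: the window spans the whole frame,
# and min_periods=2 lets it start producing values at index 1.
cum_sd = df['a'].rolling(len(df), min_periods=2).std()

# Reference values from a plain loop, using the sample SD (ddof=1).
a = df['a'].to_numpy()
loop_sd = [np.nan] + [np.std(a[:i + 1], ddof=1) for i in range(1, len(a))]

# True if both approaches agree (NaN at index 0 is treated as equal).
print(np.allclose(cum_sd, loop_sd, equal_nan=True))
```

The loop is there only for verification; the rolling call is the part that avoids iterating over rows.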


For a rolling SD computed over two values at a time:

df['c'] = df['a'].rolling(2).std()

Output:

          a         b         c
0 -1.085631  0.997345       NaN
1  0.282978 -1.506295  0.967753
2 -0.578600  1.651437  0.609228
3 -2.426679 -0.428913  1.306789
4  1.265936 -0.866740  2.611073
5 -0.678886 -0.094709  1.375197
6  1.491390 -0.638902  1.534617
7 -0.443982 -0.434351  1.368514
8  2.205930  2.186786  1.873771
9  1.004054  0.386186  0.849855
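
One way to convince yourself of these numbers: for a window of exactly two points, the sample SD reduces to |x1 - x2| / sqrt(2). The sketch below checks that identity against rolling(2).std(), assuming the same seed and frame as in the question; roll_sd and diff_sd are illustrative names.

```python
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame(np.random.randn(10, 2), columns=['a', 'b'])

# Rolling sample SD over a window of two consecutive rows.
roll_sd = df['a'].rolling(2).std()

# For two points the sample SD (ddof=1) equals |difference| / sqrt(2).
diff_sd = df['a'].diff().abs() / np.sqrt(2)

# True if both agree (the leading NaN is treated as equal).
print(np.allclose(roll_sd, diff_sd, equal_nan=True))
```

Larger windows work the same way, e.g. df['a'].rolling(3).std(), although the shortcut formula above only holds for a window of two.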


Answer 2

Tom*_*dor 7

I think that if by rolling you mean cumulative, the correct term in pandas is expanding:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.expanding.html#pandas.DataFrame.expanding

It also accepts a min_periods argument.

df['c'] = df['a'].expanding(2).std()

The rolling case was covered by Scott Boston; unsurprisingly, pandas calls it rolling.

The advantage of expanding over rolling(len(df), ...) is that you do not need to know len in advance. This is very useful, for example, on a groupby DataFrame.
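
To illustrate the groupby case, here is a minimal sketch using a small made-up frame (the column 'g' and its values are assumptions for illustration, not data from the question). Each group gets its own cumulative SD without its length being known up front.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two groups of different lengths.
df = pd.DataFrame({
    'g': ['x', 'x', 'x', 'y', 'y', 'y', 'y'],
    'a': [1.0, 2.0, 4.0, 10.0, 10.0, 13.0, 13.0],
})

# Expanding SD within each group. The result is indexed by
# (group key, original row label), so drop the group level
# to align it back onto the original frame.
df['c'] = (
    df.groupby('g')['a']
      .expanding(min_periods=2)
      .std()
      .reset_index(level=0, drop=True)
)

print(df)
```

With rolling you would have to pass a window at least as long as the largest group; expanding sidesteps that.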
