Pandas dot product with a MultiIndex

Question


FLa*_*Lab 6 python numpy financial pandas

My problem is a common one in finance.

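Given a weight vector w (1xN) and the asset covariance matrix Q (NxN), the portfolio variance can be computed with the quadratic form w' * Q * w, where * denotes the dot product.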
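For a single date, the quadratic form can be sketched as follows. The three-asset weights and covariance values are made-up numbers used only for illustration.

```python
import numpy as np

# Hypothetical 3-asset example; the numbers are invented for illustration.
w = np.array([0.5, 0.3, 0.2])            # weights, shape (N,)
Q = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])       # covariance matrix, shape (N, N)

portfolio_variance = w @ Q @ w           # quadratic form w' Q w
portfolio_volatility = np.sqrt(portfolio_variance)
```

With a 1-D `w`, NumPy's `@` treats the vector as a row on the left and a column on the right, so no explicit transpose is needed.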

I would like to know the best way to do this when I have a history of weights W (T x N) and a 3D structure of covariance matrices (T, N, N).

import numpy as np
import pandas as pd

returns = pd.DataFrame(0.1 * np.random.randn(100, 4), columns=['A', 'B', 'C', 'D'])
covariance = returns.rolling(20).cov()

weights = pd.DataFrame(np.random.randn(100, 4), columns=['A', 'B', 'C', 'D'])

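To make the layout concrete: `rolling(...).cov()` on a DataFrame does not return a 3D array but a 2D frame with a two-level MultiIndex, one N x N block per row label of `returns`. A quick inspection, reusing the setup above (the name `cov_at_50` is only for illustration):

```python
import numpy as np
import pandas as pd

returns = pd.DataFrame(0.1 * np.random.randn(100, 4), columns=['A', 'B', 'C', 'D'])
covariance = returns.rolling(20).cov()

print(covariance.shape)           # T * N rows, N columns
print(covariance.index.nlevels)   # outer level: row label of returns, inner level: asset

# Selecting one outer label yields a single N x N covariance matrix
cov_at_50 = covariance.loc[50]
```

The blocks for the first 19 rows contain only NaN, because the 20-observation window is not yet full there.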

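My solution so far is to convert the pandas DataFrames to numpy, run the calculation in a loop, and then convert back to pandas. Note that I need to check label alignment explicitly, because in practice the covariances and the weights may be computed by different processes.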

# Iterate over unique dates only; each date appears N times in the outer index level
cov_dict = {key: covariance.xs(key, axis=0, level=0) for key in covariance.index.get_level_values(0).unique()}

def naive_numpy(weights, cov_dict):

    expected_risk = {}

    # Extract columns, index before passing to numpy arrays
    # Columns
    cov_assets = cov_dict[next(iter(cov_dict))].columns
    avail_assets = [el for el in cov_assets if el in weights]

    # Indexes
    cov_dates = list(cov_dict.keys())
    avail_dates = weights.index.intersection(cov_dates)

    sel_weights = weights.loc[avail_dates, avail_assets]

    # Main loop and calculation
    for t, value in zip(sel_weights.index, sel_weights.values):
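        # Restrict and order the covariance matrix to match the selected weights
        cov_t = cov_dict[t].loc[avail_assets, avail_assets].values
        expected_risk[t] = np.sqrt(np.dot(value, np.dot(cov_t, value)))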

    # Back to a pandas Series
    expected_risk = pd.Series(expected_risk).reindex(weights.index).sort_index()

    return expected_risk


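Is there a pure pandas way to achieve the same result? Or is there any improvement that would make the code more efficient? (Despite using numpy, it is still slow.)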

Answer 1

eco*_*zar 3

I think numpy is definitely the best option, although you lose efficiency if you loop over the values/dates.

My suggestion for computing the portfolio's rolling volatility (without loops):

returns = pd.DataFrame(0.1 * np.random.randn(100, 4), columns=['A', 'B', 'C', 'D'])
covariance = returns.rolling(20).cov()
weights = pd.DataFrame(np.random.randn(100, 4), columns=['A', 'B', 'C', 'D'])

rows, columns = weights.shape

# Go to numpy:
w = weights.values
cov = covariance.values.reshape(rows, columns, columns)

A = np.matmul(w.reshape(rows, 1, columns), cov)
var = np.matmul(A, w.reshape(rows, columns, 1)).reshape(rows)
std_dev = np.sqrt(var)

# Back to pandas (in case you want that):
pd.Series(std_dev, index = weights.index)

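The reshape above relies on the weights and the covariance blocks already sharing the same dates and the same asset order. If the two inputs may come from different processes, as the question mentions, one possible variant aligns the labels first and then contracts with `np.einsum`. This is a sketch only: the function name `portfolio_vol` is hypothetical, and the shuffled column order in `weights` is there just to exercise the alignment.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
assets = ['A', 'B', 'C', 'D']
returns = pd.DataFrame(0.1 * rng.standard_normal((100, 4)), columns=assets)
covariance = returns.rolling(20).cov()
# Weights deliberately use a different column order than the covariance.
weights = pd.DataFrame(rng.standard_normal((100, 4)), columns=['D', 'C', 'B', 'A'])

def portfolio_vol(weights, covariance):
    """Rolling portfolio volatility with explicit label alignment (illustrative)."""
    # Dates and assets present in both inputs
    cov_dates = covariance.index.get_level_values(0).unique()
    dates = weights.index.intersection(cov_dates)
    cols = [c for c in covariance.columns if c in weights.columns]

    # Bring the covariance into a fixed (date, asset) x asset layout
    full_index = pd.MultiIndex.from_product([dates, cols])
    cov = covariance.reindex(index=full_index, columns=cols).to_numpy()
    cov = cov.reshape(len(dates), len(cols), len(cols))

    # Same dates and asset order for the weights
    w = weights.loc[dates, cols].to_numpy()

    # var[t] = sum_ij w[t, i] * cov[t, i, j] * w[t, j]
    var = np.einsum('ti,tij,tj->t', w, cov, w)
    return pd.Series(np.sqrt(var), index=dates).reindex(weights.index)

vol = portfolio_vol(weights, covariance)
```

Dates where the rolling window is not yet full come out as NaN, matching the behaviour of the loop-based version.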
