Wei*_* Wu 4 python optimization portfolio scipy pandas
在这里尝试优化投资组合权重分配,以通过限制风险最大化我的收益函数。我没有问题,可以通过简单的约束(找到所有权重的总和等于1)来找到产生给我的收益函数的最优权重,并使我的总风险低于目标风险的另一个约束是没有问题的。
我的问题是,如何为每个组添加行业权重界限?
我的代码如下:
# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
import scipy.optimize as sco
dates = pd.date_range('1/1/2000', periods=8)
industry = ['industry', 'industry', 'utility', 'utility', 'consumer']
symbols = ['A', 'B', 'C', 'D', 'E']
zipped = list(zip(industry, symbols))
index = pd.MultiIndex.from_tuples(zipped)
noa = len(symbols)
data = np.array([[10, 9, 10, 11, 12, 13, 14, 13],
[11, 11, 10, 11, 11, 12, 11, 10],
[10, 11, 10, 11, 12, 13, 14, 13],
[11, 11, 10, 11, 11, 12, 11, 11],
[10, 11, 10, 11, 12, 13, 14, 13]])
market_to_market_price = pd.DataFrame(data.T, index=dates, columns=index)
rets = market_to_market_price / market_to_market_price.shift(1) - 1.0
rets = rets.dropna(axis=0, how='all')
expo_factor = np.ones((5,5))
factor_covariance = market_to_market_price.cov()
delta = np.diagflat([0.088024, 0.082614, 0.084237, 0.074648,
0.084237])
cov_matrix = np.dot(np.dot(expo_factor, factor_covariance),
expo_factor.T) + delta
def calculate_total_risk(weights, cov_matrix):
port_var = np.dot(np.dot(weights.T, cov_matrix), weights)
return port_var
def max_func_return(weights):
return -np.sum(rets.mean() * weights)
# optimized return with given risk
tolerance_risk = 27
noa = market_to_market_price.shape[1]
cons = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1},
{'type': 'eq', 'fun': lambda x: calculate_total_risk(x, cov_matrix) - tolerance_risk})
bnds = tuple((0, 1) for x in range(noa))
init_guess = noa * [1. / noa,]
opts_mean = sco.minimize(max_func_return, init_guess, method='SLSQP',
bounds=bnds, constraints=cons)
In [88]: rets
Out[88]:
industry utility consumer
A B C D E
2000-01-02 -0.100000 0.000000 0.100000 0.000000 0.100000
2000-01-03 0.111111 -0.090909 -0.090909 -0.090909 -0.090909
2000-01-04 0.100000 0.100000 0.100000 0.100000 0.100000
2000-01-05 0.090909 0.000000 0.090909 0.000000 0.090909
2000-01-06 0.083333 0.090909 0.083333 0.090909 0.083333
2000-01-07 0.076923 -0.083333 0.076923 -0.083333 0.076923
2000-01-08 -0.071429 -0.090909 -0.071429 0.000000 -0.071429
In[89]: opts_mean['x'].round(3)
Out[89]: array([ 0.233, 0.117, 0.243, 0.165, 0.243])
Run Code Online (Sandbox Code Playgroud)
如何添加这样的组边界,使得5个资产的总和落入边界以下?
model = pd.DataFrame(np.array([.08,.12,.05]), index= set(industry), columns = ['strategic'])
model['tactical'] = [(.05,.41), (.2,.66), (0,.16)]
In [85]: model
Out[85]:
strategic tactical
industry 0.08 (0.05, 0.41)
consumer 0.12 (0.2, 0.66)
utility 0.05 (0, 0.16)
Run Code Online (Sandbox Code Playgroud)
我已经阅读了类似的SciPy优化后的分组边界优化方法,但是仍然无法获得任何线索,任何身体都可以帮助吗?谢谢。
首先,考虑使用cvxopt,这是专为凸优化设计的模块。我不太熟悉,但是这里有一个有效的前沿例子。
现在解决您的问题,以下是一种变通方法,该变通方法特别适用于您发布和使用的问题minimize。(可以普遍使用它来在输入类型和用户友好性上创建更多的灵活性,并且基于类的实现在这里也将很有用。)
关于您的问题“如何添加组边界?”,简短的答案是您实际上需要通过constraints而不是bounds参数来完成此操作,因为
可选地,还可以使用bounds参数指定x中每个元素的上下限。[重点添加]
该规范与您要执行的操作不匹配。下面的示例所做的是,分别为每个组的上限和下限添加一个不等式约束。该函数mapto_constraints返回将添加到当前约束的字典列表。
首先,这里是一些示例数据:
import pandas as pd
import numpy as np
import numpy.random as npr
npr.seed(123)
from scipy.optimize import minimize
# Create a DataFrame of hypothetical returns for 5 stocks across 3 industries,
# at daily frequency over a year. Note that these will be in decimal
# rather than numeral form. (i.e. 0.01 denotes a 1% return)
dates = pd.bdate_range(start='1/1/2000', end='12/31/2000')
industry = ['industry'] * 2 + ['utility'] * 2 + ['consumer']
symbols = list('ABCDE')
zipped = list(zip(industry, symbols))
cols = pd.MultiIndex.from_tuples(zipped)
returns = pd.DataFrame(npr.randn(len(dates), len(cols)), index=dates, columns=cols)
returns /= 100 + 3e-3 #drift term
returns.head()
Out[191]:
industry utility consumer
A B C D E
2000-01-03 -0.01484 0.00986 -0.00476 0.00235 -0.00630
2000-01-04 0.00518 0.00958 -0.01210 -0.00814 -0.01664
2000-01-05 0.00233 -0.01665 -0.00366 0.00520 0.02058
2000-01-06 0.00368 0.01253 0.00259 0.00309 -0.00211
2000-01-07 -0.00383 0.01174 0.00375 0.00336 -0.00608
Run Code Online (Sandbox Code Playgroud)
您会看到年度化数字“有意义”:
(1 + returns.mean()) ** 252 - 1
Out[199]:
industry A -0.05531
B 0.32455
utility C 0.10979
D 0.14339
consumer E -0.12644
Run Code Online (Sandbox Code Playgroud)
现在介绍一些将在优化中使用的功能。这些是根据Yves Hilpisch的《Python for Finance》第11章中的示例紧密建模的。
def logrels(rets):
"""Log of return relatives, ln(1+r), for a given DataFrame rets."""
return np.log(rets + 1)
def statistics(weights, rets):
"""Compute expected portfolio statistics from individual asset returns.
Parameters
==========
rets : DataFrame
Individual asset returns. Use numeral rather than decimal form
weights : array-like
Individual asset weights, nx1 vector.
Returns
=======
list of (pret, pvol, pstd); these are *per-period* figures (not annualized)
pret : expected portfolio return
pvol : expected portfolio variance
pstd : expected portfolio standard deviation
Note
====
Note that Modern Portfolio Theory (MPT), being a single-period model,
works with (optimizes using) continuously compounded returns and
volatility, using log return relatives. The difference between these and
more commonly used geometric means will be negligible for small returns.
"""
if isinstance(weights, (tuple, list)):
weights = np.array(weights)
pret = np.sum(logrels(rets).mean() * weights)
pvol = np.dot(weights.T, np.dot(logrels(rets).cov(), weights))
pstd = np.sqrt(pvol)
return [pret, pvol, pstd]
# The below are a few convenience functions around statistics() above, needed
# because scipy minimize must optimize a function that returns a scalar
def port_ret(weights, rets):
return -1 * statistics(weights=weights, rets=rets)[0]
def port_variance(weights, rets):
return statistics(weights=weights, rets=rets)[1]
Run Code Online (Sandbox Code Playgroud)
这是等权重投资组合的预期年度标准差。我只是在这里提供它用作优化(risk_tol参数)的锚点。
statistics([0.2] * 5, returns)[2] * np.sqrt(252) # ew anlzd stdev
Out[192]: 0.06642120658640735
Run Code Online (Sandbox Code Playgroud)
下一个函数采用一个看起来像您的modelDataFrame的DataFrame,并为每个组建立约束。请注意,这非常不灵活,因为您将需要遵循特定格式来model使用现在使用的返回值和DataFrame。
def mapto_constraints(rets, model):
tactical = model['tactical'].to_dict() # values are tuple bounds
industries = rets.columns.get_level_values(0)
group_cons = list()
for key in tactical:
if isinstance(industries.get_loc('consumer'), int):
pos = [industries.get_loc(key)]
else:
pos = np.where(industries.get_loc(key))[0].tolist()
lb = tactical[key][0]
ub = tactical[key][1] # upper and lower bounds
lbdict = {'type': 'ineq',
'fun': lambda x: np.sum(x[pos[0]:(pos[-1] + 1)]) - lb}
ubdict = {'type': 'ineq',
'fun': lambda x: ub - np.sum(x[pos[0]:(pos[-1] + 1)])}
group_cons.append(lbdict); group_cons.append(ubdict)
return group_cons
Run Code Online (Sandbox Code Playgroud)
关于上面如何构建约束的说明:
等式约束表示约束函数结果为零,而不等式表示约束函数结果为非负数。
最后,优化本身:
def opt(rets, risk_tol, model, round=3):
noa = len(rets.columns)
guess = noa * [1. / noa,] # equal-weight; needed for initial guess
bnds = tuple((0, 1) for x in range(noa))
cons = [{'type': 'eq', 'fun': lambda x: np.sum(x) - 1.},
{'type': 'ineq', 'fun': lambda x: risk_tol - port_variance(x, rets=rets)}
] + mapto_constraints(rets=rets, model=model)
opt = minimize(port_ret, guess, args=(returns,), method='SLSQP', bounds=bnds,
constraints=cons, tol=1e-10)
return opt.x.round(round)
model = pd.DataFrame(np.array([.08,.12,.05]),
index= set(industry), columns = ['strategic'])
model['tactical'] = [(.05,.41), (.2,.66), (0,.16)]
# Set variance threshold equal to the equal-weighted variance
# Note that I set variance as an inequality rather than equality (i.e.
# resulting variance should be less than threshold).
opt(returns, risk_tol=port_variance([0.2] * 5, returns), model=model)
Out[195]: array([ 0.188, 0.225, 0.229, 0.197, 0.16 ])
Run Code Online (Sandbox Code Playgroud)