ves*_*and 5 python optimization numpy scipy
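Background:
I want to solve a range of optimization problems, such as asset weights in a portfolio and parameters in trading strategies, where the variables are passed to functions that also take a bunch of other variables.
So far I have been able to do this easily in Excel using the Solver Add-In. But I think it would be more efficient, and more widely applicable, to do it in Python. For clarity, I'll boil the question down to the essence of portfolio optimization.
My question (short version):
Here is a dataframe and a corresponding plot of asset returns.
Dataframe 1: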
                A1      A2
2017-01-01  0.0075  0.0096
2017-01-02 -0.0075 -0.0033
.
.
2017-01-10  0.0027  0.0035
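Plot 1 - Asset returns
Based on this, I want to find the weights of the portfolio with the optimal risk / return (Sharpe ratio), represented by the green dot in the plot below (the red dot is the so-called minimum variance portfolio, and represents another optimization problem).
Plot 2 - Efficient frontier and optimal portfolios:
The details:
The following code section contains the function returns(), which builds a dataframe with random returns for two assets, and the function pf_sharpe, which calculates the Sharpe ratio of a portfolio for two given weights.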
# imports
import pandas as pd
import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt
np.random.seed(1234)

# Reproducible data sample
def returns(rows, names):
    ''' Function to create data sample with random returns

    Parameters
    ==========
    rows : number of rows in the dataframe
    names: list of names to represent assets

    Example
    =======
    >>> returns(rows = 2, names = ['A', 'B'])
                     A       B
    2017-01-01  0.0027  0.0075
    2017-01-02 -0.0050 -0.0024
    '''
    listVars= names
    rng = pd.date_range('1/1/2017', periods=rows, freq='D')
    df_temp = pd.DataFrame(np.random.randint(-100,100,size=(rows, len(listVars))), columns=listVars)
    df_temp = df_temp.set_index(rng)
    df_temp = df_temp / 10000
    return df_temp

# Sharpe ratio
def pf_sharpe(df, w1, w2):
    ''' Function to calculate risk / reward ratio
    based on a pandas dataframe with two return series

    Parameters
    ==========
    df : pandas dataframe
    w1 : portfolio weight for asset 1
    w2 : portfolio weight for asset 2
    '''
    weights = [w1,w2]
    # Calculate portfolio returns and volatility
    pf_returns = (np.sum(df.mean() * weights) * 252)
    pf_volatility = (np.sqrt(np.dot(np.asarray(weights).T, np.dot(df.cov() * 252, weights))))
    # Calculate sharpe ratio
    pf_sharpe = pf_returns / pf_volatility
    return pf_sharpe

# Make df with random returns and calculate
# sharpe ratio for a 80/20 split between assets
df_returns = returns(rows = 10, names = ['A1', 'A2'])
df_returns.plot(kind = 'bar')
sharpe = pf_sharpe(df = df_returns, w1 = 0.8, w2 = 0.2)
print(sharpe)
# Output:
# 5.09477512073
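Now I want to find the portfolio weights that optimize the Sharpe ratio. I think the optimization problem could be expressed as follows: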
maximize:
    pf_sharpe()
by changing:
    w1, w2
under the constraints:
    0 < w1 < 1
    0 < w2 < 1
    w1 + w2 = 1
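In symbols, the same problem can be sketched as below. This assumes the conventions used in pf_sharpe: \mu is the vector of mean daily returns, \Sigma is the daily covariance matrix, 252 is the annualization factor, and the risk-free rate is taken as zero. The bounds are written as closed intervals, which is an assumption; the statement above uses strict inequalities.

```latex
\begin{aligned}
\max_{w_1,\, w_2} \quad & S(w) = \frac{252\, \mu^{\top} w}{\sqrt{252\, w^{\top} \Sigma\, w}},
\qquad w = (w_1, w_2)^{\top} \\
\text{subject to} \quad & w_1 + w_2 = 1, \\
& 0 \le w_1 \le 1, \quad 0 \le w_2 \le 1 .
\end{aligned}
```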
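What I've tried so far:
I found a possible setup in the post "Python Scipy Optimization.minimize using SLSQP showing maximized results". Below is what I have so far, and it directly addresses a central aspect of my problem:
[...] where the variables are passed to functions that also take a bunch of other variables.
As you can see, my initial challenge prevents me from even testing whether my bounds and constraints would be accepted by the function optimize.minimize(). I haven't even bothered to take into account that this is a maximization problem rather than a minimization problem (hopefully that can be handled by changing the sign of the function).
Attempts: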
# bounds
b = (0,1)
bnds = (b,b)
# constraints
def constraint1(w1,w2):
    return w1 - w2
cons = ({'type': 'eq', 'fun':constraint1})
# initial guess
x0 = [0.5, 0.5]
# Testing the initial guess
print(pf_sharpe(df = df_returns, weights = x0))
# Optimization attempts
attempt1 = optimize.minimize(pf_sharpe(), x0, method = 'SLSQP', bounds = bnds, constraints = cons)
attempt2 = optimize.minimize(pf_sharpe(df = df_returns, weights), x0, method = 'SLSQP', bounds = bnds, constraints = cons)
attempt3 = optimize.minimize(pf_sharpe(weights, df = df_returns), x0, method = 'SLSQP', bounds = bnds, constraints = cons)
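Results:
Attempt 1 fails because neither df nor weights has been specified.
Attempt 2 fails with SyntaxError: positional argument follows keyword argument
Attempt 3 fails with NameError: name 'weights' is not defined
I was under the impression that df could be freely specified, and that x0 in optimize.minimize would be treated as the variables to be tested, standing in for the weights in the function specified by pf_sharpe().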
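All three attempts call pf_sharpe(...) and hand the result to the optimizer, whereas scipy.optimize.minimize expects the function object itself, calls it with the candidate array as the first argument, and forwards any fixed extra arguments given through args=. A minimal sketch of that calling convention, using a hypothetical toy objective rather than the portfolio code (toy_objective and target are illustration-only names):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy objective (not from the question): x is the array being
# optimized, `target` is a fixed extra argument, playing the role of df.
def toy_objective(x, target):
    return (x[0] - target) ** 2

# Pass the function object itself (no parentheses) and supply the fixed
# arguments through args=; minimize then calls toy_objective(x, 3.0).
res = minimize(toy_objective, x0=[0.5], args=(3.0,),
               method='SLSQP', bounds=[(0, 10)])
print(res.x)
```

Applied to the question, this would suggest a signature along the lines of pf_sharpe(weights, df) combined with args=(df_returns,), so that the weights arrive as the first positional argument.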
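As you surely understand, my transition from Excel to Python has not been the easiest in this regard, and there is a lot I don't understand here. Anyway, I hope some of you can offer some suggestions or clarifications!
Thanks!
Appendix 1 - Simulation approach:
This particular portfolio optimization problem can easily be solved by simulating a bunch of portfolio weights. That is exactly what I did to produce the portfolio plot above. Here is the whole function, if anyone is interested: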
# Portfolio simulation
def portfolioSim(df, simRuns):
    ''' Function to take a df with asset returns,
    runs a number of simulated portfolio weights,
    plots return and risk for those weights,
    and finds minimum risk portfolio
    and max risk / return portfolio

    Parameters
    ==========
    df : pandas dataframe with returns
    simRuns : number of simulations
    '''
    prets = []
    pvols = []
    pwgts = []
    # Use the df argument rather than the global df_returns
    names = list(df)
    for p in range (simRuns):
        # Assign random weights
        weights = np.random.random(len(list(df)))
        weights /= np.sum(weights)
        weights = np.asarray(weights)
        # Calculate risk and returns with random weights
        prets.append(np.sum(df.mean() * weights) * 252)
        pvols.append(np.sqrt(np.dot(weights.T, np.dot(df.cov() * 252, weights))))
        pwgts.append(weights)
    prets = np.array(prets)
    pvols = np.array(pvols)
    pwgts = np.array(pwgts)
    pshrp = prets / pvols
    # Store calculations in a df
    df1 = pd.DataFrame({'return':prets})
    df2 = pd.DataFrame({'risk':pvols})
    df3 = pd.DataFrame(pwgts)
    df3.columns = names
    df4 = pd.DataFrame({'sharpe':pshrp})
    df_temp = pd.concat([df1, df2, df3, df4], axis = 1)
    # Plot results
    plt.figure(figsize=(8, 4))
    plt.scatter(pvols, prets, c=prets / pvols, cmap = 'viridis', marker='o')
    # Min risk
    min_vol_port = df_temp.iloc[df_temp['risk'].idxmin()]
    plt.plot([min_vol_port['risk']], [min_vol_port['return']], marker='o', markersize=12, color="red")
    # Max sharpe
    max_sharpe_port = df_temp.iloc[df_temp['sharpe'].idxmax()]
    plt.plot([max_sharpe_port['risk']], [max_sharpe_port['return']], marker='o', markersize=12, color="green")
    # Return the max-Sharpe row (assumed, based on the later usage
    # sr_opt = portfolioSim(...) followed by print(sr_opt))
    return max_sharpe_port

# Test run
portfolioSim(df = df_returns, simRuns = 250)
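Appendix 2 - Excel Solver approach:
Here is how I would approach the problem using Excel Solver. Rather than linking to a file, I've attached a screenshot and included the most important formulas in a code section. I'm guessing that not many of you will be interested in reproducing this, but I've included it just to show that it can be done quite easily in Excel. Grey ranges represent formulas. Ranges that can be changed and used as arguments in the optimization problem are highlighted in yellow. The green range is the objective function.
Here is an image of the worksheet and the Solver setup:
Excel formulas: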
C3 =AVERAGE(C7:C16)
C4 =AVERAGE(D7:D16)
H4 =COVARIANCE.P(C7:C16;D7:D16)
G5 =COVARIANCE.P(C7:C16;D7:D16)
G10 =G8+G9
G13 =MMULT(TRANSPOSE(G8:G9);C3:C4)
G14 =SQRT(MMULT(TRANSPOSE(G8:G9);MMULT(G4:H5;G8:G9)))
H13 =G12/G13
H14 =G13*252
G16 =G13/G14
H16 =H13/H14
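End notes:
As you can see from the screenshot, Excel Solver suggests a 47% / 53% split between A1 and A2 to obtain an optimal Sharpe ratio of 5.6. Running the Python function sr_opt = portfolioSim(df = df_returns, simRuns = 25000) yields a Sharpe ratio of 5.3, with corresponding weights of 46% and 53% for A1 and A2: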
print(sr_opt)
#Output
#return 0.361439
#risk 0.067851
#A1 0.465550
#A2 0.534450
#sharpe 5.326933
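The method applied in Excel is GRG Nonlinear. I understand that changing the SLSQP argument to a non-linear method would get me somewhere, and I have looked into the nonlinear solvers in scipy as well, but with little success. And maybe Scipy isn't even the best option here?
A more detailed answer. The first part of the code remains the same: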
import pandas as pd
import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt
np.random.seed(1234)

# Reproducible data sample
def returns(rows, names):
    ''' Function to create data sample with random returns

    Parameters
    ==========
    rows : number of rows in the dataframe
    names: list of names to represent assets

    Example
    =======
    >>> returns(rows = 2, names = ['A', 'B'])
                     A       B
    2017-01-01  0.0027  0.0075
    2017-01-02 -0.0050 -0.0024
    '''
    listVars= names
    rng = pd.date_range('1/1/2017', periods=rows, freq='D')
    df_temp = pd.DataFrame(np.random.randint(-100,100,size=(rows, len(listVars))), columns=listVars)
    df_temp = df_temp.set_index(rng)
    df_temp = df_temp / 10000
    return df_temp
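The function pf_sharpe is modified so that its first input is one of the weights, which is the parameter to be optimized. Instead of supplying the constraint w1 + w2 = 1, we can define w2 as 1-w1 inside pf_sharpe, which is perfectly equivalent but simpler and faster. Also, minimize will try to minimize pf_sharpe, whereas you actually want to maximize it, so the output of pf_sharpe is now multiplied by -1.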
# Sharpe ratio
def pf_sharpe(weight, df):
    ''' Function to calculate risk / reward ratio
    based on a pandas dataframe with two return series
    '''
    weights = [weight[0], 1-weight[0]]
    # Calculate portfolio returns and volatility
    pf_returns = (np.sum(df.mean() * weights) * 252)
    pf_volatility = (np.sqrt(np.dot(np.asarray(weights).T, np.dot(df.cov() * 252, weights))))
    # Calculate sharpe ratio
    pf_sharpe = pf_returns / pf_volatility
    return -pf_sharpe

# initial guess
x0 = [0.5]
df_returns = returns(rows = 10, names = ['A1', 'A2'])
# Optimization attempts
out = minimize(pf_sharpe, x0, method='SLSQP', bounds=[(0, 1)], args=(df_returns,))
optimal_weights = [out.x, 1-out.x]
print(optimal_weights)
print(-pf_sharpe(out.x, df_returns))
This returns an optimized Sharpe ratio of 6.16 (better than 5.3), with w1 practically 1 and w2 practically 0.
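The substitution w2 = 1-w1 is specific to the two-asset case. For more assets, one option is the setup the question originally aimed for, with bounds and an equality constraint passed to SLSQP. The following is only a minimal sketch under stated assumptions: the names neg_sharpe and max_sharpe_weights and the demo data are hypothetical, and the risk-free rate is taken as zero, as in pf_sharpe.

```python
import numpy as np
import pandas as pd
from scipy.optimize import minimize

def neg_sharpe(weights, df, periods=252):
    """Negative annualized Sharpe ratio (risk-free rate assumed zero)."""
    weights = np.asarray(weights)
    pf_return = np.dot(df.mean().values, weights) * periods
    pf_vol = np.sqrt(weights @ (df.cov().values * periods) @ weights)
    return -pf_return / pf_vol

def max_sharpe_weights(df):
    """Maximize the Sharpe ratio over long-only weights that sum to 1."""
    n = df.shape[1]
    x0 = np.full(n, 1.0 / n)                    # start from equal weights
    bounds = [(0.0, 1.0)] * n                   # 0 <= w_i <= 1
    cons = {'type': 'eq', 'fun': lambda w: np.sum(w) - 1.0}  # sum(w) = 1
    return minimize(neg_sharpe, x0, args=(df,), method='SLSQP',
                    bounds=bounds, constraints=cons)

# Hypothetical demo data: 250 days of random returns for three assets
rng = np.random.default_rng(0)
df_demo = pd.DataFrame(rng.normal(0.001, 0.01, size=(250, 3)),
                       columns=['A1', 'A2', 'A3'])
res = max_sharpe_weights(df_demo)
print(res.x)       # optimized weights
print(-res.fun)    # corresponding Sharpe ratio
```

Note that the equality constraint function returns sum(w) - 1, which is zero exactly when the weights add up to one; the constraint1 in the question returns w1 - w2, which would instead force the two weights to be equal.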