Recursion: account value with distributions

Bra*_*mon 15 python recursion finance python-3.x pandas

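Update: I'm not sure this is feasible without some form of loop, but np.where won't work here. If the answer is "you can't," so be it. If it can be done, it might use something from scipy.signal.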


I would like to vectorize the loop in the code below, but I'm not sure how to, because of the recursive nature of the output.

A walk-through of my current setup:

Take a starting amount ($1 million) and a quarterly dollar distribution ($5,000):

dist = 5000.
v0 = float(1e6)

Generate some random security/account returns (in decimal form) at monthly frequency:

r = pd.Series(np.random.rand(12) * .01,
              index=pd.date_range('2017', freq='M', periods=12))

Create an empty Series that will hold the monthly account values:

value = pd.Series(np.empty_like(r), index=r.index)

Add a "start month" to value. This label will contain v0.

from pandas.tseries import offsets
# Series.append was removed in pandas 2.0; pd.concat does the same job.
start = pd.Series(v0, index=[value.index[0] - offsets.MonthEnd(1)])
value = pd.concat([start, value]).sort_index()

The loop I'd like to get rid of is here:

for date in value.index[1:]:
    if date.is_quarter_end:
        value.loc[date] = value.loc[date - offsets.MonthEnd(1)] \
                        * (1 + r.loc[date]) - dist
    else:
        value.loc[date] = value.loc[date - offsets.MonthEnd(1)] \
                        * (1 + r.loc[date]) 

Consolidated code:

import pandas as pd
from pandas.tseries import offsets
from pandas import Series
import numpy as np

dist = 5000.
v0 = float(1e6)
r = pd.Series(np.random.rand(12) * .01, index=pd.date_range('2017', freq='M', periods=12))
value = pd.Series(np.empty_like(r), index=r.index)
start = pd.Series(v0, index=[value.index[0] - offsets.MonthEnd(1)])
value = pd.concat([start, value]).sort_index()  # Series.append was removed in pandas 2.0
for date in value.index[1:]:
    if date.is_quarter_end:
        value.loc[date] = value.loc[date - offsets.MonthEnd(1)] * (1 + r.loc[date]) - dist
    else:
        value.loc[date] = value.loc[date - offsets.MonthEnd(1)] * (1 + r.loc[date]) 

In pseudocode, all the loop is doing is:

for each date in index of value:
    if the date is not a quarter end:
        multiply previous value by (1 + r) for that month
    if the date is a quarter end:
        multiply previous value by (1 + r) for that month and subtract dist

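The problem is that I currently don't see how vectorization is possible, since each successive value depends on whether a distribution was made in the prior month. I get the desired result, but the loop is quite inefficient for higher-frequency data or longer time periods.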

Jun*_*sor 9

You can use the following code:

cum_r = (1 + r).cumprod()
result = cum_r * v0
for date in r.index[r.index.is_quarter_end]:
    result[date:] -= cum_r[date:] * (dist / cum_r.loc[date])

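This performs:

  • 1 cumulative product of all the monthly returns
  • 1 vector multiplication by the scalar v0
  • n vector multiplications by the scalar dist / cum_r.loc[date]
  • n vector subtractions

where n is the number of quarter ends.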
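One way to see why looping over quarter ends only reproduces the month-by-month recursion is to unroll the recurrence. The symbols below are notation introduced here for illustration, not part of the original answer: C_t stands for cum_r at month t (with C_0 = 1), D for dist, and q_s is an indicator for quarter-end months.

```latex
\begin{aligned}
C_t &= \prod_{s=1}^{t} (1 + r_s), \qquad
q_s = \begin{cases} 1 & \text{if month } s \text{ is a quarter end} \\ 0 & \text{otherwise} \end{cases} \\
v_t &= v_{t-1}\,(1 + r_t) - D\,q_t \\
\frac{v_t}{C_t} &= \frac{v_{t-1}}{C_{t-1}} - \frac{D\,q_t}{C_t} \\
\frac{v_t}{C_t} &= v_0 - D \sum_{s=1}^{t} \frac{q_s}{C_s} \\
v_t &= C_t\,v_0 - \sum_{s=1}^{t} q_s\,D\,\frac{C_t}{C_s}
\end{aligned}
```

Each quarter end s contributes one term D * C_t / C_s to every month t >= s, which is the quantity that result[date:] -= cum_r[date:] * (dist / cum_r.loc[date]) subtracts.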

Based on this code, we can optimize further:

cum_r = (1 + r).cumprod()
t = (r.index.is_quarter_end / cum_r).cumsum()
result = cum_r * (v0 - dist * t)

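This performs:

  • 1 cumulative product, (1 + r).cumprod()
  • 1 division between two series, r.index.is_quarter_end / cum_r
  • 1 cumulative sum of the result of that division
  • 1 multiplication of that sum by the scalar dist
  • 1 subtraction of dist * t from the scalar v0
  • 1 elementwise multiplication of cum_r by v0 - dist * t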
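As a sanity check, the loop-free version can be compared against a plain month-by-month loop. The following is a minimal sketch: the helper names loop_values and vectorized_values are hypothetical, the random seed is arbitrary, and the index is built with pd.offsets.MonthEnd() rather than a frequency alias only to avoid alias differences between pandas versions. Like the answer's code, the result starts at the first return month and does not include the v0 start row.

```python
import numpy as np
import pandas as pd


def loop_values(r, v0, dist):
    """Reference implementation: the month-by-month recursion from the question."""
    out = []
    prev = v0
    for date, ret in r.items():
        prev = prev * (1 + ret)      # grow by this month's return
        if date.is_quarter_end:
            prev -= dist             # pay the quarterly distribution
        out.append(prev)
    return pd.Series(out, index=r.index)


def vectorized_values(r, v0, dist):
    """Loop-free version, following the answer's three lines."""
    cum_r = (1 + r).cumprod()
    t = (r.index.is_quarter_end / cum_r).cumsum()
    return cum_r * (v0 - dist * t)


rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
index = pd.date_range('2017-01-31', periods=12, freq=pd.offsets.MonthEnd())
r = pd.Series(rng.random(12) * 0.01, index=index)

expected = loop_values(r, 1e6, 5000.0)
actual = vectorized_values(r, 1e6, 5000.0)
print(np.allclose(expected, actual))
```

The comparison uses np.allclose rather than exact equality because the two versions order their floating-point operations differently.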


mor*_*rty 6

OK... I'll take a stab at this.

import numpy as np 
import pandas as pd

# Define a generator that accumulates deposits/withdrawals and returns
def gen(lst):
    acu = 0
    for r, v in lst:
        # Grow the balance by this period's return, then apply the movement
        acu = acu * (1 + r) + v
        yield acu


dist = 5000.
v0 = float(1e6)
random_returns = np.random.rand(12) * 0.1

#Create the index. 
index=pd.date_range('2016-12-31', freq='M', periods=13)
#Generate a return so that the value at i equals the return from i-1 to i
r = pd.Series(np.insert(random_returns, 0,0), index=index, name='Return')
#Generate series with deposits and withdrawals
w = [-dist if is_q_end else 0 for is_q_end in index[1:].is_quarter_end]
d = pd.Series(np.insert(w, 0, v0), index=index, name='Movements')

df = pd.concat([r, d], axis=1)
df['Value'] = list(gen(zip(df['Return'], df['Movements'])))

Now, your code:

# Generate some random security/account returns (decimal form) at monthly freq:
r = pd.Series(random_returns,
              index=pd.date_range('2017', freq='M', periods=12))
# Create an empty Series that will hold the monthly account values:
value = pd.Series(np.empty_like(r), index=r.index)
# Add a "start month" to value. This label will contain v0.
from pandas.tseries import offsets
# Series.append was removed in pandas 2.0, so use pd.concat instead.
start = pd.Series(v0, index=[value.index[0] - offsets.MonthEnd(1)])
value = pd.concat([start, value]).sort_index()

# The loop I'd like to get rid of is here:
def loopy(value):
    for date in value.index[1:]:
        if date.is_quarter_end:
            value.loc[date] = value.loc[date - offsets.MonthEnd(1)] \
                            * (1 + r.loc[date]) - dist
        else:
            value.loc[date] = value.loc[date - offsets.MonthEnd(1)] \
                            * (1 + r.loc[date])
    return value

Then compare and time the two:

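# r was redefined above with 12 entries, so compare against the 13-row columns in df
(loopy(value) == list(gen(zip(df['Return'], df['Movements'])))).all()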
Out[11]: True

Both return the same result.

%timeit list(gen(zip(r, d)))
%timeit loopy(value)
10000 loops, best of 3: 72.4 µs per loop
100 loops, best of 3: 5.37 ms per loop

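And it looks to be considerably faster: about 74 times faster in this run (72.4 µs versus 5.37 ms per loop). Hope that helps.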
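The same running-balance idea can also be written with itertools.accumulate from the standard library, which avoids a hand-written generator. This is an alternative sketch, not the code timed above: the helper name running_balance and the sample numbers are made up for illustration, and the initial keyword requires Python 3.8 or later.

```python
from itertools import accumulate


def running_balance(returns, movements):
    """Apply v_t = v_{t-1} * (1 + r_t) + m_t, starting from a zero balance."""
    def step(balance, pair):
        ret, move = pair
        return balance * (1 + ret) + move

    # accumulate emits the initial value first, so drop it with [1:]
    return list(accumulate(zip(returns, movements), step, initial=0.0))[1:]


# Illustrative inputs: an opening deposit, then one withdrawal in the last period
returns = [0.0, 0.01, 0.02, 0.03]
movements = [1000.0, 0.0, 0.0, -50.0]
values = running_balance(returns, movements)
print(values)
```

As in the generator version, the opening balance is modelled as a deposit in the first period with a zero return, so no separate starting value is needed.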