NumPy vs. MATLAB

Ler*_*ler 1 python matlab numpy

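I have started using NumPy instead of MATLAB for a lot of things, and for most of them it seems to be much faster. I have just tried to replicate some code in Python, though, and it is much slower. I was wondering whether someone who knows both could take a look and see why that is.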

NumPy:

longTicker = np.empty([1,len(ticker)],dtype='U15')
genericTicker = np.empty([len(ticker)],dtype='U15')
tickerType = np.empty([len(ticker)],dtype='U10')
tickerList = np.vstack((np.empty([2,len(ticker)],dtype='U30'),np.ones([len(ticker)],dtype='U30')))
tickerListnum = 0
modelList = np.empty([2,9999],dtype='U2')
modelListnum = 0
derivativeType = np.ones(len(ticker))

for l in range(0,len(ticker)):
    tickerType[l] = 'Future'

    if not modCode[l] in list(modelList[1,:]):
        modelList[0,modelListnum] = modelListnum + 1
        modelList[1,modelListnum] = modCode[l]
        modelListnum += 1

    if ticker.item(l).find('3 MONTH') >= 0:
        x = list(metalTicks[:,0]).index(ticker[l])
        longTicker[0,l]  = metalTicks[x,3]
        if not longTicker[0,l] in list(tickerList[1,:]):
            tickerList[0,tickerListnum] = tickerListnum + 1
            tickerList[1,tickerListnum] = longTicker[0,l] 
            tickerList[2,tickerListnum] = 4
            tickerListnum += 1

        derivativeType[l] = 4
        tickerType[l] = 'Future'

    if ticker.item(l).find('CURNCY') >= 0:
        if ticker.item(l).find('KRWUSD CURNCY'):
            prices[l] = 1/float(prices.item(l))

        longTicker[0,l]  = ticker[l,0]
        if not longTicker[0,l] in list(tickerList[1,:]):
            tickerList[0,tickerListnum] = tickerListnum + 1
            tickerList[1,tickerListnum] = longTicker[0,l] 
            tickerList[2,tickerListnum] = 2
            tickerListnum += 1

        derivativeType[l] = 2
        tickerType[l] = 'FX'    

    if ticker.item(l).find('_') >= 0:
        x = ticker[l] == sasTick
        longTicker[0,l]  = bbgTick[x]
        if not longTicker[0,l] in list(tickerList[1,:]):
            tickerList[0,tickerListnum] = tickerListnum + 1
            tickerList[1,tickerListnum] = longTicker[0,l] 
            tickerList[2,tickerListnum] = 3
            tickerListnum += 1

        derivativeType[l] = 3
        tickerType[l] = 'Option'

    # need convert ticker thing    

    if not longTicker[0,l] in list(tickerList[1,:]):
            tickerList[0,tickerListnum] = tickerListnum + 1
            tickerList[1,tickerListnum] = longTicker[0,l] 
            tickerList[2,tickerListnum] = 1
            tickerListnum += 1

MATLAB code:

longTicker = cell(size(ticker));
genericTicker = cell(size(ticker));
type = repmat({'Future'},size(ticker));
tickerList = repmat([cell(1);cell(1);{1}],1,9999);
%tickerList = cell(3,9999);
tickerListnum = 0;
modelList = cell(2,9999);
modelListnum = 0;
derivativeType = ones(size(ticker));

for j=1:length(ticker)

    if isempty(find(strcmp(modCode{j},modelList(2,:)), 1))
        modelListnum = modelListnum+1;
        modelList{1,modelListnum}= modelListnum;
        modelList(2,modelListnum)= modCode(j);
    end

    if ~isempty(strfind(ticker{j},'3 MONTH'))
        x =strcmp(ticker{j},metalTicks(:,1));
        longTicker{j} = metalTicks{x,4};
        % genericTicker{j} = metalTicks{x,4};
        if isempty(find(strcmp(longTicker(j),tickerList(2,:)), 1))
        tickerListnum = tickerListnum+1;
        tickerList{1,tickerListnum}= tickerListnum;
        tickerList(2,tickerListnum)=longTicker(j);
        tickerList{3,tickerListnum}=4;
        end
        derivativeType(j) = 4;
        type{j} = 'Future';
        continue;
    end
    if ~isempty(regexp(ticker{j},'[A-Z]{6}\sCURNCY', 'once'))
        if strcmpi('KRWUSD CURNCY',ticker{j})
            prices{j}=1/prices{j};
        end
        longTicker{j} = ticker{j};
        % genericTicker{j} = ticker{j};
        if isempty(find(strcmp(longTicker(j),tickerList(2,:)), 1))
        tickerListnum = tickerListnum+1;
        tickerList{1,tickerListnum}= tickerListnum;
        tickerList(2,tickerListnum)=longTicker(j);
        tickerList{3,tickerListnum}=2;
        end
        derivativeType(j) = 2;
        type{j} = 'FX';
        continue;
    end
    if ~isempty(regexp(ticker{j},'_', 'once'))
        z = strcmp(ticker{j},sasTick);
        try
            longTicker(j) = bbgTick(z);
        catch
            keyboard;  % I did this - Dave
        end
        % genericTicker(j) = bbgTick(z);
        if isempty(find(strcmp(longTicker(j),tickerList(2,:)), 1))
        tickerListnum = tickerListnum+1;
        tickerList{1,tickerListnum}= tickerListnum;
        tickerList(2,tickerListnum)=longTicker(j);
        tickerList{3,tickerListnum}=3;
        end
        derivativeType(j) = 3;
        type{j} = 'Option';
        continue;
    end
    try
        longTicker{j} = ConvertTicker(ticker{j},'short','long',tradeDate(j));
        % genericTicker{j} = ConvertTicker(ticker{j},'short','generic',tradeDate(j));
    catch
        longTicker{j} = ticker{j};
        % genericTicker{j} = ticker{j};
    end
    if isempty(find(strcmp(longTicker(j),tickerList(2,:)), 1))
        tickerListnum = tickerListnum+1;
        tickerList{1,tickerListnum}= tickerListnum;
        tickerList(2,tickerListnum)=longTicker(j);
        tickerList{3,tickerListnum}=1;
    end
end

In this case MATLAB seems to be about 100 times faster. Are loops in Python really that much slower?

The*_*Cat 6

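I can't say for certain what the main source of the slowdown is, but I do notice several things that will slow the code down, are easy to fix, and will result in cleaner code: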

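  1. You do a lot of conversion from numpy arrays to lists. Type conversions are expensive, so avoid them as much as possible. In your case you get very little benefit from numpy. In almost all cases you would be better off using plain lists instead of 1D arrays, or lists of lists instead of 2D arrays. This is closer to cell arrays in MATLAB, except that lists can be resized dynamically with good performance. The only possible exceptions are sasTick, bbgTick and prices, and the latter two will work fine either way. For the others, create empty lists and use append if you are only adding values incrementally, and pre-allocate with None or empty strings '' where you need to access arbitrary elements. For tickerList it is probably easier to have two lists.
  2. You assign a lot of integers to unicode arrays. This also involves a type conversion (integer to unicode). If you use lists, this will not be a problem either.
  3. You use foo.item(l) a lot. This converts a numpy element to an ordinary Python data type. Again, this is a type conversion, so don't do it if you can avoid it. If you follow suggestion 1 and use lists, you will not need it anywhere in your current code.
  4. The MATLAB version has continue statements but the Python version does not, which means the Python version performs computations that the MATLAB version skips. I think you would be better off with if ... elif, but continue also works in Python.
  5. You loop over range(0,len(ticker)) and then extract the element of ticker multiple times. You would be better off looping over ticker directly, for example with for i, iticker in enumerate(ticker):. Using enumerate lets you keep track of the index as well.
  6. You use find to determine whether a substring is in a given string. It is faster, cleaner, and simpler to use in. Only use find when you care about exactly where the substring is, which you do not here.
  7. For both modelListnum and tickerListnum, you add one and assign the result to an array element, and then add one again and assign it back to the counter, so the same addition is done twice. The MATLAB version increments first and then assigns the already incremented value, which avoids the repeated addition.
  8. It is faster to pre-allocate tickerType to 'Future' as you do in MATLAB. You can do this with something like tickerType = ['Future']*len(ticker).
  9. Since tickerListnum and modelListnum are always equal to the index, there is no reason to have them at all. Get rid of them.
  10. Since each value appears only once in the row of tickerList that holds the ticker names, it would be faster and easier to use an OrderedDict (or a dict if you don't care about order), where the keys are the longTicker values and the values are the type numbers.
  11. If you don't care about the order of modelList, using a set will be faster.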
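As a rough illustration of points 1 and 6, here is a minimal sketch. The array names and its contents are made up for the example and only stand in for a row such as tickerList[1,:]; the timings it prints will vary by machine, so none are quoted here.

```python
import timeit

import numpy as np

# Hypothetical data, standing in for a row such as tickerList[1, :].
names = np.array([f"TICK{i}" for i in range(5000)], dtype="U30")
names_set = set(names.tolist())


def via_list_conversion(name):
    # Pattern from the question: build a fresh list on every lookup, then scan it.
    return name in list(names)


def via_set(name):
    # Hash lookup: no conversion and no linear scan.
    return name in names_set


t_list = timeit.timeit(lambda: via_list_conversion("TICK4999"), number=50)
t_set = timeit.timeit(lambda: via_set("TICK4999"), number=50)
print(f"list(array) lookup: {t_list:.4f} s, set lookup: {t_set:.6f} s")

# Point 6: str.find returns -1 when the substring is absent, and -1 is truthy,
# so a bare `if s.find(...)` does not test for containment.
s = "EURUSD CURNCY"
found_with_find = bool(s.find("KRWUSD CURNCY"))  # True, although absent
found_with_in = "KRWUSD CURNCY" in s             # False
```

The second half is relevant to the question's line if ticker.item(l).find('KRWUSD CURNCY'):, which is true for every ticker that does not start with that string, so the price is inverted for the wrong rows.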

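So here is a version that should be faster, assuming metalTicks and tickerList are lists of lists, sasTick is a numpy array, prices and bbgTick are either lists or arrays, and assuming you care about the order of modelList and tickerList: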

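from collections import OrderedDict

import numpy as np

longTicker = [None]*len(ticker)
tickerType = ['Future']*len(ticker)
tickerList = OrderedDict()
modelList = []
# Integer ones; np.ones_like(ticker) would take its dtype from ticker (strings)
derivativeType = np.ones(len(ticker), dtype=int)

for i, (iticker, imodCode) in enumerate(zip(ticker, modCode)):
    if imodCode not in modelList:
        modelList.append(imodCode)

    if '3 MONTH' in iticker:
        # Assumes metalTicks is stored column-wise: metalTicks[0] holds the
        # short tickers and metalTicks[3] the matching long tickers
        x = metalTicks[0].index(iticker)
        longTicker[i] = metalTicks[3][x]
        derivativeType[i] = 4

    elif 'CURNCY' in iticker:
        if 'KRWUSD CURNCY' in iticker:
            prices[i] = 1/prices[i]

        longTicker[i] = iticker
        derivativeType[i] = 2
        tickerType[i] = 'FX'

    elif '_' in iticker:
        # Take the first match as a scalar so it can be used as a dict key
        longTicker[i] = bbgTick[np.flatnonzero(sasTick == iticker)[0]]
        derivativeType[i] = 3
        tickerType[i] = 'Option'

    # Keys are unique, so repeated tickers do not create new entries
    tickerList[longTicker[i]] = derivativeType[i]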
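To show why the counters from the question are no longer needed, here is a small sketch with made-up (longTicker, derivativeType) pairs. The names pairs, ticker_types and rows are hypothetical and exist only for this example; the 1-based running number that the original code stored in the first row of tickerList is rebuilt with enumerate when it is actually wanted.

```python
from collections import OrderedDict

# Hypothetical (longTicker, derivativeType) pairs in processing order.
pairs = [
    ("LMCADS03", 4),
    ("EURUSD CURNCY", 2),
    ("LMCADS03", 4),   # repeated ticker
    ("SPX_OPT", 3),
]

ticker_types = OrderedDict()
for name, type_code in pairs:
    # Re-assigning an existing key keeps its original position,
    # so no explicit "is it already in the list?" check is required.
    ticker_types[name] = type_code

# Rebuild the three "rows" of the original tickerList on demand.
rows = [
    (n, name, type_code)
    for n, (name, type_code) in enumerate(ticker_types.items(), start=1)
]
```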

If you don't care about the order of modelList and tickerList, you can do this:

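import numpy as np

longTicker = [None]*len(ticker)
tickerType = ['Future']*len(ticker)
tickerList = {}
modelList = set()
# Integer ones; np.ones_like(ticker) would take its dtype from ticker (strings)
derivativeType = np.ones(len(ticker), dtype=int)

for i, (iticker, imodCode) in enumerate(zip(ticker, modCode)):
    # A set ignores duplicates, so no membership check is needed
    modelList.add(imodCode)

    if '3 MONTH' in iticker:
        # Assumes metalTicks is stored column-wise (see the assumptions above)
        x = metalTicks[0].index(iticker)
        longTicker[i] = metalTicks[3][x]
        derivativeType[i] = 4

    elif 'CURNCY' in iticker:
        if 'KRWUSD CURNCY' in iticker:
            prices[i] = 1/prices[i]

        longTicker[i] = iticker
        derivativeType[i] = 2
        tickerType[i] = 'FX'

    elif '_' in iticker:
        # Take the first match as a scalar so it can be used as a dict key
        longTicker[i] = bbgTick[np.flatnonzero(sasTick == iticker)[0]]
        derivativeType[i] = 3
        tickerType[i] = 'Option'

    tickerList[longTicker[i]] = derivativeType[i]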
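A brief sketch of what "not caring about the order" means in practice, using made-up model codes. A set makes no promise about iteration order, whereas dict keys keep insertion order in Python 3.7 and later, so dict.fromkeys is one way to get unique values in first-seen order without an OrderedDict. The values in modCode below are hypothetical.

```python
# Hypothetical model codes with repeats.
modCode = ["B2", "A1", "B2", "C3", "A1"]

# Unique values, iteration order not guaranteed.
unique_unordered = set(modCode)

# Unique values in first-seen order (dicts preserve insertion order in Python 3.7+).
unique_ordered = list(dict.fromkeys(modCode))
```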

Or, even simpler:

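import numpy as np

longTicker = [None]*len(ticker)
tickerType = ['Future']*len(ticker)
# Integer ones; np.ones_like(ticker) would take its dtype from ticker (strings)
derivativeType = np.ones(len(ticker), dtype=int)

for i, iticker in enumerate(ticker):
    if '3 MONTH' in iticker:
        # Assumes metalTicks is stored column-wise (see the assumptions above)
        x = metalTicks[0].index(iticker)
        longTicker[i] = metalTicks[3][x]
        derivativeType[i] = 4

    elif 'CURNCY' in iticker:
        if 'KRWUSD CURNCY' in iticker:
            prices[i] = 1/prices[i]

        longTicker[i] = iticker
        derivativeType[i] = 2
        tickerType[i] = 'FX'

    elif '_' in iticker:
        # Take the first match as a scalar so it can be used as a dict key
        longTicker[i] = bbgTick[np.flatnonzero(sasTick == iticker)[0]]
        derivativeType[i] = 3
        tickerType[i] = 'Option'

# Build both collections in one step after the loop
modelList = set(modCode)
tickerList = dict(zip(longTicker, derivativeType))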
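For anyone who wants to try the simplified loop on something concrete, here is a sketch that wraps it in a function and runs it on invented data. The function name classify, all ticker strings and prices, and the column-wise layout of metalTicks are assumptions made for this example. The else branch mirrors the fallback in the MATLAB catch block (keep the ticker unchanged) and stands in for the ConvertTicker call, which is not available here.

```python
import numpy as np


def classify(ticker, prices, metalTicks, sasTick, bbgTick):
    """Sketch of the simplified loop; prices is modified in place."""
    longTicker = [None] * len(ticker)
    tickerType = ['Future'] * len(ticker)
    derivativeType = [1] * len(ticker)

    for i, iticker in enumerate(ticker):
        if '3 MONTH' in iticker:
            x = metalTicks[0].index(iticker)   # column 0: short tickers
            longTicker[i] = metalTicks[3][x]   # column 3: long tickers
            derivativeType[i] = 4
        elif 'CURNCY' in iticker:
            if 'KRWUSD CURNCY' in iticker:
                prices[i] = 1 / prices[i]
            longTicker[i] = iticker
            derivativeType[i] = 2
            tickerType[i] = 'FX'
        elif '_' in iticker:
            match = np.flatnonzero(sasTick == iticker)[0]
            longTicker[i] = str(bbgTick[match])
            derivativeType[i] = 3
            tickerType[i] = 'Option'
        else:
            # Assumed fallback: keep the ticker as is (no ConvertTicker here).
            longTicker[i] = iticker

    return longTicker, tickerType, derivativeType


# Invented sample data.
ticker = ['COPPER 3 MONTH', 'KRWUSD CURNCY', 'ES_C100', 'CLZ4']
prices = [100.0, 0.00075, 5.0, 80.0]
metalTicks = [['COPPER 3 MONTH'], ['x'], ['y'], ['LMCADS03 COMDTY']]
sasTick = np.array(['ES_C100', 'NQ_P50'])
bbgTick = np.array(['ESZ4C 100 INDEX', 'NQZ4P 50 INDEX'])

longTicker, tickerType, derivativeType = classify(
    ticker, prices, metalTicks, sasTick, bbgTick
)
tickerList = dict(zip(longTicker, derivativeType))
```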

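  • The important thing to remember is that you should not write MATLAB code in Python. Although MATLAB code can often be translated directly into Python, efficient, well-written MATLAB code usually does not translate into efficient, well-written Python. It is better to work out what you are trying to accomplish and then figure out how to do it the Python way. (2 upvotes)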