Applying a function along a numpy array

Mel*_*art 8 python numpy

I have the following numpy ndarray.

[ -0.54761371  17.04850603   4.86054302]

I want to apply this function to all elements of the array:

def sigmoid(x):
  return 1 / (1 + math.exp(-x))

probabilities = np.apply_along_axis(sigmoid, -1, scores)

This is the error I get.

TypeError: only length-1 arrays can be converted to Python scalars

What am I doing wrong?

Ser*_*ity 11

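numpy.apply_along_axis is not suited to this purpose: it passes each 1-D slice of the array to your function as a whole, and math.exp accepts only scalars, hence the TypeError. Try numpy.vectorize to vectorize your function instead: https://docs.scipy.org/doc/numpy/reference/generated/numpy.vectorize.html This defines a vectorized function that takes a nested sequence of objects or numpy arrays as inputs and returns a single numpy array or a tuple of numpy arrays as output.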

import numpy as np
import math

# custom function
def sigmoid(x):
  return 1 / (1 + math.exp(-x))

# define vectorized sigmoid
sigmoid_v = np.vectorize(sigmoid)

# test
scores = np.array([ -0.54761371,  17.04850603,   4.86054302])
print(sigmoid_v(scores))

Output: [ 0.36641822 0.99999996 0.99231327]
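To make the cause of the original error concrete, here is a minimal sketch of what np.apply_along_axis passes to the function. The names `scalar_sigmoid` and `slice_size` are illustrative helpers, not part of the question, and the exact wording of the error message may differ between numpy versions.

```python
import math
import numpy as np

scores = np.array([-0.54761371, 17.04850603, 4.86054302])

def scalar_sigmoid(x):
    # math.exp only accepts a single number, not an array
    return 1 / (1 + math.exp(-x))

def slice_size(x):
    # Hypothetical helper: report how many elements arrive in one call
    return x.size

# For a 1-D array the function is called once, with the whole array
received = np.apply_along_axis(slice_size, -1, scores)

try:
    np.apply_along_axis(scalar_sigmoid, -1, scores)
    error_message = None
except TypeError as exc:
    # math.exp cannot convert a length-3 array to a Python float
    error_message = str(exc)

print(int(received), error_message)
```

The function receives all three elements in a single call, which is why the scalar-only math.exp raises.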

Performance tests show that scipy.special.expit is the best solution for computing the logistic function, and the vectorized variant is the worst:

import numpy as np
import math
import timeit

def sigmoid_(x):
  return 1 / (1 + math.exp(-x))
sigmoidv = np.vectorize(sigmoid_)

def sigmoid(x):
  return 1 / (1 + np.exp(-x))

# Time all three variants (25 runs each) on arrays of increasing size
for n in (100, 1000, 10000):
  make_scores = "scores = np.random.randn(%d)" % n
  print(timeit.timeit("sigmoidv(scores)", "from __main__ import sigmoidv, np; " + make_scores, number=25),
        timeit.timeit("sigmoid(scores)", "from __main__ import sigmoid, np; " + make_scores, number=25),
        timeit.timeit("expit(scores)", "from scipy.special import expit; import numpy as np; " + make_scores, number=25))

Results (total seconds for 25 runs):

size      vectorized        numpy              expit
N=100:    0.00179314613342  0.000460863113403  0.000132083892822
N=1000:   0.0122890472412   0.00084114074707   0.000464916229248
N=10000:  0.109477043152    0.00530695915222   0.00424313545227

  • Worth noting: "The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop." (3 upvotes)
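As a rough illustration of that remark, the sketch below compares np.vectorize with an explicit Python loop over the same scalar function. The names `scalar_sigmoid`, `vectorized`, and `looped` are illustrative; the point is only that both produce the same values, one element at a time.

```python
import math
import numpy as np

def scalar_sigmoid(x):
    # Scalar-only logistic function, as in the question
    return 1 / (1 + math.exp(-x))

scores = np.array([-0.54761371, 17.04850603, 4.86054302])

# np.vectorize calls scalar_sigmoid once per element
vectorized = np.vectorize(scalar_sigmoid)(scores)

# An explicit per-element loop gives the same values
looped = np.array([scalar_sigmoid(x) for x in scores])

print(np.allclose(vectorized, looped))
```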

jua*_*aga 10

Use np.exp, which works on numpy arrays in a vectorized fashion:

>>> def sigmoid(x):
...     return 1 / (1 + np.exp(-x))
...
>>> sigmoid(scores)
array([ 0.36641822,  0.99999996,  0.99231327])
>>>

You probably won't get much faster than this. Consider:

>>> def sigmoid(x):
...     return 1 / (1 + np.exp(-x))
...

And:

>>> def sigmoidv(x):
...   return 1 / (1 + math.exp(-x))
...
>>> vsigmoid = np.vectorize(sigmoidv)

Now, compare the timings. With a small (size 100) array:

>>> t = timeit.timeit("vsigmoid(arr)", "from __main__ import vsigmoid, np; arr = np.random.randn(100)", number=100)
>>> t
0.006894525984534994
>>> t = timeit.timeit("sigmoid(arr)", "from __main__ import sigmoid, np; arr = np.random.randn(100)", number=100)
>>> t
0.0007238480029627681

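So there is already about an order-of-magnitude difference with small arrays. The gap does not shrink as the array grows; with a size-10,000 array it is roughly 30x: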

>>> t = timeit.timeit("vsigmoid(arr)", "from __main__ import vsigmoid, np; arr = np.random.randn(10000)", number=100)
>>> t
0.3823414359940216
>>> t = timeit.timeit("sigmoid(arr)", "from __main__ import sigmoid, np; arr = np.random.randn(10000)", number=100)
>>> t
0.011259705002885312

And finally with a size-100,000 array:

>>> t = timeit.timeit("vsigmoid(arr)", "from __main__ import vsigmoid, np; arr = np.random.randn(100000)", number=100)
>>> t
3.7680041620042175
>>> t = timeit.timeit("sigmoid(arr)", "from __main__ import sigmoid, np; arr = np.random.randn(100000)", number=100)
>>> t
0.09544878199812956
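The comparison above can be wrapped in a small script for re-running on your own machine. This is only a sketch: `compare_timings` is a hypothetical helper, the array sizes and run counts are arbitrary, and absolute timings will vary by hardware and numpy version.

```python
import math
import timeit
import numpy as np

def sigmoid(x):
    # Array-aware version: np.exp operates on the whole array at once
    return 1 / (1 + np.exp(-x))

# Scalar version wrapped with np.vectorize (a per-element Python loop)
vsigmoid = np.vectorize(lambda x: 1 / (1 + math.exp(-x)))

def compare_timings(size, number=100, seed=0):
    # Hypothetical helper: time both variants on the same random array
    arr = np.random.default_rng(seed).standard_normal(size)
    return {
        "vectorize": timeit.timeit(lambda: vsigmoid(arr), number=number),
        "np.exp": timeit.timeit(lambda: sigmoid(arr), number=number),
    }

for size in (100, 10_000):
    timings = compare_timings(size, number=20)
    print(size, timings)
```

Passing callables to timeit.timeit avoids the `from __main__ import ...` setup strings, at the cost of a small lambda-call overhead in each measurement.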