My custom distance for Bayesian optimization runs very slowly in the scipy.spatial.distance.cdist function


Hi, I am trying to perform Bayesian optimization with bayesian_optimization and a custom kernel function (specifically, a version of the RBF kernel that uses the Kendall distance).

I am trying to pass the Kendall distance as the metric parameter to the cdist and pdist functions in scipy.spatial.distance. I am reusing the code of scipy.stats.kendalltau; specifically, my Kendall distance is defined as follows:

import numpy as np
from scipy.stats._stats import _kendall_dis  # Cython helper reused from scipy.stats.kendalltau

def kendall_distance(x, y):
    perm = np.argsort(y)  # sort on y and convert y to dense ranks
    x, y = x[perm], y[perm]
    y = np.r_[True, y[1:] != y[:-1]].cumsum(dtype=np.intp)

    # stable sort on x and convert x to dense ranks
    perm = np.argsort(x, kind='mergesort')
    x, y = x[perm], y[perm]
    x = np.r_[True, x[1:] != x[:-1]].cumsum(dtype=np.intp)

    dis = _kendall_dis(x, y)  # discordant pairs
    return dis
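As a quick sanity check (not part of the notebook, the values below are made up), the output can be compared against scipy.stats.kendalltau, whose internals this code is taken from:

import numpy as np
from scipy.stats import kendalltau

# made-up example vectors, just to exercise kendall_distance
x = np.array([3.0, 1.0, 4.0, 1.5, 5.0])
y = np.array([2.0, 0.5, 4.0, 1.0, 3.0])

print(kendall_distance(x, y))   # number of discordant pairs
print(kendalltau(x, y))         # tau computed by scipy, for comparison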

With this distance, I define my custom kernel function:

import numpy as np
from scipy.spatial.distance import pdist, cdist, squareform
from sklearn.gaussian_process.kernels import (Kernel, StationaryKernelMixin,
                                              NormalizedKernelMixin,
                                              Hyperparameter, _check_length_scale)

class PermutationRBF(StationaryKernelMixin, NormalizedKernelMixin, Kernel):
    def __init__(self, alpha=1.0, alpha_bounds=(1e-5, 1e5)):
        self.alpha = alpha
        self.alpha_bounds = alpha_bounds

    @property
    def anisotropic(self):
        return np.iterable(self.alpha) and len(self.alpha) > 1

    @property
    def hyperparameter_length_scale(self):
        if self.anisotropic:
            return Hyperparameter("length_scale", "numeric",
                                  self.alpha_bounds,
                                  len(self.alpha))
        return Hyperparameter(
            "alpha", "numeric", self.alpha_bounds)

    def __call__(self, X, Y=None, eval_gradient=False):
        X = np.atleast_2d(X)
        alpha = _check_length_scale(X, self.alpha)
        if Y is None:
            dists = pdist(X / alpha, kendall_distance) # First change
            K = np.exp(-.5 * dists)
            # convert from upper-triangular matrix to square matrix
            K = squareform(K)
            np.fill_diagonal(K, 1)
        else:
            if eval_gradient:
                raise ValueError(
                    "Gradient can only be evaluated when Y is None.")
            dists = cdist(X / alpha, Y / alpha, kendall_distance) # Second change
            K = np.exp(-.5 * dists)
        if eval_gradient:
            if self.hyperparameter_length_scale.fixed:
                # Hyperparameter l kept fixed
                return K, np.empty((X.shape[0], X.shape[0], 0))
            elif not self.anisotropic or alpha.shape[0] == 1:
                K_gradient = \
                    (K * squareform(dists))[:, :, np.newaxis]
                return K, K_gradient
            elif self.anisotropic:
                # We need to recompute the pairwise dimension-wise distances
                K_gradient = (X[:, np.newaxis, :] - X[np.newaxis, :, :]) ** 2 \
                    / (alpha ** 2)
                K_gradient *= K[..., np.newaxis]
                return K, K_gradient
        else:
            return K

The only change with respect to the original version (marked with comments) is passing the kendall_distance function to the cdist and pdist functions.
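For reference, the kernel can be exercised on its own with scikit-learn's GaussianProcessRegressor, along the lines of the following sketch (the data is made up; the actual notebook drives the kernel through the bayesian_optimization package):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# made-up toy data: 20 permutations of length 8 with random objective values
rng = np.random.default_rng(0)
X = np.array([rng.permutation(8) for _ in range(20)], dtype=float)
y = rng.random(20)

gp = GaussianProcessRegressor(kernel=PermutationRBF(alpha=1.0),
                              optimizer=None,   # skip hyperparameter fitting in this sketch
                              alpha=1e-6)       # jitter for numerical stability
gp.fit(X, y)
mean, std = gp.predict(X[:5], return_std=True)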

The problem appears when I run the optimization with this kernel: the performance is very slow compared to RBF. Computing the Kendall distance is O(n log n), more expensive than the O(n) Euclidean distance, but for small sizes the difference should not be this noticeable.

Custom kernel

iteration:  0
time:  0.0008704662322998047
iteration:  1
time:  1.2141697406768799
iteration:  2
time:  2.3469510078430176
iteration:  3
time:  3.5015127658843994
iteration:  4
time:  4.695566892623901

RBF kernel


iteration:  0
time:  0.000415802001953125
iteration:  1
time:  0.020179033279418945
iteration:  2
time:  0.033345937728881836
iteration:  3
time:  0.033483028411865234
iteration:  4
time:  0.0286252498626709

You can see the complete notebook and the results here.

I think the problem is that when I call pdist and cdist with one of the built-in distances (e.g., the Euclidean distance), these functions run the following code:

    elif isinstance(metric, str):
        mstr = metric.lower()

        mstr, kwargs = _select_weighted_metric(mstr, kwargs, out)

        metric_name = _METRIC_ALIAS.get(mstr, None)

        if metric_name is not None:
            X, typ, kwargs = _validate_pdist_input(X, m, n,
                                                   metric_name, **kwargs)

            if 'w' in kwargs:
                metric_name = _C_WEIGHTED_METRICS.get(metric_name, metric_name)

            # get pdist wrapper
            pdist_fn = getattr(_distance_wrap,
                               "pdist_%s_%s_wrap" % (metric_name, typ))
            pdist_fn(X, dm, **kwargs)
            return dm

which I assume is implemented and optimized in C. On the other hand, if I use a custom metric, the code snippet that gets executed is:

    if callable(metric):
        mstr = getattr(metric, '__name__', 'UnknownCustomMetric')
        metric_name = _METRIC_ALIAS.get(mstr, None)

        if metric_name is not None:
            X, typ, kwargs = _validate_pdist_input(X, m, n,
                                                   metric_name, **kwargs)

        k = 0
        for i in range(0, m - 1):
            for j in range(i + 1, m):
                dm[k] = metric(X[i], X[j], **kwargs)
                k = k + 1

which is pure Python and therefore much slower by default.

Could this be the cause of the bottleneck? If so, I need the execution to be more efficient; is there any way to improve it?
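For illustration, one alternative that avoids the Python loop entirely would be to compute the whole distance matrix with NumPy broadcasting; this is only a sketch (kendall_pdist_vectorized is my own name and is not in the notebook, ties are treated as concordant, and memory grows as O(m^2 * n^2)):

import numpy as np

def kendall_pdist_vectorized(X):
    # returns a full (m, m) matrix of pairwise discordant-pair counts
    m, n = X.shape
    S = np.sign(X[:, :, None] - X[:, None, :])    # (m, n, n) sign pattern of each row
    prod = S[:, None, :, :] * S[None, :, :, :]    # (m, m, n, n) products between rows
    return (prod < 0).sum(axis=(2, 3)) // 2       # each discordant pair (i, j) is counted twice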

I leave the test code here again: Complete Notebook

EDIT with new information:

As suggested in the comments, I used line_profiler to locate the bottleneck; these are the simplified results obtained:

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
.
.
.
58         1        879.0    879.0     92.4          dm = calculate_pdist_dm(metric,dm,m,X)
.
.
.

where calculate_pdist_dm is the following:

def calculate_pdist_dm(metric,dm,m,X):
    k = 0
    for i in range(0, m - 1):
        for j in range(i + 1, m):
            dm[k] = metric(X[i], X[j])
            k = k + 1
    return dm

The metric function is the Kendall distance shown earlier, and its timing results are the following.

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     2                                           @do_profile()
     3                                           def kendall_distance(x,y):
     4         1         11.0     11.0      6.9      perm = np.argsort(y)  # sort on y and convert y to dense ranks
     5         1          1.0      1.0      0.6      x, y = x[perm], y[perm]
     6         1         59.0     59.0     37.1      y = np.r_[True, y[1:] != y[:-1]].cumsum(dtype=np.intp)
     7                                           
     8                                               # stable sort on x and convert x to dense ranks
     9         1          9.0      9.0      5.7      perm = np.argsort(x, kind='mergesort')
    10         1          1.0      1.0      0.6      x, y = x[perm], y[perm]
    11         1         50.0     50.0     31.4      x = np.r_[True, x[1:] != x[:-1]].cumsum(dtype=np.intp)
    12                                           
    13         1         28.0     28.0     17.6      dis = _kendall_dis(x, y)  # discordant pairs
    14         1          0.0      0.0      0.0      return dis

Timer unit: 1e-06 s

Total time: 0.000159 s
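For completeness, these line-by-line numbers come from line_profiler; the notebook wraps it in a small do_profile decorator, but the measurement boils down to something like this sketch (the test vectors are made up):

import numpy as np
from line_profiler import LineProfiler

lp = LineProfiler()
lp.add_function(kendall_distance)   # profile the metric line by line

rng = np.random.default_rng(0)
x = rng.permutation(50).astype(float)
y = rng.permutation(50).astype(float)

lp.runcall(kendall_distance, x, y)
lp.print_stats()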

Also, if I implement the Kendall distance in pure Python,

def kendall_distance(x,y):
    distance = 0
    for i in range(len(x)):
        for j in range(i,len(x)):
            a = x[i] - x[j]
            b = y[i] - y[j]
            if (a * b < 0):
                distance += 1
    return distance

I can use numba to run it more efficiently, but it is still far from the efficiency of the built-in distances (below are the results for the later iterations 95-99, where the Bayesian optimization is computationally more expensive). Is there any way to improve this?
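The numba version I am referring to is essentially the pure-Python function above with an @njit decorator, along these lines (a sketch; the exact code is in the notebook linked at the end):

from numba import njit

@njit(cache=True)
def kendall_distance_numba(x, y):
    # same O(n^2) discordant-pair count as the pure-Python version above
    distance = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            if (x[i] - x[j]) * (y[i] - y[j]) < 0:
                distance += 1
    return distance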

Custom kernel with Kendall distance and numba

iteration:  95
time:  0.3591618537902832
iteration:  96
time:  0.41269588470458984
iteration:  97
time:  0.40320730209350586
iteration:  98
time:  0.40665769577026367
iteration:  99
time:  0.37867259979248047

RBF kernel with scipy's built-in Euclidean distance

iteration:  95
time:  0.04483485221862793
iteration:  96
time:  0.046830177307128906
iteration:  97
time:  0.03493475914001465
iteration:  98
time:  0.03614163398742676
iteration:  99
time:  0.042229413986206055

The new code using numba is here.