Fast way to convert an upper triangular matrix into a symmetric matrix

Question

I have an upper triangular matrix of np.float64 values, like this:

array([[ 1.,  2.,  3.,  4.],
       [ 0.,  5.,  6.,  7.],
       [ 0.,  0.,  8.,  9.],
       [ 0.,  0.,  0., 10.]])

I want to convert it into the corresponding symmetric matrix, like this:

array([[ 1.,  2.,  3.,  4.],
       [ 2.,  5.,  6.,  7.],
       [ 3.,  6.,  8.,  9.],
       [ 4.,  7.,  9., 10.]])

The conversion can be done in place or can produce a new matrix. I want it to be as fast as possible. How can I do this quickly?

Answer 1

Pau*_*zer 6

For the out-of-place case without any caching, np.where looks quite fast:

np.where(ut,ut,ut.T)

On my laptop:

from timeit import timeit

timeit(lambda:np.where(ut,ut,ut.T))  # total seconds for the default 1,000,000 calls
# 1.909718865994364
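
To make the np.where trick concrete, here is a minimal sketch applied to the matrix from the question. It assumes the part below the diagonal is exactly zero: nonzero entries are taken from ut, and zero entries fall back to the mirrored entry in ut.T. A zero inside the upper triangle is harmless, because its mirror image is zero as well.

```python
import numpy as np

# The upper triangular matrix from the question.
ut = np.array([[1., 2., 3., 4.],
               [0., 5., 6., 7.],
               [0., 0., 8., 9.],
               [0., 0., 0., 10.]])

# Where ut is nonzero keep ut, otherwise take the mirrored entry from ut.T.
# This builds a new array and leaves ut untouched.
sym = np.where(ut, ut, ut.T)

expected = np.array([[1., 2., 3., 4.],
                     [2., 5., 6., 7.],
                     [3., 6., 8., 9.],
                     [4., 7., 9., 10.]])
```

If the lower triangle held leftover nonzero values instead of zeros, this expression would keep them, so it only fits truly upper triangular input.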

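If you have pythran installed, you can get roughly a 3x speedup with almost zero effort. Note, however, that as far as I know pythran (currently) only understands contiguous arrays.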

File <upp2sym.py>, compiled with pythran -O3 upp2sym.py

import numpy as np

#pythran export upp2sym(float[:,:])

def upp2sym(a):
    return np.where(a,a,a.T)

Timing:

from upp2sym import *

timeit(lambda:upp2sym(ut))
# 0.5760842661838979
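
Regarding the contiguity caveat mentioned above, one possible workaround (an assumption on my part, not something tested against pythran here) is to hand the compiled function a C-contiguous copy. The sketch below only shows the NumPy side, using np.ascontiguousarray on a strided view; the names big, view and contig are purely illustrative.

```python
import numpy as np

# An 8x8 upper triangular matrix and a strided 4x4 view into it.
big = np.triu(np.arange(1., 65.).reshape(8, 8))
view = big[::2, ::2]                 # every second row and column, not C-contiguous

# np.ascontiguousarray returns a C-contiguous array with the same values;
# for a strided view like this one, that means a fresh copy.
contig = np.ascontiguousarray(view)

# view.flags['C_CONTIGUOUS'] is False here, contig.flags['C_CONTIGUOUS'] is True.
# An in-place routine applied to `contig` would not modify `view` or `big`.
```

The extra copy has a cost of its own, so this only makes sense when the input really is non-contiguous.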

This is almost as fast as looping:

#pythran export upp2sym_loop(float[:,:])

def upp2sym_loop(a):
    out = np.empty_like(a)
    for i in range(len(a)):
        out[i,i] = a[i,i]
        for j in range(i):
            out[i,j] = out[j,i] = a[j,i]
    return out

Timing:

timeit(lambda:upp2sym_loop(ut))
# 0.4794591029640287

We can also do it in place:

#pythran export upp2sym_inplace(float[:,:])

def upp2sym_inplace(a):
    for i in range(len(a)):
        for j in range(i):
            a[i,j] = a[j,i]

Timing:

timeit(lambda:upp2sym_inplace(ut))
# 0.28711927914991975
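
Since the pythran export lines are ordinary comments, the three functions above are also valid plain Python. A small sketch like the following can therefore check that they agree before anything is compiled; the random 5x5 test matrix and the variable names are my own choices for illustration.

```python
import numpy as np

def upp2sym(a):
    return np.where(a, a, a.T)

def upp2sym_loop(a):
    out = np.empty_like(a)
    for i in range(len(a)):
        out[i, i] = a[i, i]
        for j in range(i):
            out[i, j] = out[j, i] = a[j, i]
    return out

def upp2sym_inplace(a):
    for i in range(len(a)):
        for j in range(i):
            a[i, j] = a[j, i]

# Random upper triangular test matrix (seeded so the run is repeatable).
rng = np.random.default_rng(0)
ut = np.triu(rng.random((5, 5)))

ref = upp2sym(ut)                 # new array
loop_result = upp2sym_loop(ut)    # new array
inplace_result = ut.copy()        # copy first, because the next call overwrites its input
upp2sym_inplace(inplace_result)
```

Note that the in-place variant returns None and mutates its argument, which is why the sketch works on a copy.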

Answer 2

Ker*_*ley 4

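This is the fastest routine I have found so far that does not use Cython or a JIT such as Numba. On my machine it takes about 1.6 μs to process a 4x4 array (average time over a list of 100K 4x4 arrays):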

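import numpy as np

# Cache of boolean masks for the strict lower triangle, keyed by matrix size
inds_cache = {}

def upper_triangular_to_symmetric(ut):
    n = ut.shape[0]
    try:
        inds = inds_cache[n]
    except KeyError:
        # plain bool works across NumPy versions (np.bool is unavailable in NumPy 1.24-1.26)
        inds = np.tri(n, k=-1, dtype=bool)
        inds_cache[n] = inds
    ut[inds] = ut.T[inds]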
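
To show what the cached mask actually contains, here is a small sketch with the cache stripped out. np.tri(n, k=-1, dtype=bool) is True strictly below the diagonal, so the masked assignment copies each upper entry onto its mirror position. The 4x4 example values are taken from the question.

```python
import numpy as np

n = 4
# Boolean mask: True strictly below the main diagonal, False elsewhere.
inds = np.tri(n, k=-1, dtype=bool)

ut = np.array([[1., 2., 3., 4.],
               [0., 5., 6., 7.],
               [0., 0., 8., 9.],
               [0., 0., 0., 10.]])

# ut.T[inds] picks the upper-triangle values in mirrored order (as a copy),
# and the assignment writes them into the lower triangle of ut in place.
ut[inds] = ut.T[inds]
```

Building the mask is the expensive part for small matrices, which is presumably why caching it per size pays off in the timings quoted here.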

Here are some other approaches I tried that were not as fast:

The code above, but without the cache. Takes about 8.3 μs per 4x4 array:

def upper_triangular_to_symmetric(ut):
    n = ut.shape[0]
    # plain bool works across NumPy versions (np.bool is unavailable in NumPy 1.24-1.26)
    inds = np.tri(n, k=-1, dtype=bool)
    ut[inds] = ut.T[inds]

A plain Python nested loop. Takes about 2.5 μs per 4x4 array:

def upper_triangular_to_symmetric(ut):
    n = ut.shape[0]
    for r in range(1, n):
        for c in range(r):
            ut[r, c] = ut[c, r]

Floating-point addition using np.triu. Takes about 11.9 μs per 4x4 array:

def upper_triangular_to_symmetric(ut):
    ut += np.triu(ut, k=1).T
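
One property of the addition-based variant that is easy to miss: it relies on the lower triangle being zero, so unlike the copy-based versions it is not safe to apply twice. The following sketch, using a small 3x3 matrix of my own choosing, illustrates this.

```python
import numpy as np

ut = np.array([[1., 2., 3.],
               [0., 4., 5.],
               [0., 0., 6.]])

# np.triu(ut, k=1) keeps only the entries strictly above the diagonal;
# transposing moves them below the diagonal, and += adds them onto ut.
ut += np.triu(ut, k=1).T
once = ut.copy()

# Because this is an addition rather than a copy, a second call is not a no-op:
# the lower triangle receives the mirrored values a second time.
ut += np.triu(ut, k=1).T
twice = ut.copy()
```

After the first call the matrix is symmetric; after the second call the entries below the diagonal are doubled while the upper triangle is unchanged.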

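A Numba version of the Python nested loop. This is the fastest thing I found (about 0.4 μs per 4x4 array), and it is what I ended up using in production, at least until I started running into problems with Numba and had to revert to the pure Python version: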

import numba

@numba.njit()
def upper_triangular_to_symmetric(ut):
    n = ut.shape[0]
    for r in range(1, n):
        for c in range(r):
            ut[r, c] = ut[c, r]

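A Cython version of the Python nested loop. I am new to Cython, so this may not be fully optimized. Since Cython adds operational overhead, I am interested in hearing both Cython and pure-NumPy answers. Takes about 0.6 μs per 4x4 array: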

cimport numpy as np
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
def upper_triangular_to_symmetric(np.ndarray[np.float64_t, ndim=2] ut):
    cdef int n, r, c
    n = ut.shape[0]
    for r in range(1, n):
        for c in range(r):
            ut[r, c] = ut[c, r]

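@MarkDickinson That is also slow (~5.7 μs). The problem is that you are performing floating-point operations (addition and multiplication), which is much slower than simply copying data. (2 upvotes)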
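
In the spirit of the comment above, here is a sketch of another copy-only variant that avoids arithmetic altogether, this time using np.triu_indices. It is not from the original thread, it has not been timed here, and the function name mirror_upper_copy_only is hypothetical.

```python
import numpy as np

def mirror_upper_copy_only(ut):
    """Copy the strict upper triangle onto the lower triangle, in place."""
    # Row and column indices of all entries strictly above the diagonal.
    rows, cols = np.triu_indices(ut.shape[0], k=1)
    # Swapping the index arrays addresses the mirrored positions.
    ut[cols, rows] = ut[rows, cols]

a = np.array([[1., 2., 3., 4.],
              [0., 5., 6., 7.],
              [0., 0., 8., 9.],
              [0., 0., 0., 10.]])
mirror_upper_copy_only(a)
```

As with the cached mask in the answer, the index arrays could be computed once per matrix size and reused if this turned out to be worth benchmarking.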
