I have a matrix X, and I am computing a weighted sum of intermediate matrix products. Here is a minimal reproducible example:
import numpy as np

random_state = np.random.RandomState(1)

n = 5
p = 10
X = random_state.rand(p, n)  # 10x5
X_sum = np.zeros((n, n))  # 5x5

# The length of weights is not related to X's dims,
# but will always be smaller
y = 3
weights = random_state.rand(y)

for k in range(y):
    X_sum += np.dot(X.T[:, k + 1:],
                    X[:p - (k + 1), :]) * weights[k]
This works fine and produces the result I expect. However, as n and y grow (into the hundreds), repeatedly computing the matrix products becomes very expensive...

There is, however, a clear pattern in how the products are computed: with each iteration, the starting column slice of X.T moves to the right, while the ending row slice of X moves up.

This effectively means that the same subsets of values are multiplied over and over (see Edit 2), which looks like an opportunity to exploit... (i.e., if I computed the product manually in a single pass).

But I am hoping not to have to do anything by hand, and that there is a nice way to implement this whole loop more elegantly with NumPy.
A realistic set of numbers:
n = 400
p = 2000
y = 750
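For what it's worth, here is one possible algebraic rewrite (my own sketch, not from the original post): each term of the loop sums contributions of the form X[m + k + 1]ᵀ · X[m], i.e. row pairs at a fixed offset k + 1. Placing weights[k] on subdiagonal k + 1 of a p-by-p band matrix S therefore collapses the whole loop into a single product X.T @ S @ X:

```python
import numpy as np

random_state = np.random.RandomState(1)
n, p, y = 5, 10, 3
X = random_state.rand(p, n)
weights = random_state.rand(y)

# Reference: the original loop.
X_sum = np.zeros((n, n))
for k in range(y):
    X_sum += np.dot(X.T[:, k + 1:], X[:p - (k + 1), :]) * weights[k]

# Single-pass alternative: weights[k] goes on subdiagonal k + 1 of S,
# so the whole weighted sum becomes one sandwich product.
S = sum(w * np.eye(p, k=-(i + 1)) for i, w in enumerate(weights))
X_sum_band = X.T @ S @ X

assert np.allclose(X_sum, X_sum_band)
```

Note that a dense S is p x p, so for p in the thousands one would want a banded or sparse representation of S (e.g. `scipy.sparse.diags`) rather than `np.eye` sums; this is only a sketch of the identity.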
In response to the comment:

Can you explain what values are multiplied repeatedly?
Consider the following arrays:
n = p = 5
X = np.arange(25).reshape(p, n)
For k=0, the first product is between A and B:
k = 0
A = X.T[:, k + 1:]
B = X[:p - (k + 1), :]
>>> A
array([[ 5, 10, 15, 20],
[ 6, 11, 16, 21],
[ 7, 12, 17, 22],
[ 8, 13, 18, 23],
[ 9, 14, 19, 24]])
>>> B
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
And when k=1:
k = 1
>>> A
array([[10, 15, 20],
[11, 16, 21],
[12, 17, 22],
[13, 18, 23],
[14, 19, 24]])
>>> B
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
So each subsequent matrix product operates on a subset of the previous product's operands, if that makes sense.
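To make that "subset" observation concrete, a small check (using the same 5 x 5 example) that the k = 1 operands are slices of the k = 0 operands:

```python
import numpy as np

n = p = 5
X = np.arange(25).reshape(p, n)

A0, B0 = X.T[:, 1:], X[:p - 1, :]  # operands at k = 0
A1, B1 = X.T[:, 2:], X[:p - 2, :]  # operands at k = 1

# Each subsequent A drops its leftmost column; each B drops its last row.
assert np.array_equal(A1, A0[:, 1:])
assert np.array_equal(B1, B0[:-1, :])
```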
TL;DR: I would go with @Parfait's test_gen_sum, based on benchmarks across various values of n, p, and y. The old answer is kept below for continuity.
How n, p, and y affect the choice of algorithm: this analysis was done using @Parfait's functions, as a way of determining whether there really is one best solution, or a family of solutions depending on the values of n, p, and y.
import numpy as np
import pytest  # this code also requires the pytest-benchmark plugin
from functools import reduce  # needed by test_reduce_sum


def test_for_sum(n, p, y):
    random_state = np.random.RandomState(1)
    X = random_state.rand(p, n)
    X_sum = np.zeros((n, n))
    # The length of weights is not related to X's dims,
    # but will always be smaller
    weights = random_state.rand(y)
    for k in range(y):
        X_sum += np.dot(X.T[:, k + 1:],
                        X[:p - (k + 1), :]) * weights[k]
    return X_sum


def test_list_sum(n, p, y):
    random_state = np.random.RandomState(1)
    X = random_state.rand(p, n)
    # The length of weights is not related to X's dims,
    # but will always be smaller
    weights = random_state.rand(y)
    matrix_list = [np.dot(X.T[:, k + 1:],
                          X[:p - (k + 1), :]) * weights[k] for k in range(y)]
    X_sum = np.sum(matrix_list, axis=0)
    return X_sum


def test_reduce_sum(n, p, y):
    random_state = np.random.RandomState(1)
    X = random_state.rand(p, n)
    # The length of weights is not related to X's dims,
    # but will always be smaller
    weights = random_state.rand(y)
    matrix_list = [(X.T[:, k + 1:] @
                    X[:p - (k + 1), :]) * weights[k] for k in range(y)]
    X_sum = reduce(lambda a, b: a + b, matrix_list)
    return X_sum


def test_concat_sum(n, p, y):
    random_state = np.random.RandomState(1)
    X = random_state.rand(p, n)
    # The length of weights is not related to X's dims,
    # but will always be smaller
    weights = random_state.rand(y)
    x_mat = np.concatenate([np.matmul(X.T[:, k + 1:],
                                      X[:p - (k + 1), :]) for k in range(y)])
    wgt_mat = np.concatenate([np.full((n, 1), weights[k]) for k in range(y)])
    mul_res = x_mat * wgt_mat
    X_sum = mul_res.reshape(-1, n, n).sum(axis=0)
    return X_sum


def test_matmul_sum(n, p, y):
    random_state = np.random.RandomState(1)
    X = random_state.rand(p, n)
    # The length of weights is not related to X's dims,
    # but will always be smaller
    weights = random_state.rand(y)
    # Use a list comprehension and np.matmul
    matrices_list = [np.matmul(X.T[:, k + 1:],
                               X[:p - (k + 1), :]) * weights[k] for k in range(y)]
    # Sum the matrices in the list to get the final result
    X_sum = np.sum(matrices_list, axis=0)
    return X_sum


def test_gen_sum(n, p, y):
    random_state = np.random.RandomState(1)
    X = random_state.rand(p, n)
    # The length of weights is not related to X's dims,
    # but will always be smaller
    weights = random_state.rand(y)
    matrix_gen = (np.dot(X.T[:, k + 1:],
                         X[:p - (k + 1), :]) * weights[k] for k in range(y))
    X_sum = sum(matrix_gen)
    return X_sum


parameters = [
    pytest.param(400, 800, 3),
    pytest.param(400, 2000, 3),
    pytest.param(400, 800, 750),
    pytest.param(400, 2000, 750),
]


@pytest.mark.parametrize('n,p,y', parameters)
def test_test_for_sum(benchmark, n, p, y):
    benchmark(test_for_sum, n=n, p=p, y=y)


@pytest.mark.parametrize('n,p,y', parameters)
def test_test_list_sum(benchmark, n, p, y):
    benchmark(test_list_sum, n=n, p=p, y=y)


@pytest.mark.parametrize('n,p,y', parameters)
def test_test_reduce_sum(benchmark, n, p, y):
    benchmark(test_reduce_sum, n=n, p=p, y=y)


@pytest.mark.parametrize('n,p,y', parameters)
def test_test_concat_sum(benchmark, n, p, y):
    benchmark(test_concat_sum, n=n, p=p, y=y)


@pytest.mark.parametrize('n,p,y', parameters)
def test_test_matmul_sum(benchmark, n, p, y):
    benchmark(test_matmul_sum, n=n, p=p, y=y)


@pytest.mark.parametrize('n,p,y', parameters)
def test_test_gen_sum(benchmark, n, p, y):
    benchmark(test_gen_sum, n=n, p=p, y=y)
n=400, p=800, y=3 (100 iterations)
n=400, p=2000, y=3 (100 iterations)
n=400, p=800, y=750 (10 iterations)
n=400, p=2000, y=750 (10 iterations)
For small y values: I would definitely use np.matmul instead of np.dot; that swap alone gives you the biggest performance gain. In fact, the np.dot documentation itself points you to np.matmul for 2-D array multiplication.

I tested np.dot and np.matmul, each with and without a list comprehension, and the pytest-benchmark results are below.

(As an aside, pytest-benchmark is very flexible, and I highly recommend it in situations like this for verifying whether an approach really pays off.)

Using a list comprehension alone has an almost negligible effect on the np.matmul results, and actually hurts the np.dot results (though it is better form); the combination of both changes, however, produced the best results. Be warned that using a list comprehension tends to increase the standard deviation of the runtimes, so you may see a wider range of runtime performance than with np.matmul alone.
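As a quick sanity check (a minimal sketch with made-up arrays), np.dot, np.matmul, and the @ operator all agree for 2-D inputs, so swapping one for another changes only performance, not results:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)
B = np.arange(12).reshape(3, 4)

# For 2-D arrays these three spellings compute the same matrix product;
# matmul (or @) is the recommended one for this case.
assert np.array_equal(np.dot(A, B), np.matmul(A, B))
assert np.array_equal(A @ B, np.matmul(A, B))
```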
Here is the code:
import numpy as np

def test_np_matmul_list_comprehension():
    random_state = np.random.RandomState(1)
    n = p = 1000
    X = np.arange(n * n).reshape(p, n)
    # The length of weights is not related to X's dims,
    # but will always be smaller
    y = 3
    weights = [1, 1, 1]
    # Use a list comprehension and np.matmul
    matrices_list = [np.matmul(X.T[:, k + 1:],
                               X[:p - (k + 1), :]) * weights[k] for k in range(y)]
    # Sum the matrices in the list to get the final result
    X_sum = np.sum(matrices_list, axis=0)
For larger y values: it is better not to use a list comprehension. In both the mean and the median, the np.dot runtimes tend to be larger than the np.matmul ones. Here are the pytest-benchmark results for (n=500, p=5000, y=750):
This is probably overkill, but I would rather err on the side of being too helpful :).