dar*_*zig 15 r linear-algebra sparse-matrix matrix-inverse large-data
我有一个方形矩阵,有几万行和一列,只有几1吨0,所以我用这个Matrix包以高效的方式存储在R中.由于base::matrix内存不足,对象无法处理该数量的单元格.
我的问题是我需要逆矩阵以及此类矩阵的Moore-Penrose广义逆,这是我目前无法计算的.
我尝试过的:
solve产生Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)错误MASS::ginv与Matrix班级不相容Matrix为例如bigmemory::big.matrix,而后者也MASS::ginv无论如何也无法工作如果我尝试计算矩阵的Choleski分解以便稍后调用Matrix::chol2inv它,我会收到以下错误消息:
Error in .local(x, ...) :
internal_chm_factor: Cholesky factorization failed
In addition: Warning message:
In .local(x, ...) :
Cholmod warning 'not positive definite' at file ../Cholesky/t_cholmod_rowfac.c, line 431
Run Code Online (Sandbox Code Playgroud)基于相关问题,我还尝试pbdDMAT了单个节点上的包,但pbdDMAT::chol产生了Cholmod error 'out of memory' at file ../Core/cholmod_memory.c, line 147错误消息
问题简而言之:有没有办法计算这种稀疏矩阵的逆和Moore-Penrose广义逆,除了回到matrix在具有大量RAM的计算机上使用类?
快速重现的例子:
library(Matrix)
n <- 1e5
m <- sparseMatrix(1:n, 1:n, x = 1)
m <- do.call(rBind, lapply(1:10, function(i) {
set.seed(i)
m[-sample(1:n, n/3), ]
}))
m <- t(m) %*% m
Run Code Online (Sandbox Code Playgroud)
一些描述(感谢Gabor Grothendieck):
> dim(m)
[1] 100000 100000
> sum(m) / prod(dim(m))
[1] 6.6667e-05
> table(rowSums(m))
0 1 2 3 4 5 6 7 8 9 10
5 28 320 1622 5721 13563 22779 26011 19574 8676 1701
> table(colSums(m))
0 1 2 3 4 5 6 7 8 9 10
5 28 320 1622 5721 13563 22779 26011 19574 8676 1701
Run Code Online (Sandbox Code Playgroud)
还有一些错误信息:
> Matrix::solve(m)
Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)
> base::solve(m)
Error in asMethod(object) :
Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105
> MASS::ginv(m)
Error in MASS::ginv(m) : 'X' must be a numeric or complex matrix
Run Code Online (Sandbox Code Playgroud)
赏金的目标是找到一种计算Moore-Penrose广义逆的方法,m其中RAM小于8Gb(速度和性能并不重要).
如果你只有很少的1,那么你的矩阵在任何一列和任何一行中可能不会超过1,在这种情况下,广义逆等于转置:
library(Matrix)
set.seed(123)
n <- 1e5
m <- sparseMatrix(sample(1:n, n/10), sample(1:n, n/10), x = 1, dims = c(n, n))
##############################################################################
# confirm that m has no more than one 1 in each row and column
##############################################################################
table(rowSums(m)) # here 90,000 rows are all zero, 10,000 have a single one
## 0 1
## 90000 10000
table(colSums(m)) # here 90,000 cols are all zero, 10,000 have a single one
## 0 1
## 90000 10000
##############################################################################
# calculate generalized inverse
##############################################################################
minv <- t(m)
##############################################################################
# check that when multiplied by minv it gives a diagonal matrix of 0's and 1's
##############################################################################
mm <- m %*% minv
table(diag(mm)) # diagonal has 90,000 zeros and 10,000 ones
## 0 1
## 90000 10000
diag(mm) <- 0
range(mm) # off diagonals are all zero
## [1] 0 0
Run Code Online (Sandbox Code Playgroud)
修订改进的演示文稿.
| 归档时间: |
|
| 查看次数: |
2375 次 |
| 最近记录: |