子矩阵的最大行列式

Question

子矩阵的最大行列式

Tho*_*ing 11 optimization r matrix determinants cvxr

假设我们有一个方阵M，例如，

set.seed(1)
M <- matrix(rnorm(5*5), 5, 5)

> M
           [,1]       [,2]       [,3]        [,4]        [,5]
[1,] -0.6264538 -0.8204684  1.5117812 -0.04493361  0.91897737
[2,]  0.1836433  0.4874291  0.3898432 -0.01619026  0.78213630
[3,] -0.8356286  0.7383247 -0.6212406  0.94383621  0.07456498
[4,]  1.5952808  0.5757814 -2.2146999  0.82122120 -1.98935170
[5,]  0.3295078 -0.3053884  1.1249309  0.59390132  0.61982575

Run Code Online (Sandbox Code Playgroud)

我想知道是否有一种有效的方法可以找到一个子矩阵，使其行列式是所有子矩阵中的最大值。矩阵的大小应大于1x1但小于或等于5x5。一些子矩阵示例如下

> M[c(1,5),c(2,3)]
           [,1]     [,2]
[1,] -0.8204684 1.511781
[2,] -0.3053884 1.124931

> M[c(1,2,4),c(1,4,5)]
           [,1]        [,2]       [,3]
[1,] -0.6264538 -0.04493361  0.9189774
[2,]  0.1836433 -0.01619026  0.7821363
[3,]  1.5952808  0.82122120 -1.9893517

> M[1:4,2:5]
           [,1]       [,2]        [,3]        [,4]
[1,] -0.8204684  1.5117812 -0.04493361  0.91897737
[2,]  0.4874291  0.3898432 -0.01619026  0.78213630
[3,]  0.7383247 -0.6212406  0.94383621  0.07456498
[4,]  0.5757814 -2.2146999  0.82122120 -1.98935170

Run Code Online (Sandbox Code Playgroud)

我可以用蛮力的方式来完成，即迭代所有可能的子矩阵，但我相信必须有一些优化方法可以让它更容易。

我更喜欢看到解决方案，CVXR但不确定这个优化问题是否可以用凸方式来表述。有人可以帮忙吗？否则，也欢迎其他优化包！

Answer 1

All*_*ron 7

由于已经四天没有答案了，我想我会用一个可行的通用解决方案让球滚动。不幸的是，它属于蛮力类别，尽管对于 5 x 5 矩阵它相当快，在大约 5 毫秒内完成：

max_det <- function(M) {
  if(diff(dim(M)) != 0) stop("max_det requires a square matrix")
  
  s  <- lapply(seq(dim(M)[1])[-1], function(x) combn(seq(dim(M)[1]), x))
  
  all_dets <- lapply(s, function(m) {
    apply(m, 2, function(i) apply(m, 2, function(j) det(M[j, i])))
    })
  
  i <- which.max(sapply(all_dets, max))
  subs <- which(all_dets[[i]] == max(all_dets[[i]]), arr.ind = TRUE)

  sub_M <- M[s[[i]][,subs[1]], s[[i]][,subs[2]]]
  
  list(max_determinant = det(sub_M),
       indices = list(rows = s[[i]][,subs[1]], columns = s[[i]][,subs[2]]),
       submatrix = sub_M)
}

Run Code Online (Sandbox Code Playgroud)

输出的格式是：

max_det(M)
#> $max_determinant
#> [1] 4.674127
#> 
#> $indices
#> $indices$rows
#> [1] 3 4 5
#> 
#> $indices$columns
#> [1] 1 3 4
#> 
#> 
#> $submatrix
#>            [,1]       [,2]      [,3]
#> [1,] -0.8356286 -0.6212406 0.9438362
#> [2,]  1.5952808 -2.2146999 0.8212212
#> [3,]  0.3295078  1.1249309 0.5939013

Run Code Online (Sandbox Code Playgroud)

问题当然是这不能很好地扩展到更大的矩阵。虽然它仍然有效：

set.seed(1)
M <- matrix(rnorm(10 * 10), 10, 10)

#> max_det(M)
#> $max_determinant
#> [1] 284.5647
#> 
#> $indices
#> $indices$rows
#> [1]  1  3  4  5  6  8  9 10
#> 
#> $indices$columns
#> [1]  2  3  4  6  7  8  9 10
#> 
#> 
#> $submatrix
#>             [,1]        [,2]        [,3]       [,4]        [,5]         [,6]
#> [1,]  1.51178117  0.91897737  1.35867955  0.3981059  2.40161776  0.475509529
#> [2,] -0.62124058  0.07456498  0.38767161  0.3411197  0.68973936  0.610726353
#> [3,] -2.21469989 -1.98935170 -0.05380504 -1.1293631  0.02800216 -0.934097632
#> [4,]  1.12493092  0.61982575 -1.37705956  1.4330237 -0.74327321 -1.253633400
#> [5,] -0.04493361 -0.05612874 -0.41499456  1.9803999  0.18879230  0.291446236
#> [6,]  0.94383621 -1.47075238 -0.05931340 -1.0441346  1.46555486  0.001105352
#> [7,]  0.82122120 -0.47815006  1.10002537  0.5697196  0.15325334  0.074341324
#> [8,]  0.59390132  0.41794156  0.76317575 -0.1350546  2.17261167 -0.589520946
#>            [,7]       [,8]
#> [1,] -0.5686687 -0.5425200
#> [2,]  1.1780870  1.1604026
#> [3,] -1.5235668  0.7002136
#> [4,]  0.5939462  1.5868335
#> [5,]  0.3329504  0.5584864
#> [6,] -0.3041839 -0.5732654
#> [7,]  0.3700188 -1.2246126
#> [8,]  0.2670988 -0.4734006

Run Code Online (Sandbox Code Playgroud)

我花了一秒钟的时间来为 10 x 10 矩阵找到这个解决方案。

我认为这个解决方案是O(n!)复杂度，所以即使比 10 x 10 矩阵大一点，你也可以忘记它。我有一种感觉应该有一个O(n³)解决方案，但我的数学不够好，无法弄清楚。

我想这至少为其他人用更复杂的方法击败它提供了一个基准......

Answer 2

Enr*_*ann 5

我采用了艾伦·卡梅伦 (Allan Cameron) 的解决方案，并将其与启发式阈值接受（TA；模拟退火的一种变体）进行了比较。本质上，它从随机子矩阵开始，然后增量地更改该子矩阵，例如通过交换行索引，或者通过添加或删除列。

解决方案将被编码为列表，给出行索引和列索引。因此，对于大小为 5x5 的矩阵，一种候选解决方案可能是

x
## [[1]]
## [1]  TRUE FALSE FALSE  TRUE FALSE
## 
## [[2]]
## [1]  TRUE FALSE  TRUE FALSE FALSE

Run Code Online (Sandbox Code Playgroud)

这样的解决方案通过邻域函数来改变nb。例如：

nb(x)
## [[1]]
## [1]  TRUE FALSE FALSE  TRUE  TRUE
## 
## [[2]]
## [1]  TRUE FALSE  TRUE  TRUE FALSE
##                       ^^^^^

Run Code Online (Sandbox Code Playgroud)

给定这样的解决方案，我们需要一个目标函数。

OF <- function(x, M)
    -det(M[x[[1]], x[[2]], drop = FALSE])

Run Code Online (Sandbox Code Playgroud)

自从实施 TA 以来，我将使用最小值，我在行列式前面加了一个减号。

邻域函数nb可能是这样的（尽管它肯定可以改进）：

nb <- function(x, ...) {
    if (sum(x[[1L]]) > 0L &&
        sum(x[[1L]]) < length(x[[1L]]) &&
        runif(1) > 0.5) {
        rc <- if (runif(1) > 0.5)
                  1 else 2
        select1 <- which( x[[rc]])
        select2 <- which(!x[[rc]])
        size <- min(length(select1), length(select2))
        size <- sample.int(size, 1)
        i <- select1[sample.int(length(select1), size)]
        j <- select2[sample.int(length(select2), size)]
        x[[rc]][i] <- !x[[rc]][i]
        x[[rc]][j] <- !x[[rc]][j]        
    } else {            
        i <- sample.int(length(x[[1L]]), 1)
        if (x[[1L]][i]) {
            select <- which( x[[2L]])
        } else {
            select <- which(!x[[2L]])
        }
        j <- select[sample.int(length(select), 1)]
        x[[1L]][i] <- !x[[1L]][i]
        x[[2L]][j] <- !x[[2L]][j]
    }
    x
}

Run Code Online (Sandbox Code Playgroud)

本质上，nb抛一枚硬币，然后重新排列行或列索引（即保持子矩阵的大小不变），或者添加或删除行和列。

最后，我创建一个辅助函数来创建随机初始解决方案。

x0 <- function() {
    k <- sample(n, 1)
    x1 <- logical(n)
    x1[sample(n, k)] <- TRUE
    x2 <- sample(x1)
    list(x1, x2)
}

Run Code Online (Sandbox Code Playgroud)

我们可以运行阈值接受。我使用软件包（我维护的）TAopt中提供的名为的实现。NMOF为了获得良好的风格，我重新启动 10 次并保持最佳结果。

n <- 5
M <- matrix(rnorm(n*n), n, n)
max_det(M)$indices
## $rows
## [1] 1 2 4
## 
## $columns
## [1] 2 3 5

library("NMOF")
restartOpt(TAopt, 10, OF,
           list(x0 = x0,
                neighbour = nb,
                printBar = FALSE,
                printDetail = FALSE,
                q = 0.9,
                nI = 1000, drop0 = TRUE),
           M = M, best.only = TRUE)$xbest
## [[1]]
## [1]  TRUE  TRUE FALSE  TRUE FALSE
## 
## [[2]]
## [1] FALSE  TRUE  TRUE FALSE  TRUE

Run Code Online (Sandbox Code Playgroud)

所以我们得到相同的行/列。我进行了以下小实验，将M, 的大小从 2 增加到 20。每次我将 TA 的解决方案与最佳解决方案进行比较，并且还记录 TA 和完整枚举所需的时间（以秒为单位）。

set.seed(134345)
message(format(c("Size",
        "Optimum",
        "TA",
        "Time optimum",
        "Time TA"), width = 13, justify = "right"))
for (i in 2:20) {
    n <- i
    M <- matrix(rnorm(n*n), n, n)
    t.opt <- system.time(opt <- max_det(M)$max_determinant)
    t.ta <- system.time(ta <- -restartOpt(TAopt, 10, OF,
                                    list(x0 = x0,
                                         neighbour = nb,
                                         printBar = FALSE,
                                         printDetail = FALSE,
                                         q = 0.9,
                                         nI = 1000, drop0 = TRUE),
                                    M = M, best.only = TRUE)$OFvalue)

    message(format(i, width = 13),
            format(round(opt, 2),  width = 13),
            format(round(ta, 2),  width = 13),
            format(round(t.opt[[3]],1), width = 13),
            format(round(t.ta[[3]],1), width = 13))
}

Run Code Online (Sandbox Code Playgroud)

结果：

     Size      Optimum           TA Time optimum      Time TA
        2           NA         1.22            0          0.7
        3         1.46         1.46            0          0.6
        4         2.33         2.33            0          0.7
        5        11.75        11.75            0          0.7
        6         9.33         9.33            0          0.7
        7          9.7          9.7            0          0.7
        8       126.38       126.38          0.1          0.7
        9         87.5         87.5          0.3          0.7
       10       198.63       198.63          1.3          0.7
       11      1019.23      1019.23          5.1          0.7
       12     34753.64     34753.64           20          0.7
       13     16122.22     16122.22         80.2          0.7
       14     168943.9     168943.9        325.3          0.7
       15     274669.6     274669.6       1320.8          0.7
       16      5210298      5210298       5215.4          0.7

Run Code Online (Sandbox Code Playgroud)

因此，至少在大小为 16x16 之前，两种方法都会返回相同的结果。但TA需要小于一秒的恒定时间（迭代次数固定为1000）。

归档时间：	5 年，4 月前
查看次数：	432 次
最近记录：	5 年，4 月前