Implement additional constraint variable in integer programming using lpSolve

jas*_*ner 5 r lpsolve

I'm working on an lpSolve solution to optimize a hypothetical daily fantasy baseball problem. I'm having trouble applying my last constraint:

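  • position - exactly 3 outfielders (OF), 2 pitchers (P), and 1 of everything else
  • cost - cost less than 200
  • team - max number from any one team is 6
  • team - minimum number of teams on a roster is 3 **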

Say, for example, you have a data frame of 1000 players with points, cost, position, and team, and you're trying to maximize average points:

library(tidyverse)
library(lpSolve)
set.seed(123)
df <- tibble(avg_points = sample(5:45,1000, replace = TRUE),
             cost = sample(3:45,1000, replace = TRUE),
             position = sample(c("P","C","1B","2B","3B","SS","OF"),1000, replace = TRUE),
             team = sample(LETTERS,1000, replace = TRUE)) %>% mutate(id = row_number())
head(df)

# A tibble: 6 x 5
#  avg_points  cost position team     id
#       <int> <int> <chr>    <chr> <int>
#1         17    13 2B       Y         1
#2         39    45 1B       P         2
#3         29    33 1B       C         3
#4         38    31 2B       V         4
#5         17    13 P        A         5
#6         10     6 SS       V         6

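I've implemented the first 3 constraints with the following code, but I'm having trouble figuring out how to implement the minimum number of teams on a roster. I think I need to add additional variables to the model, but I'm not sure how to do that.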

#set the objective function (what we want to maximize)
obj <- df$avg_points 
# set the constraint rows.
con <- rbind(t(model.matrix(~ position + 0,df)), cost = df$cost, t(model.matrix(~ team + 0, df)) )

#set the constraint values
rhs <- c(1,1,1,1,3,2,1,  # 1. #exactly 3 outfielders 2 pitchers and 1 of everything else
         200, # 2. at a cost less than 200
         rep(6,26) # 3. max number from any team is 6
         ) 

#set the direction of the constraints
dir <- c("=","=","=","=","=","=","=","<=",rep("<=",26))

result <- lp("max",obj,con,dir,rhs,all.bin = TRUE)

If it helps, I'm trying to replicate this paper (with minor tweaks), which has corresponding Julia code here

cle*_*ens 5

This might be a solution to your problem.

This is the data I used (identical to yours):

library(tidyverse)
library(lpSolve)
N <- 1000

set.seed(123)
df <- tibble(avg_points = sample(5:45,N, replace = T),
             cost = sample(3:45,N, replace = T),
             position = sample(c("P","C","1B","2B","3B","SS","OF"),N, replace = T),
             team = sample(LETTERS,N, replace = T)) %>% 
  mutate(id = row_number())

You want to find the x1...xn that maximise the objective function below:

x1 * average_points1 + x2 * average_points2 + ... + xn * average_pointsn

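Because of the way lpSolve works, you need to express the left-hand side (LHS) of every constraint as a sum over x1...xn, with each variable multiplied by the corresponding entry of the coefficient vector you provide.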
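To make the point about the LHS concrete, here is a minimal sketch on a made-up two-variable problem (the names `toy_obj`, `toy_con`, `toy_dir`, `toy_rhs` and the numbers are assumptions for illustration, not part of the roster model). Each row of the constraint matrix holds one coefficient per variable, and a variable that does not appear in a constraint simply gets a 0:

```r
library(lpSolve)

# Hypothetical toy problem: maximise 3*a + 2*b
# subject to a + b <= 4 and a <= 2 (lp() treats variables as >= 0).
toy_obj <- c(3, 2)
toy_con <- rbind(c(1, 1),   # a + b
                 c(1, 0))   # a only; b gets a 0 coefficient
toy_dir <- c("<=", "<=")
toy_rhs <- c(4, 2)

toy_result <- lp("max", toy_obj, toy_con, toy_dir, toy_rhs)
toy_result$solution  # values chosen for a and b
toy_result$objval    # value of the objective function
```

The same idea is why the extra y and z variables below need zero-filled columns in every constraint that does not involve them.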

Since you cannot express the number of teams with your current variables, you can introduce new ones (I will call them y1..yn_teams and z1..zn_teams):

# number of teams:
n_teams = length(unique(df$team))

New objective function (the ys and zs will not influence your overall objective function, since their coefficients are set to 0):

obj <- c(df$avg_points, rep(0, 2 * n_teams))

The first 3 constraints are the same, but with added zero coefficients for y and z:

c1 <- t(model.matrix(~ position + 0,df))
c1 <- cbind(c1, 
            matrix(0, ncol = 2 * n_teams, nrow = nrow(c1)))
c2 = df$cost
c2 <- c(c2, rep(0, 2 * n_teams))
c3 = t(model.matrix(~ team + 0, df))
c3 <- cbind(c3, matrix(0, ncol = 2 * n_teams, nrow = nrow(c3)))

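Since you want at least 3 teams, you first use y to count the number of players per team:

This constraint counts the number of players per team. You sum up all the players you have picked from a team and subtract the corresponding y variable for that team. The result should be equal to 0. (diag() creates the identity matrix; we do not worry about z at this point):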

# should be x1...xn - y1...n = 0
c4_1 <- cbind(t(model.matrix(~team + 0, df)), # x
              -diag(n_teams), # y
              matrix(0, ncol = n_teams, nrow = n_teams) # z
              ) # == 0

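Since each y is now the number of players in a team, you can make sure that z is binary with this constraint. Combined with the equality above, the x and y terms cancel, so each row reduces to z <= 1: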

c4_2 <- cbind(t(model.matrix(~ team + 0, df)), # x1+...+xn ==
              -diag(n_teams), # - (y1+...+yn )
              diag(n_teams) # z binary
              ) # <= 1

This is the constraint that ensures that at least 3 teams are picked:

c4_3 <- c(rep(0, nrow(df) + n_teams), # x and y
          rep(1, n_teams) # z >= 3
          )

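You need to make sure that z_i is 1 whenever team i has at least one player:

$$y_i > 0 \implies z_i = 1$$

You can use the big-M method to create a constraint for that, which is:

$$y_i \le M z_i$$

Or, in a more lpSolve-friendly version:

$$y_i - M z_i \le 0$$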

In this case you can use 6 as the value for M, because it is the largest value any y can take:

c4_4 <- cbind(matrix(0, nrow = n_teams, ncol = nrow(df)),
              diag(n_teams),
              -diag(n_teams) * 6)
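One thing that may be worth double-checking: the big-M row above forces z to 1 for every team that has players, but on its own it does not seem to stop the solver from also setting z to 1 for a team with no players, which would let the "at least 3 teams" row be satisfied by empty teams. The following is a minimal sketch on hypothetical toy data (4 made-up players, 3 teams, roster of 2, at least 2 teams). It compares the model with and without an assumed extra link `z - y <= 0`; that link is my addition for illustration and is not part of the code above:

```r
library(lpSolve)

# Hypothetical toy data: 4 players on 3 teams (not the data from the post).
toy <- data.frame(points = c(10, 9, 5, 1),
                  team   = c("A", "A", "B", "C"))
n   <- nrow(toy)
n_t <- length(unique(toy$team))
M   <- 2  # roster size, the largest value any y can take here

team_rows <- t(model.matrix(~ team + 0, toy))  # n_t x n team indicator matrix
zeros_tt  <- matrix(0, n_t, n_t)
zeros_tn  <- matrix(0, n_t, n)

# Columns are ordered x (players), y (players per team), z (team-used flag).
base_con <- rbind(
  c(rep(1, n), rep(0, 2 * n_t)),               # pick exactly 2 players
  cbind(team_rows, -diag(n_t), zeros_tt),      # sum(x in team) - y == 0
  cbind(zeros_tn, diag(n_t), -M * diag(n_t)),  # y - M * z <= 0 (big-M)
  c(rep(0, n + n_t), rep(1, n_t)),             # sum(z) >= 2 teams
  cbind(diag(n), matrix(0, n, 2 * n_t)),       # x <= 1
  cbind(zeros_tn, zeros_tt, diag(n_t))         # z <= 1
)
base_dir <- c("==", rep("==", n_t), rep("<=", n_t), ">=",
              rep("<=", n), rep("<=", n_t))
base_rhs <- c(2, rep(0, n_t), rep(0, n_t), 2, rep(1, n), rep(1, n_t))

# Assumed extra link: z - y <= 0, so z can be 1 only if the team has a player.
link_con <- cbind(zeros_tn, -diag(n_t), diag(n_t))

obj <- c(toy$points, rep(0, 2 * n_t))
without_link <- lp("max", obj, base_con, base_dir, base_rhs, all.int = TRUE)
with_link    <- lp("max", obj, rbind(base_con, link_con),
                   c(base_dir, rep("<=", n_t)),
                   c(base_rhs, rep(0, n_t)),
                   all.int = TRUE)

picked_without <- toy[without_link$solution[1:n] > 0.5, ]
picked_with    <- toy[with_link$solution[1:n] > 0.5, ]
```

In this toy case the model without the link can return both team-A players (objective 19), while the model with the link has to spread the roster over two teams (objective 15). If the same behaviour matters for the full problem, a block built like `link_con` could be appended to the constraint matrix with direction "<=" and right-hand side 0.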

This constraint is added to make sure all x are binary (each x is an integer that cannot exceed 1):

#all x binary
c5 <- cbind(diag(nrow(df)), # x
            matrix(0, ncol = 2 * n_teams, nrow = nrow(df)) # y + z
            )

Create the new constraint matrix:

con <- rbind(c1,
             c2,
             c3,
             c4_1,
             c4_2,
             c4_3,
             c4_4,
             c5)

#set the constraint values
rhs <- c(1,1,1,1,3,2,1,  # 1. #exactly 3 outfielders 2 pitchers and 1 of everything else
         200, # 2. at a cost less than 200
         rep(6, n_teams), # 3. max number from any team is 6
         rep(0, n_teams), # c4_1
         rep(1, n_teams), # c4_2
         3, # c4_3,
         rep(0, n_teams), #c4_4
         rep(1, nrow(df))# c5 binary
)

#set the direction of the constraints
dir <- c(rep("==", 7), # c1
         "<=", # c2
         rep("<=", n_teams), # c3
         rep('==', n_teams), # c4_1
         rep('<=', n_teams), # c4_2
         '>=', # c4_3
         rep('<=', n_teams), # c4_4 
         rep('<=', nrow(df)) # c5
         )
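For reference, the rows assembled above can be read as the following model. This is my reading of the code, with notation introduced here for illustration: p_j and c_j are the points and cost of player j, P_k is the set of players at position k with required count r_k, and T_i is the set of players on team i.

```latex
\begin{aligned}
\max\ & \sum_{j=1}^{n} p_j x_j \\
\text{s.t.}\ & \sum_{j \in P_k} x_j = r_k && \text{(c1, one row per position } k\text{)} \\
& \sum_{j=1}^{n} c_j x_j \le 200 && \text{(c2)} \\
& \sum_{j \in T_i} x_j \le 6 && \text{(c3, one row per team } i\text{)} \\
& \sum_{j \in T_i} x_j - y_i = 0 && \text{(c4\_1)} \\
& \sum_{j \in T_i} x_j - y_i + z_i \le 1 && \text{(c4\_2)} \\
& \sum_{i} z_i \ge 3 && \text{(c4\_3)} \\
& y_i - 6 z_i \le 0 && \text{(c4\_4)} \\
& x_j \le 1 && \text{(c5)} \\
& x_j,\ y_i,\ z_i \in \mathbb{Z}_{\ge 0} && \text{(all.int = TRUE)}
\end{aligned}
```

Written this way, the c4_2 rows reduce to z_i <= 1 once c4_1 holds, which is how z ends up binary.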

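The problem is almost the same, but I am using all.int instead of all.bin, so that the y variables can hold the player counts per team (which can be greater than 1):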

result <- lp("max",obj,con,dir,rhs,all.int = TRUE)
Success: the objective function is 450


roster <- df[result$solution[1:nrow(df)] == 1, ]
roster
# A tibble: 10 x 5
   avg_points  cost position team     id
        <int> <int> <chr>    <chr> <int>
 1         45    19 C        I        24
 2         45     5 P        X       126
 3         45    25 OF       N       139
 4         45    22 3B       J       193
 5         45    24 2B       B       327
 6         45    25 OF       P       340
 7         45    23 P        Q       356
 8         45    13 OF       N       400
 9         45    13 SS       L       401
10         45    45 1B       G       614

If you change the data to

N <- 1000

set.seed(123)
df <- tibble(avg_points = sample(5:45,N, replace = T),
             cost = sample(3:45,N, replace = T),
             position = sample(c("P","C","1B","2B","3B","SS","OF"),N, replace = T),
             team = sample(c("A", "B"),N, replace = T)) %>% 
  mutate(id = row_number())

the problem will now be infeasible, because the number of teams in the data is smaller than 3.

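Going back to the original data with 26 teams, you can check that it works by comparing the teams flagged by z with the teams on the roster: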

sort(unique(df$team))[result$solution[1027:1052]==1]
[1] "B" "E" "I" "J" "N" "P" "Q" "X"
sort(unique(roster$team))
[1] "B" "E" "I" "J" "N" "P" "Q" "X"
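The hard-coded range 1027:1052 in the check above follows from the column order x, y, z. As a small sketch (the names `n_players`, `x_idx`, `y_idx` and `z_idx` are hypothetical helpers, and the sizes are the ones from the original data), the positions can be derived instead of typed in:

```r
# Sizes taken from the original data: 1000 players, 26 teams.
n_players <- 1000
n_teams   <- 26

# The solution vector is laid out as x (players), then y, then z (teams).
x_idx <- seq_len(n_players)
y_idx <- n_players + seq_len(n_teams)
z_idx <- n_players + n_teams + seq_len(n_teams)

range(y_idx)  # positions of the y variables
range(z_idx)  # positions of the z variables
```

With these in place, `result$solution[z_idx]` would select the same entries as `result$solution[1027:1052]`, and it keeps working if the number of players or teams changes.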