maj*_*jom 10 optimization r linear-programming
我想知道如何在R中设置一些基本匹配程序的例子.在各种编程语言中有很多例子,但我还没有找到一个很好的R例子.
假设我想让学生与项目相匹配,我会考虑在谷歌搜索这个问题时遇到的3种替代方法:
1)Bipartite匹配案例:我要求每个学生指出要处理的3个项目(没有说明3个项目中的任何偏好排名).
ID T.1 T.2 T.3 T.4 T.5 T.6 T.7
1 1 1 1 0 0 0 0
2 0 0 0 0 1 1 1
3 0 1 1 1 0 0 0
4 0 0 0 1 1 1 0
5 1 0 1 0 1 0 0
6 0 1 0 0 0 1 1
7 0 1 1 0 1 0 0
Run Code Online (Sandbox Code Playgroud)
-
d.1 <- structure(list(Student.ID = 1:7, Project.1 = c(1L, 0L, 0L, 0L,
1L, 0L, 0L), Project.2 = c(1L, 0L, 1L, 0L, 0L, 1L, 1L), Project.3 = c(1L,
0L, 1L, 0L, 1L, 0L, 1L), Project.4 = c(0L, 0L, 1L, 1L, 0L, 0L,
0L), Project.5 = c(0L, 1L, 0L, 1L, 1L, 0L, 1L), Project.6 = c(0L,
1L, 0L, 1L, 0L, 1L, 0L), Project.7 = c(0L, 1L, 0L, 0L, 0L, 1L,
0L)), .Names = c("Student.ID", "Project.1", "Project.2", "Project.3",
"Project.4", "Project.5", "Project.6", "Project.7"), class = "data.frame", row.names = c(NA,
-7L))
Run Code Online (Sandbox Code Playgroud)
2)匈牙利算法:我要求每个学生姓名3个项目工作在那里说明那些3中的偏好排名.据我所知,在这种情况下应用算法时的推理会是这样的:排名越低越好给学生"花钱".
ID T.1 T.2 T.3 T.4 T.5 T.6 T.7
1 3 2 1 na na na na
2 na na na na 1 2 3
3 na 1 3 2 na na na
4 na na na 1 2 3 na
5 2 na 3 na 1 na na
6 na 3 na na na 2 1
7 na 1 2 na 3 na na
Run Code Online (Sandbox Code Playgroud)
-
d.2 <- structure(list(Student.ID = 1:7, Project.1 = structure(c(2L, 3L,
3L, 3L, 1L, 3L, 3L), .Label = c("2", "3", "na"), class = "factor"),
Project.2 = structure(c(2L, 4L, 1L, 4L, 4L, 3L, 1L), .Label = c("1",
"2", "3", "na"), class = "factor"), Project.3 = structure(c(1L,
4L, 3L, 4L, 3L, 4L, 2L), .Label = c("1", "2", "3", "na"), class = "factor"),
Project.4 = structure(c(3L, 3L, 2L, 1L, 3L, 3L, 3L), .Label = c("1",
"2", "na"), class = "factor"), Project.5 = structure(c(4L,
1L, 4L, 2L, 1L, 4L, 3L), .Label = c("1", "2", "3", "na"), class = "factor"),
Project.6 = structure(c(3L, 1L, 3L, 2L, 3L, 1L, 3L), .Label = c("2",
"3", "na"), class = "factor"), Project.7 = structure(c(3L,
2L, 3L, 3L, 3L, 1L, 3L), .Label = c("1", "3", "na"), class = "factor")), .Names = c("Student.ID",
"Project.1", "Project.2", "Project.3", "Project.4", "Project.5",
"Project.6", "Project.7"), class = "data.frame", row.names = c(NA,
-7L))
Run Code Online (Sandbox Code Playgroud)
3)??? 方法:这应该与(2)非常相关.但是,我认为它可能是更好/更公平的方法(至少在示例的设置中).学生不能选择项目,他们甚至不了解项目,但他们对某项技能组合评定了他们的资格(1"不存在"到10"专业水平").此外,讲师还为每个项目评定了所需的技能组合.除了(2)之外,第一步是计算相似性矩阵,然后从上面运行优化程序.
PS: Programming Skills
SK: Statistical Knowledge
IE: Industry Experience
ID PS SK IE
1 10 9 8
2 1 2 10
3 10 2 5
4 2 5 3
5 10 2 10
6 1 10 1
7 5 5 5
Run Code Online (Sandbox Code Playgroud)
-
d.3a <- structure(list(Student.ID = 1:7, Programming.Skills = c(10L, 1L,
10L, 2L, 10L, 1L, 5L), Statistical.knowlegde = c(9L, 2L, 2L,
5L, 2L, 10L, 5L), Industry.Expertise = c(8L, 10L, 5L, 3L, 10L,
1L, 5L)), .Names = c("Student.ID", "Programming.Skills", "Statistical.knowlegde",
"Industry.Expertise"), class = "data.frame", row.names = c(NA,
-7L))
Run Code Online (Sandbox Code Playgroud)
-
T: Topic ID
PS: Programming Skills
SK: Statistical Knowledge
IE: Industry Experience
T PS SK IE
1 10 5 1
2 1 1 5
3 10 10 10
4 2 8 3
5 4 3 2
6 1 1 1
7 5 7 2
Run Code Online (Sandbox Code Playgroud)
-
d.3b <- structure(list(Project.ID = 1:7, Programming.Skills = c(10L,
1L, 10L, 2L, 4L, 1L, 5L), Statistical.Knowlegde = c(5L, 1L, 10L,
8L, 3L, 1L, 7L), Industry.Expertise = c(1L, 5L, 10L, 3L, 2L,
1L, 2L)), .Names = c("Project.ID", "Programming.Skills", "Statistical.Knowlegde",
"Industry.Expertise"), class = "data.frame", row.names = c(NA,
-7L))
Run Code Online (Sandbox Code Playgroud)
我很感激在R中实现这3种方法的任何帮助.谢谢.
更新:以下问题似乎是相关的,但没有一个显示如何在R中解决它: https://math.stackexchange.com/questions/132829/group-membership-assignment-by-preferences-optimization-problem https:/ /superuser.com/questions/467577/using-optimization-to-assign-by-preference
以下是使用二分匹配和匈牙利算法的可能解决方案。
我提出的使用二分匹配的解决方案可能不是您想要的。下面的所有代码所做的都是随机采样指定次数的迭代,之后有望识别出至少一个解决方案。这可能需要大量的迭代和很长时间来解决大问题。下面的代码在 200 次迭代内找到了示例问题的三个解决方案。
matrix1 <- matrix(c( 1, 1, 1, NA, NA, NA, NA,
NA, NA, NA, NA, 1, 1, 1,
NA, 1, 1, 1, NA, NA, NA,
NA, NA, NA, 1, 1, 1, NA,
1, NA, 1, NA, 1, NA, NA,
NA, 1, NA, NA, NA, 1, 1,
NA, 1, 1, NA, 1, NA, NA), nrow=7, byrow=TRUE)
set.seed(1234)
iters <- 200
my.match <- matrix(NA, nrow=iters, ncol=ncol(matrix1))
for(i in 1:iters) {
for(j in 1:nrow(matrix1)) {
my.match[i,j] <- sample(which(matrix1[j,] == 1), 1)
}
}
n.unique <- apply(my.match, 1, function(x) length(unique(x)))
my.match[n.unique==ncol(matrix1),]
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] 3 7 4 6 1 2 5
# [2,] 1 7 4 5 3 6 2
# [3,] 3 5 4 6 1 7 2
Run Code Online (Sandbox Code Playgroud)
clue
这是使用包和solve_LSAP()
@jackStinger 建议的匈牙利算法的代码。为了做到这一点,我必须替换缺失的观察结果,并且我任意地将它们替换为 4。第 5 个人没有得到他们的第一选择,第 7 个人没有得到他们的三个选择中的任何一个。
library(clue)
matrix1 <- matrix(c( 3, 2, 1, 4, 4, 4, 4,
4, 4, 4, 4, 1, 2, 3,
4, 1, 3, 2, 4, 4, 4,
4, 4, 4, 1, 2, 3, 4,
2, 4, 3, 4, 1, 4, 4,
4, 3, 4, 4, 4, 2, 1,
4, 1, 2, 4, 3, 4, 4), nrow=7, byrow=TRUE)
matrix1
solve_LSAP(matrix1, maximum = FALSE)
# Optimal assignment:
# 1 => 3, 2 => 5, 3 => 2, 4 => 4, 5 => 1, 6 => 7, 7 => 6
Run Code Online (Sandbox Code Playgroud)
以下是显示匈牙利算法如何工作的网站链接:http://www.wikihow.com/Use-the-Hungarian-Algorithm
编辑:2014 年 6 月 5 日
这是我对优化第三种场景的第一次尝试。我将每个学生随机分配到一个项目,然后计算该组作业的成本。成本是通过找出学生的技能与项目所需技能之间的差异来计算的。将这些差异的绝对值相加即可得出七项任务的总成本。
下面我将这个过程重复 10,000 次,并确定这 10,000 次分配中哪一次的成本最低。
另一种方法是对所有可能的学生项目作业进行详尽的搜索。
随机搜索和详尽搜索都不太可能是您想要的。然而,前者给出近似最优解,后者给出精确最优解。
我稍后可能会回到这个问题。
students <- matrix(c(10, 9, 8,
1, 2, 10,
10, 2, 5,
2, 5, 3,
10, 2, 10,
1, 10, 1,
5, 5, 5), nrow=7, ncol=3, byrow=TRUE)
projects <- matrix(c(10, 5, 1,
1, 1, 5,
10, 10, 10,
2, 8, 3,
4, 3, 2,
1, 1, 1,
5, 7, 2), nrow=7, ncol=3, byrow=TRUE)
iters <- 10000
# col = student, cell = project
assignments <- matrix(NA, nrow=iters, ncol=nrow(students))
for(i in 1:iters) {
assignments[i,1:7] <- sample(7,7,replace=FALSE)
}
cost <- matrix(NA, nrow=iters, ncol=nrow(students))
for(i in 1:iters) {
for(j in 1:nrow(students)) {
student <- j
project <- assignments[i,student]
student.cost <- rep(NA,3)
for(k in 1:3) {
student.cost[k] <- abs(students[student,k] - projects[project,k])
}
cost[i,j] <- sum(student.cost)
}
}
total.costs <- rowSums(cost)
assignment.costs <- cbind(assignments, total.costs)
head(assignment.costs)
assignment.costs[assignment.costs[,8]==min(assignment.costs[,8]),]
# total.costs
# [1,] 3 2 1 4 5 6 7 48
# [2,] 3 2 1 6 5 4 7 48
# [3,] 3 2 1 6 5 4 7 48
# student 1, project 3, cost = 3
# student 2, project 2, cost = 6
# student 3, project 1, cost = 7
# student 4, project 4, cost = 3
# student 5, project 5, cost = 15
# student 6, project 6, cost = 9
# student 7, project 7, cost = 5
3+6+7+3+15+9+5
# [1] 48
Run Code Online (Sandbox Code Playgroud)
编辑:2014 年 6 月 6 日
这是详尽的搜索。给七名学生分配项目的可能方法只有 5040 种。此搜索返回四个最佳解决方案:
students <- matrix(c(10, 9, 8,
1, 2, 10,
10, 2, 5,
2, 5, 3,
10, 2, 10,
1, 10, 1,
5, 5, 5), nrow=7, ncol=3, byrow=TRUE)
projects <- matrix(c(10, 5, 1,
1, 1, 5,
10, 10, 10,
2, 8, 3,
4, 3, 2,
1, 1, 1,
5, 7, 2), nrow=7, ncol=3, byrow=TRUE)
library(combinat)
n <- nrow(students)
assignments <- permn(1:n)
assignments <- do.call(rbind, assignments)
dim(assignments)
# column of assignments = student
# row of assignments = iteration
# cell of assignments = project
cost <- matrix(NA, nrow=nrow(assignments), ncol=n)
for(i in 1:(nrow(assignments))) {
for(student in 1:n) {
project <- assignments[i,student]
student.cost <- rep(NA,3)
for(k in 1:3) {
student.cost[k] <- abs(students[student,k] - projects[project,k])
}
cost[i,student] <- sum(student.cost)
}
}
total.costs <- rowSums(cost)
assignment.costs <- cbind(assignments, total.costs)
head(assignment.costs)
assignment.costs[assignment.costs[,(n+1)]==min(assignment.costs[,(n+1)]),]
total.costs
[1,] 3 2 5 4 1 6 7 48
[2,] 3 2 5 6 1 4 7 48
[3,] 3 2 1 6 5 4 7 48
[4,] 3 2 1 4 5 6 7 48
Run Code Online (Sandbox Code Playgroud)