调度优化以最小化时隙数量(有约束)

ldw*_*wii 5 python algorithm linear-programming job-scheduling or-tools

我正在研究调度优化问题,其中我们有一组需要在特定时间范围内完成的任务。

每个任务都有一个时间表,指定可以执行该任务的时间段列表。每个任务的时间表可能因工作日而异。

这是小样本(减少了任务和时间段的数量):

task_availability_map = {
    "T1" : [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "T2" : [0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "T3" : [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "T4" : [0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "T5" : [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "T6" : [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    "T7" : [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
    "T8" : [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    "T9" : [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    "T10": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
}
Run Code Online (Sandbox Code Playgroud)

限制是同一时隙内最多只能并行执行 N 个任务(如果它们重叠)。无论完成 1 个还是 N 个任务,这组并行任务总是花费相同的时间。

目标是尽量减少时隙的数量。

我尝试过一种强力方法来生成所有时隙索引排列。对于给定排列中的每个索引,获取所有可以调度的任务并将它们添加到要在下一次迭代中排除的任务列表中。一旦完成给定排列的所有迭代,将时隙的数量和索引的组合添加到列表中。

def get_tasks_by_timeslot(timeslot, tasks_to_exclude):
    for task in task_availability_map.keys():
        if task in tasks_to_exclude:
            continue
        if task_availability_map[task][timeslot] == 1:
            yield task

total_timeslot_count = len(task_availability_map.values()[0]) # 17
timeslot_indices = range(total_timeslot_count)

timeslot_index_permutations = list(itertools.permutations(timeslot_indices))

possible_schedules = []

for timeslot_variation in timeslot_index_permutations:
    tasks_already_scheduled = []
    current_schedule = []
    for t in timeslot_variation:
        tasks = list(get_tasks_by_timeslot(t, tasks_already_scheduled))
        if len(tasks) == 0:
            continue
        elif len(tasks) > MAX_PARALLEL_TASKS:
            break
        tasks_already_scheduled += tasks
        current_schedule.append(tasks)

    time_slot_count = np.sum([len(t) for t in current_schedule])
    possible_schedules.append([time_slot_count, timeslot_variation])

...
Run Code Online (Sandbox Code Playgroud)

按时间段的数量对可能的时间表进行排序,这就是解决方案。然而,该算法的复杂性随着时隙的数量呈指数增长。鉴于有数百个任务和数百个时间段,我需要一种不同的方法。

有人建议使用 LP MIP(例如 Google OR Tools),但我对它不是很熟悉,并且很难在代码中制定约束。非常感谢任何有关 LP 或其他解决方案的帮助,可以帮助我朝着正确的方向开始(不一定是 Python,甚至可以是 Excel)。

Erw*_*gen 2

我对 MIP 模型的建议:

引入二元变量:

x(i,t) = 1 if task i is assigned to slot t
         0 otherwise

y(t) = 1 if slot t has at least one task assigned to it
       0 otherwise
Run Code Online (Sandbox Code Playgroud)

进一步令:

N = max number of tasks per slot
ok(i,t) = 1 if we are allowed to assign task i to slot t
          0 otherwise
Run Code Online (Sandbox Code Playgroud)

Then the model can look like:

minimize sum(t,y(t))                    (minimize used slots)    
sum(t, ok(i,t)*x(i,t)) = 1   for all i  (each task is assigned to exactly one slot)      
sum(i, ok(i,t)*x(i,t)) <= N  for all t  (capacity constraint for each slot)
y(t) >= x(i,t)  for all (i,t) such that ok(i,t)=1
x(i,t),y(t) in {0,1}                    (binary variables)
Run Code Online (Sandbox Code Playgroud)

Using N=3, I get a solution like:

----     45 VARIABLE x.L  assignment

                s5          s6          s7         s13

task1                    1.000
task2                                1.000
task3                    1.000
task4                    1.000
task5        1.000
task6                                            1.000
task7                                            1.000
task8                                            1.000
task9                                1.000
task10                               1.000
Run Code Online (Sandbox Code Playgroud)

The model is fairly simple and it should not be very difficult to code and solve it using your favorite MIP solver. The one thing you want to make sure is that only variables x(i,t) exist when ok(i,t)=1. In other words, make sure that variables do not appear in the model when ok(i,t)=0. It can help to interpret the assignment constraints as:

sum(t | ok(i,t)=1, x(i,t)) = 1   for all i  (each task is assigned to exactly one slot)      
sum(i | ok(i,t)=1, x(i,t)) <= N  for all t  (capacity constraint for each slot)
Run Code Online (Sandbox Code Playgroud)

where | means 'such that' or 'where'. If you do this right, your model should have 50 variables x(i,t) instead of 10 x 17 = 170. Furthermore we can relax y(t) to be continuous between 0 and 1. It will be either 0 or 1 automatically. Depending on the solver that may affect performance.

I have no reason to believe this is easier to model as a constraint programming model or that it is easier to solve that way. My rule of thumb is, if it is easy to model as a MIP stick to a MIP. If we need to go through lots of hoops to make it a proper MIP, and a CP formulation makes life easier, then use CP. In many cases this simple rule works quite well.