排序算法实现最高总组合

ran*_*its 29 sorting algorithm

我有兴趣了解哪种算法最适合解决以下问题.这是标准.

我有一个球员名单.每个玩家都有三个属性.

  • 球队
  • 位置

每支球队约有20名球员.比方说,我们在这里谈论棒球和位置1B,2B,3B,SS,C,OF.

我感兴趣的是对顶级N组合进行排序,其中队友的N > 0组合价值最高4.

所以每个组合4 players都必须在同一个团队中.每个组合4 players应该具有更高的总和Value然后每个后续组合.

一个限制是每个位置每个组合只能使用一次,除了OF在4个玩家的一个组合中最多可以使用三次.

所以在下面的组成球员中,我将展示前两个组合:

Team Toronto
Player 1, SS, 1.0
Player 2, 1B, 1.0
Player 3, 1B, 2.0
Player 4, 2B, 2.0
Player 5, 3B, 4.0
Player 6, 3B, 3.0
Player 7, C, 4.0
Player 8, OF, 1.0
Player 9, OF, 2.0
Player 10, OF, 5.0
Player 11, OF, 6.0

Team Washington
Team Toronto
Player 1, SS, 3.0
Player 2, 1B, 2.0
Player 3, 1B, 1.0
Player 4, 2B, 2.0
Player 5, 3B, 2.0
Player 6, 3B, 3.0
Player 7, C, 7.0
Player 8, OF, 1.0
Player 9, OF, 2.0
Player 10, OF, 2.0
Player 11, OF, 3.0
Run Code Online (Sandbox Code Playgroud)

预计最高的组合将是

Team Toronto: 
   * Player 11
   * Player 10
   * Player 7
   * Player 5 
Run Code Online (Sandbox Code Playgroud)

总价值 19.0

预计第二高的组合也将是多伦多队

* Player 11
* Player 10
* Player 7
* Player 6 
Run Code Online (Sandbox Code Playgroud)

共有 18.0

华盛顿队的最高组合是球员1,球员6,球员7和球员11,他们的价值只有16.0这样他们才能在排序之后出现.

如果说我每天要处理500名玩家(分布在10个或更多团队中),那么以高效的方式处理这个问题的最佳算法是什么?

tob*_*s_k 8

您可以使用A*算法.作为启发式功能,将已经选择的玩家的成本加上完成团队所需的剩余玩家的最大成本.

这是一些Python代码.首先,设置:

def split(team):
    return [(x.strip(), y.strip(), float(z)) for x, y, z 
            in (line.split(",") for line in team.splitlines())]

team1 = ("Toronto",
        split("""Player 1, SS, 1.0
        Player 2, 1B, 1.0
        Player 3, 1B, 2.0
        Player 4, 2B, 2.0
        Player 5, 3B, 4.0
        Player 6, 3B, 3.0
        Player 7, C, 4.0
        Player 8, OF, 1.0
        Player 9, OF, 2.0
        Player 10, OF, 5.0
        Player 11, OF, 6.0"""))

team2 = ("Washington",
        split("""Player 1, SS, 3.0
        Player 2, 1B, 2.0
        Player 3, 1B, 1.0
        Player 4, 2B, 2.0
        Player 5, 3B, 2.0
        Player 6, 3B, 3.0
        Player 7, C, 7.0
        Player 8, OF, 1.0
        Player 9, OF, 2.0
        Player 10, OF, 2.0
        Player 11, OF, 3.0"""))

teams = [team1, team2]
Run Code Online (Sandbox Code Playgroud)

我们还需要一个函数来判断某个组合是否有效.我们不需要检查没有玩家出现两次,因为它们在构造方面是独一无二的,所以我们只知道每个玩家位置出现的频率.

ALLOWED = {"1B": 1, "2B": 1, "3B": 1, "SS": 1, "C": 1, "OF": 3}

def legal(players):
    d = {}
    for p in players:
        d[p[1]] = d.get(p[1], 0) + 1
    return all(d[x] <= ALLOWED[x] for x in d)
Run Code Online (Sandbox Code Playgroud)

现在有趣的部分:A*搜索.我们使用堆或优先级队列,堆中的每个条目都是一个元组(estimated_cost, players, position),因此具有最高总估计开销的(部分)团队将首先在堆中(我们使用-cost,因为堆将从最少到最大).pos告诉我们我们在球员名单中的当前位置,按个人成本排序,先排名最高.

然后我们从堆中弹出下一个元素,检查它是否是一个解决方案,yield结果是1),或者向团队添加另一个玩家,用当前团队的成本更新估计成本,填充下一个昂贵的来自大部分剩余玩家的玩家(new_players + team_players[i+1:i+more_needed]),并将该组合添加到堆中.

def generator(team_players, num):
    team_players = sorted(team_players, key=lambda p: p[2], reverse=True)
    queue = [(-sum(p[2] for p in team_players[:num]), [], 0)]
    while queue:
        estimated_cost, players, pos = heapq.heappop(queue)
        more_needed = num - len(players)
        if not more_needed:
            yield estimated_cost, players
        else:
            for i in range(pos, len(team_players) - more_needed + 1):
                player = team_players[i]
                new_players = players + [player]
                if legal(new_players):
                    estimate = -sum(p[2] for p in new_players + team_players[i+1:i+more_needed])
                    heapq.heappush(queue, (estimate, new_players, i+1))
Run Code Online (Sandbox Code Playgroud)

最后,我们有第二个生成器函数,将上面的生成器映射到团队名称,并为每个团队(如果有的话)一个接一个地生成下一个最佳解决方案.另一个堆用于存储每个团队当前最佳的解决方案,按总成本排序.

def generate_all(teams):
    generators = {name: generator(players, 4) for name, players in teams}
    best_teams = [(next(generators[name]), name) for name in generators]
    heapq.heapify(best_teams)
    while best_teams:
        team, name = heapq.heappop(best_teams)
        yield name, team
        try:
            heapq.heappush(best_teams, (next(generators[name]), name))
        except StopIteration:
            pass
Run Code Online (Sandbox Code Playgroud)

现在只需从该生成器生成并打印前几个:

all_teams_generator = generate_all(teams)
for _ in range(10):
    print(next(all_teams_generator))
Run Code Online (Sandbox Code Playgroud)

输出格式(team-name, (-cost, [(player1), ..., (player4)])):

('Toronto', (-19.0, [('Player 11', 'OF', 6.0), ('Player 10', 'OF', 5.0), ('Player 5', '3B', 4.0), ('Player 7', 'C', 4.0)]))
('Toronto', (-18.0, [('Player 11', 'OF', 6.0), ('Player 10', 'OF', 5.0), ('Player 7', 'C', 4.0), ('Player 6', '3B', 3.0)]))
('Toronto', (-17.0, [('Player 11', 'OF', 6.0), ('Player 10', 'OF', 5.0), ('Player 5', '3B', 4.0), ('Player 3', '1B', 2.0)]))
('Toronto', (-17.0, [('Player 11', 'OF', 6.0), ('Player 10', 'OF', 5.0), ('Player 5', '3B', 4.0), ('Player 4', '2B', 2.0)]))
('Toronto', (-17.0, [('Player 11', 'OF', 6.0), ('Player 10', 'OF', 5.0), ('Player 5', '3B', 4.0), ('Player 9', 'OF', 2.0)]))
('Toronto', (-17.0, [('Player 11', 'OF', 6.0), ('Player 10', 'OF', 5.0), ('Player 7', 'C', 4.0), ('Player 3', '1B', 2.0)]))
('Toronto', (-17.0, [('Player 11', 'OF', 6.0), ('Player 10', 'OF', 5.0), ('Player 7', 'C', 4.0), ('Player 4', '2B', 2.0)]))
('Toronto', (-17.0, [('Player 11', 'OF', 6.0), ('Player 10', 'OF', 5.0), ('Player 7', 'C', 4.0), ('Player 9', 'OF', 2.0)]))
('Toronto', (-16.0, [('Player 11', 'OF', 6.0), ('Player 10', 'OF', 5.0), ('Player 5', '3B', 4.0), ('Player 1', 'SS', 1.0)]))
('Toronto', (-16.0, [('Player 11', 'OF', 6.0), ('Player 10', 'OF', 5.0), ('Player 5', '3B', 4.0), ('Player 2', '1B', 1.0)]))
Run Code Online (Sandbox Code Playgroud)

附录:关于表现.对于小团队来说,产生所有组合并选择最好的组合,就像彼得的答案一样,可能确实"足够好",当然也简单得多.但是,使用A*的这个版本可以更快,如果你有更大的团队,或者更多的不同团队,或者你必须经常这样做,可能值得一试.这是一些时间信息,使用两种算法生成前十名团队.

>>> timeit.timeit("allcomb.main(10)", "import allcomb", number=1000)
6.20834589005
>>> timeit.timeit("astar.main(10)", "import astar", number=1000)
1.55367398262
Run Code Online (Sandbox Code Playgroud)

请注意,这是1000次迭代,因此对于单次迭代,两种算法都只需要几毫秒.另请注意,由于A*的生成性质,您需要的组合越少,速度就越快.对于最好的组合,它的速度要快6倍,而前10名的速度要快4倍,但如果你想要所有的组合,A*实际上比仅仅生成所有组合要慢三倍.


1)请注意,这是使用生成器函数,这些函数对于Python来说是非常独特的,但您可以使用将队列封装为成员并提供方法的类来执行相同的操作,并返回下一个有效的解决方案.如果你有兴趣,我也可以这样重写.


小智 7

您可以在每个团队中按递减顺序对玩家进行排序,然后迭代投掷排序列表.代码看起来像这样:

int highestCombinedValue = 0, i = 0, currentNumber = 0, count1B = 0, count2B = 0, cpunt3B = 0, countSS =0, countC = 0, countOF = 0;
while(currentNumber < 4 && list.get(i) != null)
{
  currentPlayer = list.get(i);
  switch currentPlayer.position
  {
    case: 1B
    then if(count1B < 1)
    {
      highestCombinedValue = highestCombinedValue + currentPlayer.value;
      count1B = count1B + 1;
    }
    break;
    case: 2B
    ...
    case: OF
    then if(count1B < 3)
    {
      highestCombinedValue = highestCombinedValue + currentPlayer.value;
      count1B = count1B + 1;
    }
    break;
  }
  i = i + 1;
}
Run Code Online (Sandbox Code Playgroud)

您可以将包含结果的团队存储在列表中,并按降序对此结果列表进行排序,以获得顶级团队.