python统计分析

use*_*225 1 python statistics

鉴于15名球员 - 2名守门员,5名后卫,5名中场球员和3名前锋,以及每名球员都有价值和得分的事实,我想计算得分最高的球队.每个团队必须包含1个GK,然后是一个阵型,例如4:4:2,4:3:3等.我开始使用这样的样本数据

球员角色成本

然后,我做了以下评估所有组合

将每一行读入一个列表(对于每个角色),然后在嵌套运行中使用itertools来获取所有组合

if line[1] == "G": G.append(line[0])
if line[1] == "D": D.append(line[0])
if line[1] == "M": M.append(line[0])
if line[1] == "S": S.append(line[0])

for gk in itertools.combinations(G,1):
    for de in itertools.combinations(D,4):
        for mi in itertools.combinations(M,4):
            for st in itertools.combinations(S,2):
                teams[str(count)]= " ".join(gk)+" "+" ".join(de)+" "+" ".join(mi)+" "+" ".join(st)
                count +=1
Run Code Online (Sandbox Code Playgroud)

有了团队,我计算他们的积分值和团队成本.如果它低于阈值,我打印它.
但如果我现在让这20名守门员,150名防守球员,150名中场球员和100名前锋,我可以理解为失去记忆.
我该怎么做才能进行这种分析?它是一个生成器而不是我需要的递归函数吗?

非常感谢

unu*_*tbu 5

您可以通过递归解决此问题.以下显示了基本概要,但忽略了团队由特定数量的特定类型的玩家组成的细节.

players=[{'name':'A','score':5,'cost':10},
         {'name':'B','score':10,'cost':3},
         {'name':'C','score':6,'cost':8}]

def player_cost(player):
    return player['cost']
def player_score(player):
    return player['score']
def total_score(players):
    return sum(player['score'] for player in players)

def finance_team_recurse(budget, available_players):
    affordable_players=[]
    for player in available_players:
        if player_cost(player)<=budget:
            # Since we've ordered available players, the first player appended
            # will be the one with the highest score.
            affordable_players.append(player)
    result=[]
    if affordable_players:
        candidate_player=affordable_players[0]
        other_players=affordable_players[1:]
        # if you include candidate_player on your team
        team_with_candidate=finance_team_recurse(budget-player_cost(candidate_player),
                                                 other_players)
        team_with_candidate.append(candidate_player)
        score_of_team_with_candidate=total_score(team_with_candidate)
        if score_of_team_with_candidate>total_score(other_players):
            result=team_with_candidate
        else:
            # if you exclude candidate_player from your team
            team_without_candidate=finance_team_recurse(budget, other_players)
            score_of_team_without_candidate=total_score(team_without_candidate)
            if score_of_team_with_candidate>score_of_team_without_candidate:
                result=team_with_candidate
            else:
                result=team_without_candidate
    return result

def finance_team(budget, available_players):
    tmp=available_players[:]
    # Sort so player with highest score is first. (Greedy algorithm?)
    tmp.sort(key=player_score, reverse=True)
    return finance_team_recurse(budget,tmp)

print(finance_team(20,players))
# [{'score': 6, 'cost': 8, 'name': 'C'}, {'score': 10, 'cost': 3, 'name': 'B'}]
Run Code Online (Sandbox Code Playgroud)
20 choose 1 = 20 combinations
150 choose 4 = 20260275 combinations
100 choose 2 = 4950 combinations
Run Code Online (Sandbox Code Playgroud)

因此在teamsdict中总共有20*20260275*20260275*4950 = 40637395564486875000L项.这需要大量的记忆.

for gk in itertools.combinations(G,1):
    for de in itertools.combinations(D,4):
        for mi in itertools.combinations(M,4):
            for st in itertools.combinations(S,2):    
                #Don't collect the results into a dict.
                #That's what's killing you (memory-wise).
                #Just compute the cost and
                #Just print the result here.
Run Code Online (Sandbox Code Playgroud)

PS.40637395564486875000L是大约的10**19.假设您的程序10**6每秒可以处理组合,程序完成需要大约130万年......