How do I implement an efficient Alpha-Beta pruning game search tree?

chR*_*NaN 0 java android artificial-intelligence game-theory alpha-beta-pruning

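I'm trying to learn about artificial intelligence and how to implement it in a program. The easiest place to start is probably simple games (in this case Tic-Tac-Toe) and game search trees (recursive calls; not an actual data structure). I found this very useful video of a lecture on the topic.

The problem I'm having is that the first call to the algorithm takes an extremely long time (about 15 seconds) to execute. I've placed debug log output throughout the code, and it seems that parts of the algorithm are being called an excessive number of times.

Here's the method that chooses the best move for the computer: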

    public Best chooseMove(boolean side, int prevScore, int alpha, int beta){
        Best myBest = new Best();
        Best reply;

        if (prevScore == COMPUTER_WIN || prevScore == HUMAN_WIN || prevScore == DRAW){
            myBest.score = prevScore;
            return myBest;
        }

        if (side == COMPUTER){
            myBest.score = alpha;
        }else{
            myBest.score = beta;
        }
        Log.d(TAG, "Alpha: " + alpha + " Beta: " + beta + " prevScore: " + prevScore);
        Move[] moveList = myBest.move.getAllLegalMoves(board);
        for (Move m : moveList){
            String choice;
            if (side == HUMAN){
                choice = playerChoice;
            }else if (side == COMPUTER && playerChoice.equals("X")){
                choice = "O";
            }else{
                choice = "X";
            }
            Log.d(TAG, "Current Move: column- " + m.getColumn() + " row- " + m.getRow());
            int p = makeMove(m, choice, side);
            reply = chooseMove(!side, p, alpha, beta);
            undoMove(m);
            if ((side == COMPUTER) && (reply.score > myBest.score)){
                myBest.move = m;
                myBest.score = reply.score;
                alpha = reply.score;
            }else if((side == HUMAN) && (reply.score < myBest.score)){
                myBest.move = m;
                myBest.score = reply.score;
                beta = reply.score;
            }//end of if-else statement
            if (alpha >= beta) return myBest;
        }//end of for loop
        return myBest;
    }

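The makeMove method makes the move if the spot is empty and returns a value (-1: human wins, 0: draw, 1: computer wins, -2 or 2: otherwise). I believe the error may be in the getAllLegalMoves method, though: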

    public Move[] getAllLegalMoves(String[][] grid){
        //I'm unsure whether this method really belongs in this class or in the grid class, though, either way it shouldn't matter.
        items = 0;
        moveList = null;
        Move move = new Move();

        for (int i = 0; i < 3; i++){
            for(int j = 0; j < 3; j++){
                Log.d(TAG, "At Column: " + i + " At Row: " + j);
                if(grid[i][j] == null || grid[i][j].equals("")){
                    Log.d(TAG, "Is Empty");
                    items++;
                    if(moveList == null || moveList.length < items){
                        resize();
                    }//end of second if statement
                    move.setRow(j);
                    move.setColumn(i);
                    moveList[items - 1] = move;
                }//end of first if statement
            }//end of second loop
        }//end of first loop
        for (int k = 0; k < moveList.length; k++){
            Log.d(TAG, "Count: " + k + " Column: " + moveList[k].getColumn() + " Row: " + moveList[k].getRow());
        }
        return moveList;
    }

    private void resize(){
        Move[] b = new Move[items];
        for (int i = 0; i < items - 1; i++){
            b[i] = moveList[i];
        }
        moveList = b;
    }

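To sum up: what is causing my call to choose the best move to take so long? What am I missing? Is there a simpler way to implement this algorithm? Any help or suggestions would be greatly appreciated, thanks!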

Pat*_*shu 7

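A minimax tree with alpha-beta pruning should be visualized as a tree: each node is a possible position some number of turns into the future, and its children are all the moves that can be made from it.

To be as fast as possible, and to guarantee that you only need space linear in the number of moves you look ahead, do a depth-first search and "sweep" from one side of the tree to the other. That is, if you imagine the whole tree being constructed, your program only ever builds a single strand from leaf to root at a time and discards any parts it has finished with.

I'm just going to copy the Wikipedia pseudocode at this point, because it's really succinct and clear: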

    function alphabeta(node, depth, α, β, Player)
        if  depth = 0 or node is a terminal node
            return score
        if  Player = MaxPlayer
            for each child of node
                α := max(α, alphabeta(child, depth-1, α, β, not(Player) ))
                if β ≤ α
                    break                             (* Beta cut-off *)
            return α
        else
            for each child of node
                β := min(β, alphabeta(child, depth-1, α, β, not(Player) ))
                if β ≤ α
                    break                             (* Alpha cut-off *)
            return β
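The pseudocode above translates fairly directly to Java. Here is a minimal sketch on a tiny hand-built tree, just to show the cut-off in action. The `Node` and `AlphaBetaTree` classes and the `leavesVisited` counter are made up for illustration and are not part of the question's code; it assumes Java 9+ for `List.of`.

```java
import java.util.List;

// Hypothetical explicit game tree: a node is either a leaf carrying a score
// or an inner node with children. Names are illustrative only.
final class Node {
    final int score;            // meaningful only for leaves
    final List<Node> children;

    Node(int score) { this.score = score; this.children = List.of(); }
    Node(Node... children) { this.score = 0; this.children = List.of(children); }

    boolean isTerminal() { return children.isEmpty(); }
}

final class AlphaBetaTree {
    static int leavesVisited = 0; // instrumentation for this demo only

    static int alphabeta(Node node, int depth, int alpha, int beta, boolean maxPlayer) {
        if (depth == 0 || node.isTerminal()) {
            leavesVisited++;
            return node.score;    // a real game would call an evaluation function here
        }
        if (maxPlayer) {
            for (Node child : node.children) {
                alpha = Math.max(alpha, alphabeta(child, depth - 1, alpha, beta, false));
                if (beta <= alpha) break; // beta cut-off
            }
            return alpha;
        } else {
            for (Node child : node.children) {
                beta = Math.min(beta, alphabeta(child, depth - 1, alpha, beta, true));
                if (beta <= alpha) break; // alpha cut-off
            }
            return beta;
        }
    }

    public static void main(String[] args) {
        // Max at the root, three min nodes below it, two leaves each.
        Node root = new Node(
            new Node(new Node(3), new Node(5)),
            new Node(new Node(2), new Node(9)),
            new Node(new Node(0), new Node(1)));
        int value = alphabeta(root, 2, Integer.MIN_VALUE, Integer.MAX_VALUE, true);
        // prints "value = 3, leaves visited = 4"
        System.out.println("value = " + value + ", leaves visited = " + leavesVisited);
    }
}
```

Only four of the six leaves are examined: once the first min node has returned 3, the second and third min nodes are abandoned as soon as they see a leaf that is not better than 3.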

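Notes:

- "for each child of node": rather than editing the state of the current board, create an entirely new board that is the result of applying the move. By using immutable objects, your code will be less prone to bugs and generally quicker to reason about.

- To use this method, call it for every possible move you can make from the current state, giving it depth - 1, -Infinity for alpha and +Infinity for beta, and starting each of these calls on the non-moving player's turn. The call that returns the highest value identifies the best move to make.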
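As a rough illustration of both notes, here is a sketch for Tic-Tac-Toe with an immutable board and a root call that tries every legal move. `Board`, `TicTacToeSearch`, `place` and `bestMoveForX` are hypothetical names, not taken from the question's code. It also assumes the search always runs to the end of the game (at most nine plies), so the depth parameter is left out.

```java
// Hypothetical immutable board: 9 cells in row-major order, 'X', 'O' or '.'.
final class Board {
    private static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},   // rows
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},   // columns
        {0, 4, 8}, {2, 4, 6}};             // diagonals

    private final char[] cells;

    Board() { this(".........".toCharArray()); }
    private Board(char[] cells) { this.cells = cells; }

    static Board parse(String nineChars) { return new Board(nineChars.toCharArray()); }

    boolean isEmpty(int i) { return cells[i] == '.'; }

    // Returns a NEW board with the mark placed; this board is never modified.
    Board place(int i, char mark) {
        char[] copy = cells.clone();
        copy[i] = mark;
        return new Board(copy);
    }

    boolean wins(char mark) {
        for (int[] l : LINES) {
            if (cells[l[0]] == mark && cells[l[1]] == mark && cells[l[2]] == mark) return true;
        }
        return false;
    }

    boolean isFull() {
        for (char c : cells) if (c == '.') return false;
        return true;
    }
}

final class TicTacToeSearch {
    // Score from X's point of view: +1 X wins, -1 O wins, 0 draw.
    static int alphabeta(Board b, int alpha, int beta, boolean xToMove) {
        if (b.wins('X')) return 1;
        if (b.wins('O')) return -1;
        if (b.isFull()) return 0;
        for (int i = 0; i < 9; i++) {
            if (!b.isEmpty(i)) continue;
            int v = alphabeta(b.place(i, xToMove ? 'X' : 'O'), alpha, beta, !xToMove);
            if (xToMove) alpha = Math.max(alpha, v); else beta = Math.min(beta, v);
            if (beta <= alpha) break; // cut-off: the opponent will avoid this line
        }
        return xToMove ? alpha : beta;
    }

    // Root call: apply each legal move for X, then search with O to move
    // and a full (-infinity, +infinity) window. Returns the cell index.
    static int bestMoveForX(Board b) {
        int best = -1;
        int bestScore = Integer.MIN_VALUE;
        for (int i = 0; i < 9; i++) {
            if (!b.isEmpty(i)) continue;
            int score = alphabeta(b.place(i, 'X'), Integer.MIN_VALUE, Integer.MAX_VALUE, false);
            if (score > bestScore) { bestScore = score; best = i; }
        }
        return best;
    }
}
```

For example, `TicTacToeSearch.bestMoveForX(Board.parse("XX.OO...."))` returns 2, the cell that completes X's top row. Because `place` copies the array, there is no undo step to forget, and no two moves can end up sharing the same mutable object.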

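Conceptually it's very simple. If you code it right, you never have more than (depth) boards instantiated at once, you never consider pointless branches, and so on.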
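One way to check that pruning is actually happening is to count the boards each search examines. The sketch below does this for Tic-Tac-Toe from the empty board, using immutable `String` boards. `PruningCount` and its methods are hypothetical names for illustration. The exact alpha-beta count depends on move ordering, so no figures are quoted here.

```java
// Illustrative comparison of plain minimax and alpha-beta on Tic-Tac-Toe.
// Boards are 9-character Strings ('X', 'O', '.'), which are immutable in Java.
final class PruningCount {
    static long nodes; // boards examined since the counter was last reset

    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8}, {0, 3, 6},
        {1, 4, 7}, {2, 5, 8}, {0, 4, 8}, {2, 4, 6}};

    static boolean wins(String b, char m) {
        for (int[] l : LINES) {
            if (b.charAt(l[0]) == m && b.charAt(l[1]) == m && b.charAt(l[2]) == m) return true;
        }
        return false;
    }

    // Returns +1 / -1 / 0 for a finished game, or null if play continues.
    static Integer terminalScore(String b) {
        if (wins(b, 'X')) return 1;
        if (wins(b, 'O')) return -1;
        return b.indexOf('.') < 0 ? Integer.valueOf(0) : null;
    }

    static String place(String b, int i, char m) {
        return b.substring(0, i) + m + b.substring(i + 1);
    }

    static int minimax(String b, boolean xToMove) {
        nodes++;
        Integer t = terminalScore(b);
        if (t != null) return t;
        int best = xToMove ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (int i = 0; i < 9; i++) {
            if (b.charAt(i) != '.') continue;
            int v = minimax(place(b, i, xToMove ? 'X' : 'O'), !xToMove);
            best = xToMove ? Math.max(best, v) : Math.min(best, v);
        }
        return best;
    }

    static int alphabeta(String b, int alpha, int beta, boolean xToMove) {
        nodes++;
        Integer t = terminalScore(b);
        if (t != null) return t;
        for (int i = 0; i < 9; i++) {
            if (b.charAt(i) != '.') continue;
            int v = alphabeta(place(b, i, xToMove ? 'X' : 'O'), alpha, beta, !xToMove);
            if (xToMove) alpha = Math.max(alpha, v); else beta = Math.min(beta, v);
            if (beta <= alpha) break; // skip the remaining siblings
        }
        return xToMove ? alpha : beta;
    }

    public static void main(String[] args) {
        String empty = ".........";
        nodes = 0;
        int plain = minimax(empty, true);
        long plainNodes = nodes;
        nodes = 0;
        int pruned = alphabeta(empty, Integer.MIN_VALUE, Integer.MAX_VALUE, true);
        System.out.println("minimax:    value " + plain + ", boards " + plainNodes);
        System.out.println("alpha-beta: value " + pruned + ", boards " + nodes);
    }
}
```

Both searches should agree on the value of the position, and the alpha-beta count should be noticeably smaller. If the two counts come out the same in your own implementation, the cut-off is probably never firing.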