New-ish engine coder, would appreciate if someone could look at my code and point me in the right directions

jauska · Post by **jauska** » Thu Jul 02, 2020 9:27 am

Hey guys,

Like I said, I'm relatively new to this. My engine is doing some unexpected things that I'm struggling to get to the bottom of, and I would really appreciate if anyone could spend a little time guiding me here.

So my engine at the moment (in the form below) mostly seems to make the right moves, but seemingly for the wrong reasons.
For example, if I give it this position: "rn1qkb1r/ppp1pppp/8/3p2B1/3Pn1b1/4PN1P/PPP2PP1/RN1QKB1R b KQkq - 0 5" (pic of board), it will find Bxf3 (the correct move), but with this evaluation (+ is black in this case because it's black to move):

Code: Select all

Lines:
(-0.47) Nxg5: 6 hxg4 Nxf3 7 Qxf3
[b](-0.01) Bxf3: 6 Qxf3 Nxg5 7 Qxd5[/b]
(-0.01) Nxf2: 6 hxg4 Nxd1 7 Rxh7
(-0.01) Bxh3: 6 Rxh3 Nxf2
(-590.0) b5: 6 Bxb5
(-1.23) Qc8: 6 h4 Bxf3 7 gxf3 Nxg5 8 hxg5
(-590.0) Kd7: 6 Ne5

Bxf3 should really be evaluated around +2.3, and it thinks white will play Qxd5. I have no idea why, and I also don't know why the evaluation is stopping here because the Qsearch shouldn't end so early (black can recapture the white queen after that move).

The other thing is, I feel there should be a cutoff somewhere. I only recently converted this to negamax so I'm not sure exactly where/what it should be, and it's possible that it was wrong before anyway. At the moment I have this line:

Code: Select all

// CUTOFF
        //if(a >= beta) break;

If I uncomment that, basically everything goes wrong. Every line except the first one cuts off immediately and it makes completely wrong moves. I'm also not sure that the cutoff I have in the stand pat part is right?

By the way, this is also running quite slow (I think). I'm getting about 50,000 nodes per second, which seems ok but I've seen a couple of people posting here saying they were getting millions a second lol. I was originally getting 300-400k/s before I made it object oriented, but that still seems slow right? So if you see anything that "looks slow" let me know too.

This code is in a language called Dart. It's basically like Java in terms of syntax, there's nothing particularly unusual about it. I tried to clean it up a bit but if there's anything that you're unsure of the function of I can clear it up.
I'm aware that this is only partial code - there's no evaluation code/data structures etc here, I just didn't want to post 1500 lines of meta functions and confuse everyone further so I just included the relevant bits, but if you have some suspicion about what's going on and want to see another part let me know.

Again, thanks massively in advance for reading my code.

Pastebin link if it's easier:

Code: Select all

  // Iterative Deepening Search
  // Calls mtdf at each depth until we hit the depth limit or run out of time
  List itSearch(Position pos, int timeLimit){
    nodesSearched = 0;
    tpLookups = 0;
    tpScore.clear(); // clear the transposition table for scores but not for moves

    Stopwatch timer = new Stopwatch()..start();
    int startTime = timer.elapsedMilliseconds;
    Move bestMove;
    double f = 0;
    int maxDepthSearched = 0;

    for(int depth = 1; depth <= MAX_SEARCH_DEPTH; depth++){
      int nodesSoFar = nodesSearched;
      tpScore.clear();
      pos.stage = 0;
      maxDepthSearched = depth;
      List _l = mtdf(pos, f, depth, timer, timeLimit);
      f = _l[0];
      if(true) bestMove = tpMove[pos.hash]; // if(_l[1]) -> only uses moves computed by a complete mtdf evaluation

      double timeTaken = (timer.elapsedMilliseconds - startTime) / 1000;
      int nodesHere = nodesSearched - nodesSoFar;
      double nps = nodesHere/timeTaken;
      print("Searched depth $depth: $nodesHere nodes in ${timeTaken.toStringAsFixed(2)}s (${nps.toStringAsFixed(2)} NPS)");
      print("Best move so far is ${bestMove.name}");
      if(timer.elapsedMilliseconds - startTime > timeLimit){
        print("Time limit hit");
        break;
      }
    }

    // This part is just for formatting lines to print to the console
    Map lines = Map();
    for(Move m in pos.moves){
      String mvName = formatMove(m);
      lines[mvName] = [(pos.board[PLAYER_TURN] == WHITE)?"${pos.board[FULLMOVE_COUNTER]} ":"", m.endPosition.eval];
      Position p = m.endPosition;
      Move nextMove = p.bestMove;
      bool first = true;
      while(p != null && nextMove != null){
        String next = formatMoveForList(p.board, nextMove, first);
        first = false;
        lines[mvName][0] += next;
        //print("Checking ${nextMove.name}");
        p = nextMove.endPosition;
        //print(p.bestMove);
        if(p != null) nextMove = p.bestMove;
        if(nextMove == null){
          //print("next move after ${p.previousMove.name} is null");
          break;
        }
      }
    }
    double timeTaken = (timer.elapsedMilliseconds - startTime) / 1000;
    double nps = nodesSearched/timeTaken;
    print("Iterative search terminated at depth $maxDepthSearched, having searched $nodesSearched nodes in ${timeTaken.toStringAsFixed(2)}s (${nps.toStringAsFixed(2)} NPS)");
    print("$tpLookups transposition table lookups");
    return [bestMove, f, lines];
  }
  
  // Aske Plaat's MTD(f) algorithm
  // Calls abm (Alpha-Beta with Memory) a number of times with a gradually converging search window
  List mtdf(Position pos, double target, int depth, Stopwatch timer, int timeLimit){
    double currentEstimate = target;
    var upper = MATE_UPPER;
    var lower = -MATE_UPPER;
    int pass = 1;
    while(lower < upper){
      double beta = max(currentEstimate, lower + 0.01);
      currentEstimate = -abm(pos, 1-pos.board[PLAYER_TURN], depth, 0, beta - 0.01, beta, true, false);
      print("MTDF pass $pass (currentEstimate: $currentEstimate), beta: ${beta.toStringAsFixed(2)} lower: ${lower.toStringAsFixed(2)}, upper: ${upper.toStringAsFixed(2)}");
      if(currentEstimate < beta) upper = currentEstimate;
      else lower = currentEstimate;
      pass++;
      if(timer.elapsedMilliseconds > timeLimit){
        print("Time limit hit in mtdf");
        return [currentEstimate, false];
        break;
      }
    }
    return [currentEstimate, true];
  }

  // Alpha-Beta with Memory (Negamax)
  double abm(Position pos, int player, int depth, int stage, double alpha, double beta, bool root, [bool allowNull = true]){
    //print("ABM");
    nodesSearched++;

    // Check transposition tables haven't grown too large
    if(tpMove.length > MAX_TRANSPOSITIONS){
      print("Clearing transposition table for moves (${tpMove.length})");
      tpMove.clear();
    }
    if(tpScore.length > MAX_TRANSPOSITIONS){
      print("Clearing transposition table for scores (${tpScore.length})");
      tpScore.clear();
    }
    
    // retrieve stored values from transposition table
    List _tpScore = tpScore[pos.dhash];
    if(_tpScore != null){
      tpLookups++;
      if(_tpScore[0] >= beta) return _tpScore[0];
      if(_tpScore[1] <= alpha) return _tpScore[1];
      alpha = max(alpha, _tpScore[0]);
      beta = min(beta, _tpScore[1]);
      //print("($stage) tp lookup $alpha $beta");
    }else{
      tpScore[pos.dhash] = [-MATE_UPPER, MATE_UPPER];
    }
    
    double g;
    if(tpMove[pos.hash] == null) tpMove[pos.hash] = pos.moves[0];
    bool qSearch = (depth <= 0); // are we in quiescence search?
                                 // if so, only generate noisy moves (captures)
    List<Move> _moves = (!qSearch)?pos.moves:pos.moves.where((_m) => !_m.quiet).toList();
    //print("Moves: ${_moves.map((x) => x.name).toList()}");
    
    // Null-Move Heuristic
    // CURRENTLY DISABLED FOR SIMPLIITY / DEBUGGING
    if(allowNull){
      Move _nullMove = Move.nullMove(pos);
      double eval = -abm(_nullMove.endPosition, 1-player, depth - 3, stage + 1, -beta, -beta+0.0001, false, false);
      if(eval >= beta){
        pos.eval = eval;
        return eval;
      }
    }
    // ---------

    // -10 means 10 moves into quiescence search, increasing doesn't change much except massively slowing everything down
    if(_moves.length == 0 || depth <= -10) g = (player != pos.analysisPlayer)?pos.score:-pos.score; // if we've hit the end of the evaluation
    else{
      if(qSearch){ // STAND PAT
        double stand_pat = (player != pos.analysisPlayer)?pos.score:-pos.score;
        if(stand_pat >= beta) return beta;
        if(alpha < stand_pat) alpha = stand_pat;
      }
      
      g = -MATE_UPPER;
      double a = alpha;

      for(Move m in _moves){
        if(m.valid){
          m.endPosition.eval = -abm(m.endPosition, 1-player, depth - 1, stage + 1, -beta, -a, false, false);
          
          if(m.endPosition.eval > g){
            tpMove[pos.hash] = m;
            pos.bestMove = m;
            g = m.endPosition.eval;
          }

          if(g > a){
            a = g;
            tpMove[pos.hash] = m;
            pos.bestMove = m;
          }
        }

        // CUTOFF
        //if(a >= beta) break;
      }
    }

    if(g <= alpha) tpScore[pos.dhash][1] = g; // fail low implies an upper bound
    if(g > alpha && g < beta){ // found an accurate minimax value
      tpScore[pos.dhash] = [g,g];
    }
    if(g >= beta) tpScore[pos.dhash][0] = g; // fail high implies a lower bound

    return g;
  }

Ras · Post by **Ras** » Thu Jul 02, 2020 11:34 am

I think the problem is with the handling of the root aspiration window.

jauska · Post by **jauska** » Thu Jul 02, 2020 2:04 pm

Ras wrote: ↑Thu Jul 02, 2020 11:34 am I think the problem is with the handling of the root aspiration window.

Interesting - what do you think I should do with it?

jauska · Post by **jauska** » Thu Jul 02, 2020 3:10 pm

Update: I've converted everything to use integers now instead of floating point numbers. Hasn't changed anything apart from maybe slightly improving the speed (~10%).

Sven · Post by **Sven** » Thu Jul 02, 2020 4:56 pm

In mtdf(), maybe you could try

Code: Select all

currentEstimate = abm(...)

instead of

Code: Select all

currentEstimate = -abm(...)

since you are still at the root ...

Sven · Post by **Sven** » Thu Jul 02, 2020 4:58 pm

jauska wrote: ↑Thu Jul 02, 2020 2:04 pm
Ras wrote: ↑Thu Jul 02, 2020 11:34 am I think the problem is with the handling of the root aspiration window.
Interesting - what do you think I should do with it?

I think that part looks ok, it follows the MTDF algorithm (but see my other post regarding the call of abm()).

Sven · Post by **Sven** » Thu Jul 02, 2020 5:03 pm

jauska wrote: ↑Thu Jul 02, 2020 9:27 am

Code: Select all

          if(m.endPosition.eval > g){
            tpMove[pos.hash] = m;
            pos.bestMove = m;
            g = m.endPosition.eval;
          }

          if(g > a){
            a = g;
            tpMove[pos.hash] = m;
            pos.bestMove = m;
          }

You only have a PV (and therefore a best move) if its score is within the alpha/beta window. Therefore you should only store the best move if (g > a) but not if (...eval > g).

Ras · Post by **Ras** » Thu Jul 02, 2020 5:53 pm

jauska wrote: ↑Thu Jul 02, 2020 2:04 pmInteresting - what do you think I should do with it?

The -0.01 in the eval output looks suspiciously like the 0.01 in that part of the code. How do you notice if you get an unexpected fail-low or fail-high? The window isn't re-opened, and there's no assert for that case.

jauska · Post by **jauska** » Thu Jul 02, 2020 6:23 pm

Sven wrote: ↑Thu Jul 02, 2020 5:03 pm
jauska wrote: ↑Thu Jul 02, 2020 9:27 am
Code: Select all
          if(m.endPosition.eval > g){
            tpMove[pos.hash] = m;
            pos.bestMove = m;
            g = m.endPosition.eval;
          }

          if(g > a){
            a = g;
            tpMove[pos.hash] = m;
            pos.bestMove = m;
          }
You only have a PV (and therefore a best move) if its score is within the alpha/beta window. Therefore you should only store the best move if (g > a) but not if (...eval > g).

But it's actually impossible to get to g>a without ..eval > g first?

jauska · Post by **jauska** » Thu Jul 02, 2020 6:24 pm

Sven wrote: ↑Thu Jul 02, 2020 4:56 pm In mtdf(), maybe you could try
Code: Select all
currentEstimate = abm(...)
instead of
Code: Select all
currentEstimate = -abm(...)
since you are still at the root ...

The minus signs have been messing with my head too but this is actually right - when I swapped them it just started giving me the wrong move. I'm pretty sure everything at the top level (itSearch / mtdf) is right but something weird is happening in my abm.

edit: actually I've changed my mind, I think you're onto something here. It doesn't work if I write that as +abm, but that's the correct implementation of mtdf so it should.. which means something more significant is wrong

New-ish engine coder, would appreciate if someone could look at my code and point me in the right directions

New-ish engine coder, would appreciate if someone could look at my code and point me in the right directions

Re: New-ish engine coder, would appreciate if someone could look at my code and point me in the right directions

Re: New-ish engine coder, would appreciate if someone could look at my code and point me in the right directions

Re: New-ish engine coder, would appreciate if someone could look at my code and point me in the right directions

Re: New-ish engine coder, would appreciate if someone could look at my code and point me in the right directions

Re: New-ish engine coder, would appreciate if someone could look at my code and point me in the right directions

Re: New-ish engine coder, would appreciate if someone could look at my code and point me in the right directions

Re: New-ish engine coder, would appreciate if someone could look at my code and point me in the right directions

Re: New-ish engine coder, would appreciate if someone could look at my code and point me in the right directions

Re: New-ish engine coder, would appreciate if someone could look at my code and point me in the right directions