Devlog of Leorik

Modern Times
Posts: 3703
Joined: Thu Jun 07, 2012 11:02 pm

Re: Devlog of Leorik

Post by Modern Times »

Even with the error margins and low number of games, it looks to be a success from Graham's games on CCRL 40/15, +76 Elo

http://ccrl.chessdom.com/ccrl/4040/cgi/ ... +opponents
lithander
Posts: 915
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Devlog of Leorik

Post by lithander »

Mike Sherwin wrote: Tue Dec 27, 2022 7:02 pm I read that Leorik's eval is sort of like a tiny NN (a poor man's NN). A full size NN also incorporates the square the kings are on. This may be done in Leorik by using 15x15 piece square tables where the king is always in the center. Then the pst is still used as an 8x8 table by pointing to the cell in the 15x15 table that places the kings on the 8x8 table at its current square. Jonathan tried this idea in Winter but only got positive result for the bishop table. Still it was worth +20 elo!
Technically, PSQTs are already a one-layer network (or perceptron), but I think in the chess programming world when you mention NN people think about the big NNUEs. Unless you have a multi-layer network that can learn to evaluate a position based on how the pieces are in relation to each other, I think the term "neural network" creates more confusion than it helps.

But I like your idea! I've thought about how to make my eval more aware of the king position. The obvious idea was to introduce 64x the amount of weights (a set per king position) or even 4096x the amount of weights (a set per king-pair) but that felt like overkill, so I thought about defining 16 regions or something like that. But your idea is much simpler! At least conceptually, but I'm worried about the performance impact. Reading from the right "set" of weights would allow me to keep the current performance while your idea adds a considerable overhead. So is it worth it? I guess the only way to answer that is to implement it and try it out.
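
For illustration, here is a minimal sketch in C of how the 15x15 lookup described above might be indexed. It is a sketch under assumptions, not Leorik or Winter code: the names wide_index, view_base and king_relative_value are hypothetical, and squares are assumed to be numbered 0..63 with a1 = 0. The king sits at the center cell (7,7) and every board square maps to the cell at the same file/rank offset from the king.

```c
enum { WIDE = 15, CENTER = 7 };

/* Index of the 15x15 cell that corresponds to board square `sq`
   when the king stands on `king_sq` (both 0..63, a1 = 0). */
int wide_index(int king_sq, int sq)
{
    int d_file = (sq & 7) - (king_sq & 7);    /* -7..+7 */
    int d_rank = (sq >> 3) - (king_sq >> 3);  /* -7..+7 */
    return (CENTER + d_rank) * WIDE + (CENTER + d_file);
}

/* Cell that square 0 (a1) maps to. It only changes when the king moves,
   so it can be cached and reused for every lookup in between. */
int view_base(int king_sq)
{
    return wide_index(king_sq, 0);
}

/* Read the 8x8 "window" of a 225-entry table through the cached base:
   rows of the window are 15 cells apart, not 8. */
int king_relative_value(const int *table, int king_sq, int sq)
{
    int base = view_base(king_sq);
    return table[base + (sq >> 3) * WIDE + (sq & 7)];
}
```

Since the base only depends on the king square, the per-lookup cost is one multiply-add more than a plain 8x8 table; whether that overhead is acceptable in practice would still have to be measured.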

And the good thing is I'm motivated again to continue on Leorik. But I'm at a point where I've begun to ask myself if it's not time to consider real neural networks. I already use a machine-learning approach. Am I just wasting time when I try to improve my current eval based on a big linear function instead of considering the NNUE approach, even though it's been proven time and time again to be the most powerful technique available? And I just recently realized that it doesn't have to be a massive net and billions of positions for training like in Stockfish to be viable. It sounds almost doable with what I already have, without relying on external training data (which I just weaned myself off) or better engines for labeling...

So, what do you guys think? Should I continue to squeeze more out of my current approach or dip my toes into "real" NN stuff? (Consider it a poll! but unlike Elon I'm not saying that I'll abide by the results, haha)
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Devlog of Leorik

Post by Mike Sherwin »

lithander wrote: Sun Jan 01, 2023 11:08 pm
Mike Sherwin wrote: Tue Dec 27, 2022 7:02 pm I read that Leorik's eval is sort of like a tiny NN (a poor man's NN). A full size NN also incorporates the square the kings are on. This may be done in Leorik by using 15x15 piece square tables where the king is always in the center. Then the pst is still used as an 8x8 table by pointing to the cell in the 15x15 table that places the kings on the 8x8 table at its current square. Jonathan tried this idea in Winter but only got positive result for the bishop table. Still it was worth +20 elo!
Technically, PSQTs are already a one-layer network (or perceptron), but I think in the chess programming world when you mention NN people think about the big NNUEs. Unless you have a multi-layer network that can learn to evaluate a position based on how the pieces are in relation to each other, I think the term "neural network" creates more confusion than it helps.

But I like your idea! I've thought about how to make my eval more aware of the king position. The obvious idea was to introduce 64x the amount of weights (a set per king position) or even 4096x the amount of weights (a set per king-pair) but that felt like overkill, so I thought about defining 16 regions or something like that. But your idea is much simpler! At least conceptually, but I'm worried about the performance impact. Reading from the right "set" of weights would allow me to keep the current performance while your idea adds a considerable overhead. So is it worth it? I guess the only way to answer that is to implement it and try it out.

And the good thing is I'm motivated again to continue on Leorik. But I'm at a point where I've begun to ask myself if it's not time to consider real neural networks. I already use a machine-learning approach. Am I just wasting time when I try to improve my current eval based on a big linear function instead of considering the NNUE approach, even though it's been proven time and time again to be the most powerful technique available? And I just recently realized that it doesn't have to be a massive net and billions of positions for training like in Stockfish to be viable. It sounds almost doable with what I already have, without relying on external training data (which I just weaned myself off) or better engines for labeling...

So, what do you guys think? Should I continue to squeeze more out of my current approach or dip my toes into "real" NN stuff? (Consider it a poll! but unlike Elon I'm not saying that I'll abide by the results, haha)
With NNUE we are getting to the point where there are no interesting new engines. But if you can create a personality within an NNUE like a Fischer or an Alekhine and we can play against any great player of the past or even the present I'd say go for it. As for the 15 x 15 idea, why not, how long could it take? Maybe an NNUE using 15 x 15 is possible allowing the NNUE to be smaller. And it would be something new. However, the idea that will destroy NNUE when someone finally gets around to it is real time learning looking deep beyond the horizon to the end of the game. So do you follow the crowd or do you blaze a new path?
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Devlog of Leorik

Post by Mike Sherwin »

Idea: Do you use your pstbl's to help with LMR? If a piece moves from a low-value square to a high-value square, should it still be a candidate for LMR? Or, if it moves from a high-value square to a low-value one, should the LMR reduction be more aggressive?
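
A rough sketch of that idea in C, under stated assumptions: the function name lmr_reduction, the threshold PST_DELTA_THRESHOLD and the one-ply adjustment are all hypothetical values chosen for illustration, not taken from RomiChess or Leorik, and would need tuning.

```c
/* Hypothetical threshold (in centipawns) for a "significant" PST change. */
enum { PST_DELTA_THRESHOLD = 20 };

/* Adjust a base LMR reduction by how much the piece-square value of the
   moving piece changes between its origin and destination square. */
int lmr_reduction(int base_reduction, int pst_from, int pst_to)
{
    int delta = pst_to - pst_from;
    int r = base_reduction;

    if (delta >= PST_DELTA_THRESHOLD)
        r -= 1;            /* low -> high square: reduce less */
    else if (delta <= -PST_DELTA_THRESHOLD)
        r += 1;            /* high -> low square: reduce more */

    return r < 0 ? 0 : r;  /* never return a negative reduction */
}
```

With the values assumed here, a move that gains 30 points of table value is searched one ply deeper than the base reduction would allow, and a move that loses 30 is searched one ply shallower.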
j.t.
Posts: 263
Joined: Wed Jun 16, 2021 2:08 am
Location: Berlin
Full name: Jost Triller

Re: Devlog of Leorik

Post by j.t. »

lithander wrote: Sun Jan 01, 2023 11:08 pm So, what do you guys think? Should I continue to squeeze more out of my current approach or dip my toes into "real" NN stuff? (Consider it a poll! but unlike Elon I'm not saying that I'll abide by the results, haha)
+1 from me for "handcrafted" eval. No very strong reason, except that maybe it would be a contrast to most of the other new engines. But I guess that for Leorik's playing strength NNUE would be the obvious choice.
emadsen
Posts: 440
Joined: Thu Apr 26, 2012 1:51 am
Location: Oak Park, IL, USA
Full name: Erik Madsen

Re: Devlog of Leorik

Post by emadsen »

Excellent progress Thomas! I'm glad you had success tuning your evaluation from Leorik's own games. I need to add evaluation terms to MadChess to catch Leorik, lol. It's never-ending, isn't it?

Regarding your question about hand-crafted versus neural net evaluation, that's really up to you. Hand-crafted enables you to understand why your engine prefers one position over another. Indeed, you can add a non-standard UCI command that displays the static score of a given position, broken down by contributing terms (material, passed pawns, piece location, piece mobility, etc). Neural nets are a black box, but likely stronger.
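
As a minimal sketch of such a breakdown in C (the struct EvalTerms, its field names and the output layout are assumptions made for illustration, not MadChess or Leorik code):

```c
#include <stdio.h>

/* Hypothetical per-term scores in centipawns, from White's point of view. */
typedef struct {
    int material;
    int passed_pawns;
    int piece_location;
    int piece_mobility;
} EvalTerms;

/* The static score is simply the sum of the contributing terms. */
int eval_total(const EvalTerms *t)
{
    return t->material + t->passed_pawns + t->piece_location + t->piece_mobility;
}

/* Write a human-readable breakdown into `out`; returns what snprintf returns. */
int format_eval(const EvalTerms *t, char *out, size_t size)
{
    return snprintf(out, size,
                    "material       %5d\n"
                    "passed pawns   %5d\n"
                    "piece location %5d\n"
                    "piece mobility %5d\n"
                    "total          %5d\n",
                    t->material, t->passed_pawns, t->piece_location,
                    t->piece_mobility, eval_total(t));
}
```

Keeping the terms in a struct instead of adding them straight into one accumulator is what makes this kind of report cheap to bolt on.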

Do whatever aligns with your preferences and goals for your project.
Erik Madsen | My C# chess engine: https://www.madchess.net
lithander
Posts: 915
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Devlog of Leorik

Post by lithander »

Thanks for your input, everyone!
Mike Sherwin wrote: Mon Jan 02, 2023 12:01 am With NNUE we are getting to the point where there are no interesting new engines. [...] However, the idea that will destroy NNUE when someone finally gets around to it is real time learning looking deep beyond the horizon to the end of the game. So do you follow the crowd or do you blaze a new path?
I don't know if NNUE is a one-way road that prevents interesting things from being done. That Komodo Dragon uses MCTS in combination with alphabeta and NNUE eval is pretty interesting, isn't it? I think that if you want to improve upon NNUE you need to understand it first... and the way I like to prove to myself that I understand something is to see if I can build it.
j.t. wrote: Mon Jan 02, 2023 2:12 pm +1 from me for "handcrafted" eval. No very strong reason, except that maybe it would be a contrast to most of the other new engines. But I guess that for Leorik's playing strength NNUE would be the obvious choice.
I think it's more than just a lazy way of climbing the Elo ladder. I really like the idea that a multi-layered net can see how the pieces are in relation to each other and doesn't rely on me to spell out the interesting features for it. It isn't a heavily discussed topic here but I imagine there's just as much room for creativity when it comes to generating, labeling and filtering data or coming up with network architectures as in HCE.
emadsen wrote: Wed Jan 04, 2023 1:17 am Regarding your question about hand-crafted versus neural net evaluation, that's really up to you. Hand-crafted enables you to understand why your engine prefers one position over another. Indeed, you can add a non-standard UCI command that displays the static score of a given position, broken down by contributing terms (material, passed pawns, piece location, piece mobility, etc). Neural nets are a black box, but likely stronger.
I have an "eval" command in Leorik too, and you're right that this transparency of how the engine actually comes to its conclusion is something I would hate to give up. But it could be interesting to try and visualize the reasoning of the neural net to make it more intuitive for humans. Something like https://arxiv.org/pdf/2111.09259.pdf#page=21 but because the networks can be much simpler for NNUE maybe it's actually practical.
emadsen wrote: Wed Jan 04, 2023 1:17 am Do whatever aligns with your preferences and goals for your project.
I'm not considering NNUE for the Elo gain alone but because it's a great learning opportunity for me. I haven't done much with machine learning yet. But I'd like to stay the course insofar as I'm trying to do it from scratch as much as possible. I want to start small and simple and compute my own training data, and for that I'll use Leorik of course. And the probability of success increases the stronger classic Leorik is.

So I've decided to work on at least a version 2.4 first :)
Mike Sherwin wrote: Mon Jan 02, 2023 12:01 am As for the 15 x 15 idea, why not, how long could it take?
I fear it could actually take longer than you think. PSQTs are dumb but they are very fast, especially with incremental updates they are practically free. And countless ideas I had that improved the predictive quality of my eval on the dataset (MSE going down quite a bit) didn't translate into an actual strength gain for the engine because the performance overhead of computing these improved eval terms at runtime was negating all the potential benefit of a better eval. E.g. using node-limits instead of time-limits the engine was noticeably stronger, but not under real-time conditions.
So for everything I successfully added to Leorik's original eval there's a trick to keep the performance up: For PSQTs it's incremental updates. For the pawn structure features it's a pawn hash table. And for mobility... well... I don't update mobility terms in Qsearch, when a piece gets captured its mobility bonus is not removed from the eval and surprisingly I get away with it.
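
To make the "practically free" claim concrete, here is a minimal sketch of incremental PSQT updates, written in C to match the other code in this thread. It is not Leorik's implementation: the Position struct, the helper names and the single-sided, single-phase table are simplifying assumptions.

```c
enum { PIECE_TYPES = 6, SQUARES = 64, EMPTY = -1 };

/* Hypothetical piece-square table, one side and one game phase only. */
int psqt[PIECE_TYPES][SQUARES];

typedef struct {
    int board[SQUARES];  /* piece type on each square, or EMPTY */
    int score;           /* running PSQT sum, kept in sync by the helpers */
} Position;

void position_clear(Position *p)
{
    for (int sq = 0; sq < SQUARES; sq++)
        p->board[sq] = EMPTY;
    p->score = 0;
}

void put_piece(Position *p, int piece, int sq)
{
    p->board[sq] = piece;
    p->score += psqt[piece][sq];        /* one table read per change */
}

void remove_piece(Position *p, int sq)
{
    p->score -= psqt[p->board[sq]][sq];
    p->board[sq] = EMPTY;
}

/* A move touches at most three table entries instead of rescanning the board. */
void move_piece(Position *p, int from, int to)
{
    int piece = p->board[from];
    if (p->board[to] != EMPTY)
        remove_piece(p, to);            /* capture */
    remove_piece(p, from);
    put_piece(p, piece, to);
}

/* Slow reference implementation, useful only for checking the increments. */
int full_recompute(const Position *p)
{
    int sum = 0;
    for (int sq = 0; sq < SQUARES; sq++)
        if (p->board[sq] != EMPTY)
            sum += psqt[p->board[sq]][sq];
    return sum;
}
```

Any eval term that depends on more than the moved piece and its two squares cannot be updated this cheaply, which is the cost problem described above.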

But the king-relative bonus is a nice idea and on my list of things to try. First I want to revisit the move-generator though... performance improvements are always a net positive if I can make some there and HGM wrote some interesting posts that I have bookmarked.
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Devlog of Leorik

Post by Mike Sherwin »

lithander wrote: Fri Jan 06, 2023 2:13 am PSQTs are dumb but they are very fast, especially with incremental updates they are practically free. And countless ideas I had that improved the predictive quality of my eval on the dataset (MSE going down quite a bit) didn't translate into an actual strength gain for the engine because the performance overhead of computing these improved eval terms at runtime was negating all the potential benefit of a better eval.
I'm not sure I understand. Modifying the PSQTs at run time the way I do it is practically free. Before the search even starts let's say there are white pawns on d3, g2, no white pawn on e4 and a white bishop on f1. Then all I do is bump up the values for a white pawn on g3 and a white bishop on g2. If there is a white pawn on d2 then a white bishop on d3 is bumped down. And the same for black. You can add one instance at a time and if it helps keep it.
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Devlog of Leorik

Post by Mike Sherwin »

The fans of RomiChess like it for its human-like play. That was achieved by putting human understanding, as best I could, into the PSQTs before the search. Also that got Romi some elo! :D

Code:

    if(!(wBlocker[sq] & bPawns) && (wFrontal[sq] & bPawns))
    {
      if(ray8p[sq] & below[FirstBit(ray8p[sq] & bPawns)] & wPawns)
      {
        wQueenTbl[sq] += 4;
        wRookTbl[sq] -= 4;
        wBishopTbl[sq] += (8 + row);
        wKnightTbl[sq] += (12 + row);
        wKingTbl[sq] += (10 + row);
      }else
      {
        wQueenTbl[sq] +=8;
        wRookTbl[sq] += 6;
        wBishopTbl[sq] += (8 + row * 2);
        wKnightTbl[sq] += (12 + row * 2);
        wKingTbl[sq] += (20 + row * 4);
      }
    }

    if(!(bBlocker[sq] & wPawns) && (bFrontal[sq] & wPawns))
    {
      if(ray8m[sq] & above[LastBit(ray8m[sq] & wPawns)] & bPawns)
      {
        bQueenTbl[sq] += 4;
        bRookTbl[sq] -= 4;
        bBishopTbl[sq] += (8 + (7 - row));
        bKnightTbl[sq] += (12 + (7 - row));
        bKingTbl[sq] += (10 + (7 - row));
      }else
      {
        bQueenTbl[sq] += 8;
        bRookTbl[sq] += 6;
        bBishopTbl[sq] += (8 + (7 - row) * 2);
        bKnightTbl[sq] += (12 + (7 - row) * 2);
        bKingTbl[sq] += (20 + (7 - row) * 4);
      }
    }

    if(row == ROW_7)
    {
      wRookTbl[sq] += 40;
      wRookTbl[sq + 8] += 30;
    }

    if(row == ROW_2)
    {
      bRookTbl[sq] += 40;
      bRookTbl[sq - 8] += 30;
    }
  }

  // Blocked Pawn Penalties
  if(wPawns & D2bit)
  {
    wBishopTbl[D3] -= 30;
    wKnightTbl[D3] -= 24;
    wRookTbl[D3] -= 16;
    wQueenTbl[D3] -= 12;
  }

  if(wPawns & E2bit)
  {
    wBishopTbl[E3] -= 30;
    wKnightTbl[E3] -= 24;
    wRookTbl[E3] -= 16;
    wQueenTbl[E3] -= 12;
  }

  if(bPawns & D7bit)
  {
    bBishopTbl[D6] -= 30;
    bKnightTbl[D6] -= 24;
    bRookTbl[D6] -= 16;
    bQueenTbl[D6] -= 12;
  }

  if(bPawns & E7bit)
  {
    bBishopTbl[E6] -= 30;
    bKnightTbl[E6] -= 24;
    bRookTbl[E6] -= 16;
    bQueenTbl[E6] -= 12;
  }

  // Keep Pawns at Home if King is on same side
  if(bMat > 4600) {
    if(wKings & (E1bit | F1bit | G1bit | H1bit)) {
      wPawnTbl[F2] = 20;
      wPawnTbl[G2] = 20;
      wPawnTbl[H2] = 20;
      wPawnTbl[F3] = 16;
      wPawnTbl[G3] = 16;
      wPawnTbl[H3] = 16; }

    if(wKings & (E1bit | D1bit | C1bit | B1bit | A1bit)) {
      wPawnTbl[C2] = 20;
      wPawnTbl[B2] = 20;
      wPawnTbl[A2] = 20;
      wPawnTbl[C3] = 16;
      wPawnTbl[B3] = 16;
      wPawnTbl[A3] = 16; } }

  if(wMat > 4600) {
    if(bKings & (E8bit | F8bit | G8bit | H8bit)) {
      bPawnTbl[F7] = 20;
      bPawnTbl[G7] = 20;
      bPawnTbl[H7] = 20;
      bPawnTbl[F6] = 16;
      bPawnTbl[G6] = 16;
      bPawnTbl[H6] = 16; }

    if(bKings & (E8bit | D8bit | C8bit | B8bit | A8bit)) {
      bPawnTbl[C7] = 10;
      bPawnTbl[B7] = 20;
      bPawnTbl[A7] = 20;
      bPawnTbl[C6] = 16;
      bPawnTbl[B6] = 16;
      bPawnTbl[A6] = 16; } }

  // Penalty if piece blocks bishop 
  if(wBishops & F1bit) {
    wQueenTbl[E2] -= 40;
    wQueenTbl[D3] -= 36;
    wRookTbl[E2] -= 30;
    wRookTbl[D3] -= 24;
    if(wPawns & G2bit)   {
      wKnightTbl[E2] -= 20;
      if(wPawns & (E2bit | D3bit)) {
        wPawnTbl[G3] += 20;
        wBishopTbl[G2] += 20;
        wPawnTbl[E2] -= 10; } } }

  if(wBishops & C1bit) {
    wQueenTbl[D2] -= 40;
    wQueenTbl[E3] -= 36;
    wRookTbl[D2] -= 30;
    wRookTbl[E3] -= 24;
    if(wPawns & B2bit)   {
      wKnightTbl[D2] -= 10;
      if(wPawns & (D2bit | E3bit)) {
        wPawnTbl[B3] += 20;
        wBishopTbl[B2] += 20;
        wPawnTbl[D2] -= 10; } } }

  if(bBishops & F8bit) {
    bQueenTbl[E7] -= 40;
    bQueenTbl[D6] -= 36;
    bRookTbl[E7] -= 30;
    bRookTbl[D6] -= 24;
    if(bPawns & G7bit)   {
      bKnightTbl[E7] -= 20;
      if(bPawns & (E7bit | D6bit)) {
        bPawnTbl[G6] += 20;
        bBishopTbl[G7] += 20;
        bPawnTbl[E7] -= 10; } } }

  if(bBishops & C8bit) {
    bQueenTbl[D7] -= 40;
    bQueenTbl[E6] -= 36;
    bRookTbl[D7] -= 30;
    bRookTbl[E6] -= 24;
    if(bPawns & B7bit)   {
      bKnightTbl[D7] -= 10;
      if(bPawns & (D7bit | E6bit)) {
        bPawnTbl[B6] += 20;
        bBishopTbl[B7] += 20;
        bPawnTbl[D7] -= 10; } } }

  // Penalty for not Castling
  wKingTbl[H1] = wKingTbl[G1];
  if(wIndexs & WCASKbit) {
    wKingTbl[F1] = wKingTbl[E1] - 60;
    wKingTbl[G1] -= 60;
    wKingTbl[D1] = wKingTbl[E1] - 60;
    wKingTbl[C1] -= 60;
    wKingTbl[B1] -= 60; }
  wKingTbl[A1] = wKingTbl[B1];

  bKingTbl[H8] = bKingTbl[G8];
  if(bIndexs & BCASKbit) {
    bKingTbl[F8] = bKingTbl[E8] - 60;
    bKingTbl[G8] -= 60; 
    bKingTbl[D8] = bKingTbl[E8] - 60;
    bKingTbl[C8] -= 60;
    bKingTbl[B8] -= 60; }
  bKingTbl[A8] = bKingTbl[B8];

  // Penalty for not moving center pawns
  if(wPawns & (E2bit | D2bit)) {
    wPawnTbl[E2] -= 16;
    wPawnTbl[E3] -= 8;
    wPawnTbl[D2] -= 16;
    wPawnTbl[D3] -= 8; 
    if(bPawns & E5bit)
      wPawnTbl[E4] += 30;
    if(bPawns & D5bit)
      wPawnTbl[D4] += 30; }

  if(bPawns & (E7bit | D7bit)) {
    bPawnTbl[E7] -= 16;
    bPawnTbl[E6] -= 8;
    bPawnTbl[D7] -= 16;
    bPawnTbl[D6] -= 8;
    if(wPawns & E4bit)
      bPawnTbl[E5] += 30;
    if(wPawns & D4bit)
      bPawnTbl[D5] += 30; }

  // Encourage development of minor Pieces
  if(ply < 20) { 
    wKnightTbl[G1] -= 24;
    wKnightTbl[B1] -= 24;
    wBishopTbl[F1] -= 20;
    wBishopTbl[C1] -= 20;
    bKnightTbl[G8] -= 24;
    bKnightTbl[B8] -= 24;
    bBishopTbl[F8] -= 20;
    bBishopTbl[C8] -= 20; }

  // Encourage C2 and C7 Pawns to move
  if((wPawns & C2bit) && !(wKings & (D1bit | C1bit | B1bit | A1bit))) {
    if(wPawns & (E2bit | E3bit)) {
      wKnightTbl[C3] -= 20;
      wKnightTbl[D2] += 10;
      wPawnTbl[D2] -= 20;
      wPawnTbl[D3] -= 20;
      wPawnTbl[C2] -= 10;
      wPawnTbl[C3] += 10;
      wPawnTbl[C4] += 40; }
    else
    if((wPawns & D4bit) && (bPawns & (D5bit | D6bit | D7bit))) {
      wKnightTbl[C3] -= 20; 
      if(wPawns & (B2bit | B3bit)) {
        wPawnTbl[B3] += 20;
        wPawnTbl[C3] += 10;
        wPawnTbl[C4] += 40; }
      else {
        wPawnTbl[C2] -= 10;
        wPawnTbl[C3] += 30; } }
    else
    if((wPawns & E4bit) && (bPawns & (E5bit | C5bit)) &&
       (wPawns & (D2bit | D3bit))) {
      wPawnTbl[C3] += 30;
      wPawnTbl[D4] += 20;
      wKnightTbl[C3] -= 20;
      wKnightTbl[D2] += 20; } } 

  if((bPawns & C7bit) && !(bKings & (D8bit | C8bit | B8bit | A8bit))) {
    if(bPawns & (E7bit | E6bit)) {
      bKnightTbl[C6] -= 20;
      bKnightTbl[D7] += 20;
      bPawnTbl[D7] -= 20;
      bPawnTbl[D6] -= 20;
      bPawnTbl[C7] -= 6;
      bPawnTbl[C6] += 6;
      bPawnTbl[C5] += 40; }
    else
    if((bPawns & D5bit) && (wPawns & (D4bit | D3bit | D2bit))) {
      bKnightTbl[C6] -= 20;
      bKnightTbl[D7] += 10;
      if(bPawns & (B7bit | B6bit)) {
        bPawnTbl[B6] += 20;
        bPawnTbl[C6] += 10;
        bPawnTbl[C5] += 40; }
      else {
        bPawnTbl[C7] -= 10;
        bPawnTbl[C6] += 20; } }
    else
    if((bPawns & E5bit) && (wPawns & (E4bit | C4bit)) &&
       (bPawns & (D7bit | D6bit))) {
      bPawnTbl[C6] += 20;
      bPawnTbl[D5] += 20;
      bKnightTbl[C6] -= 20;
      bKnightTbl[D7] += 20; } } 
  
  // Discourage Pawn Moves from row 4 to row 5 releasing tension
  wPawnTbl[C5] = wPawnTbl[C4] - 24;
  wPawnTbl[D5] = wPawnTbl[D4] - 20;
  wPawnTbl[E5] = wPawnTbl[E4] - 20;
  bPawnTbl[C4] = bPawnTbl[C5] - 24;
  bPawnTbl[D4] = bPawnTbl[D5] - 20;
  bPawnTbl[E4] = bPawnTbl[E5] - 20;

  // Discourage King Moves into corner when not needed
  wKingTbl[A1] = wKingTbl[B1] - 10;
  wKingTbl[H1] = wKingTbl[G1] - 10;
  bKingTbl[A8] = bKingTbl[B8] - 10;
  bKingTbl[H8] = bKingTbl[G8] - 10;
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Devlog of Leorik - *New* Version 2.3

Post by algerbrex »

lithander wrote: Thu Dec 22, 2022 1:28 am I finally released Version 2.3!

Here is a gauntlet I ran with a few engines of similar strength. All Elo values except Leorik's are fixed.

Code:

   # PLAYER           :  RATING  POINTS  PLAYED   (%)
   1 Inanis-1.1.1     :  2767.0   323.5     620    52
   2 odonata-0.6.2    :  2744.0   298.5     618    48
   3 Leorik-2.3       :  2741.3  1960.5    3716    53
   4 zahak-5.0        :  2730.0   295.5     620    48
   5 dumb-1.9         :  2703.0   325.0     620    52
   6 blunder-8.5.5    :  2700.0   255.0     620    41
   7 Supernova-2.4    :  2687.0   258.0     618    42
The EAS tool shows some favorable stats for Leorik. Short games, a healthy amount of sacrifices and only few bad draws.

Code:

Rank  EAS-Score  sacs   shorts  draws  moves  Engine/player 
-------------------------------------------------------------------
   1    132596  15.02%  22.27%  13.46%   68   Leorik-2.3  
   2     57194  11.00%  04.00%  18.79%   85   zahak-5.0  
   3     52975  06.99%  05.38%  17.33%   83   odonata-0.6.2  
   4     50228  09.28%  13.50%  27.03%   76   dumb-1.9  
   5     45085  06.67%  10.00%  24.06%   84   Supernova-2.4  
   6     44044  04.19%  10.23%  22.95%   78   Inanis-1.1.1  
   7     34561  06.85%  09.59%  26.24%   81   blunder-8.5.5  
If you have not read my previous posts the small strength increase over version 2.2 may be a disappointment.

But the goal of this new version was to rebuild the evaluation from scratch, no longer relying on any 3rd party dataset or engine for labeling. I purged all knowledge borrowed from Zurichess and Stockfish, and that there still is a strength increase at all is more than I expected!

Also this bodes well for the future: Being able to train my weights on selfplay games means I can effortlessly create larger datasets so that in future versions the evaluation can be extended to pick up on rarer and rarer features. (e.g. in the way that Mike suggested)

For the human players I want to encourage you to use the new UCI options Midgame Randomness and Endgame Randomness that force the engine to assign a random cp bonus to each root move while retaining its usual speed and search depth. I originally added it for data generation but it's also a really nice way to adjust the engine's difficulty level in a way that feels somewhat natural. (to me^^)
Congratulations! I'm sure it feels great to have a more pure engine finally and the rating jump for Leorik 2.3 in the CCRL of over 50 Elo is very impressive. I think you've inspired me to get back into chess programming soon, I can't let Leorik get too far ahead ;-)