Devlog of Leorik

Discussion of chess software programming and technical issues.

Moderator: Ras

algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Devlog of Leorik

Post by algerbrex »

I just finished playing a very fun game between Leorik (white) and Blunder (black) since I hadn't actually seen the two engines play a game myself before. Both engines seemed to be playing pretty well for most of the game, until move 47.



I was curious why 47. Ne5 was such a bad blunder, and did a little analysis; it appears to lead to Zugzwang for White, something I had never personally seen before in a game, engine or human:


[fen]8/8/4k3/p3Pp2/5K2/1P6/8/8 w - - 3 50[/fen]


Taking the position and having Leorik analyze it after the game shows it can clearly spot the Zugzwang after only a couple of plies of searching, so I'm not posting this as some sort of bug report, since it was a blunder likely due to the bullet time control I chose. Just thought it was an interesting example of null-move pruning probably going wrong at lower depths :)
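
For readers wondering why Zugzwang and null-move pruning clash: the null move assumes that passing is never better than the best real move, which is exactly what fails in Zugzwang. A common guard is to skip the null move when the side to move has only king and pawns. The sketch below is a minimal illustration of that guard applied to a FEN string; the function names are made up for this example and nothing here is taken from Leorik's or Blunder's source.

```python
def has_non_pawn_material(fen: str) -> bool:
    """Return True if the side to move has any piece besides king and pawns."""
    placement, side_to_move = fen.split()[:2]
    # White pieces are uppercase in FEN, black pieces are lowercase.
    pieces = "NBRQ" if side_to_move == "w" else "nbrq"
    return any(ch in pieces for ch in placement)


def null_move_allowed(fen: str, in_check: bool = False) -> bool:
    """Common heuristic: no null move when in check or in pawn-only endings."""
    return not in_check and has_non_pawn_material(fen)


# The position from the post above: White has only king and pawns.
fen = "8/8/4k3/p3Pp2/5K2/1P6/8/8 w - - 3 50"
print(null_move_allowed(fen))  # False
```

A search would call such a check at every node, so it also applies to pawn endings that only arise deeper in the tree after exchanges.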
lithander
Posts: 915
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Devlog of Leorik

Post by lithander »

algerbrex wrote: Tue May 31, 2022 7:37 pm I was curious why 47. Ne5 was such a bad blunder, and did a little analysis; it appears to lead to Zugzwang for White, something I had never personally seen before in a game, engine or human:
Uh that's a serious blunder. With Leorik 1.0 (that does no unsafe prunings) the PV switches from f3e5 to the best move e3d3 at depth 5 already. For Leorik 2.1 it takes until depth 16 for the PV to switch to e3d3. But reaching depth 16 takes only 80ms on my machine... How fast were your time control settings exactly?

I hope there's nothing more serious going on here than just a lack of processing time. I might run a few fast games through an analysis engine to identify blunders like that and see how long it takes for the PV to switch to the correct line. Could lead to a valuable set of positions for comparing different versions of the engine. Time to depth measures just the performance but time-to-bestmove could be a useful metric to approximate playing-strength.
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Devlog of Leorik

Post by algerbrex »

lithander wrote: Wed Jun 01, 2022 5:14 pm Uh that's a serious blunder. With Leorik 1.0 (that does no unsafe prunings) the PV switches from f3e5 to the best move e3d3 at depth 5 already. For Leorik 2.1 it takes until depth 16 for the PV to switch to e3d3. But reaching depth 16 takes only 80ms on my machine... How fast were your time control settings exactly?
Hmm, for me Leorik realizes its mistake at depth 7:

go
info depth 1 score cp 3 nodes 22 nps 2200 time 10 pv f3e5
info depth 2 score cp -1 nodes 60 nps 2142 time 28 pv f3e5 c5d5
info depth 3 score cp -15 nodes 258 nps 8322 time 31 pv f3e5 c5b4 e5d7
info depth 4 score cp -24 nodes 624 nps 18909 time 33 pv f3e5 c5d5 e5d3 f6d4
info depth 5 score cp -27 nodes 1517 nps 42138 time 36 pv f3e5 c5b4 e5c4 b4b3 c4a5
info depth 6 score cp -27 nodes 1895 nps 49868 time 38 pv f3e5 c5b4 e5c4 b4b3 c4a5 b3b4
info depth 7 score cp -19 nodes 5350 nps 127380 time 42 pv e3d3 c5b4 d3c2 f6e7 f3e5 e7d6 e5d3
info depth 8 score cp -19 nodes 7779 nps 176795 time 44 pv e3d3 c5b4 d3c2 f6e7 f3e5 e7d6 e5d3 b4a3
info depth 9 score cp -20 nodes 11110 nps 236382 time 47 pv e3d3 c5d5 f3d2 f6e7 d2c4 e7b4 d3e3 b4c5 e3d3
info depth 10 score cp -18 nodes 17116 nps 322943 time 53 pv e3d3 c5d5 f3d2 f6e7 d2c4 e7b4 d3e3 b4c5 e3d3 c5b4
info depth 11 score cp -16 nodes 26184 nps 422322 time 62 pv e3d3 c5d5 f3d2 f6e7 d2c4 e7b4 d3e3 d5e6 c4e5 b4c5 e3d3
info depth 12 score cp -17 nodes 46186 nps 584632 time 79 pv e3d3 c5b4 d3c2 f6g7 f3e5 b4c5 e5c4 a5a4 c4e3 a4b3 c2b3 g7h6
info depth 13 score cp -18 nodes 57060 nps 641123 time 89 pv e3d3 c5b4 d3c2 f6g7 f3e5 b4c5 e5c4 a5a4 c4e3 a4b3 c2b3 g7h6 b3c3
info depth 14 score cp -18 nodes 68630 nps 700306 time 98 pv e3d3 c5b4 d3c2 f6g7 f3e5 b4c5 e5c4 a5a4 c4e3 a4b3 c2b3 g7h6 b3c3 h6f4
info depth 15 score cp -18 nodes 110897 nps 866382 time 128 pv e3d3 c5b4 d3c2 f6g7 f3e5 b4c5 e5c4 a5a4 c4e3 a4b3 c2b3 g7h6 b3c3 h6f4 e3f5
info depth 16 score cp -16 nodes 199429 nps 1159470 time 172 pv e3d3 c5b4 d3c2 f6g7 f3h4 g7h6 h4f5 h6f4 f5e7 b4c5 e7f5 c5d5 c2d3 f4e5 d3e3 d5e6
I do see now I was using Leorik 2.0.2, so I'm not sure how much of a difference that makes? The time control I used for the game was 40 moves in 2 minutes, which I typically use when pitting Blunder against engines to get a feel for their style.
lithander wrote: Wed Jun 01, 2022 5:14 pm I hope there's nothing more serious going on here than just a lack of processing time.
I wish I would've paid more attention now, so this may just be hindsight bias, but I believe Leorik's blunder may have occurred towards the end of the 2 minutes, so it was in a bit of a time crunch perhaps? I suppose the place to start might be to investigate a bit more into Leorik's time management, which seemed to work well overall.
lithander wrote: Wed Jun 01, 2022 5:14 pm I hope there's nothing more serious going on here than just a lack of processing time. I might run a few fast games through an analysis engine to identify blunders like that and see how long it takes for the PV to switch to the correct line. Could lead to a valuable set of positions for comparing different versions of the engine. Time to depth measures just the performance but time-to-bestmove could be a useful metric to approximate playing-strength.
Regardless, that sounds like a good plan of action. Let me know how that goes!
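
The time-to-bestmove idea quoted above could be measured directly from UCI output like the log earlier in this post. The following is a rough sketch, assuming standard `info ... depth ... time ... pv ...` lines; `time_to_bestmove` is a hypothetical helper name, and the three sample lines are copied from the log above. It reports the first iteration from which the PV's first move stays on the expected move until the end of the output.

```python
def time_to_bestmove(uci_lines, best_move):
    """Return (depth, time_ms) from which the PV stays on best_move, else None."""
    settled = None
    for line in uci_lines:
        tokens = line.split()
        # Only consider 'info' lines that carry depth, time and a PV.
        if tokens[:1] != ["info"] or not {"depth", "time", "pv"} <= set(tokens):
            continue
        pv_start = tokens.index("pv") + 1
        if pv_start >= len(tokens):
            continue
        depth = int(tokens[tokens.index("depth") + 1])
        time_ms = int(tokens[tokens.index("time") + 1])
        if tokens[pv_start] != best_move:
            settled = None  # the PV moved away again, so start over
        elif settled is None:
            settled = (depth, time_ms)
    return settled


LOG = [
    "info depth 6 score cp -27 nodes 1895 nps 49868 time 38 pv f3e5 c5b4 e5c4 b4b3 c4a5 b3b4",
    "info depth 7 score cp -19 nodes 5350 nps 127380 time 42 pv e3d3 c5b4 d3c2 f6e7 f3e5 e7d6 e5d3",
    "info depth 8 score cp -19 nodes 7779 nps 176795 time 44 pv e3d3 c5b4 d3c2 f6e7 f3e5 e7d6 e5d3 b4a3",
]
print(time_to_bestmove(LOG, "e3d3"))  # (7, 42)
```

Resetting when the PV moves away again means a lucky early hit that is later abandoned does not count as having found the move.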
lithander
Posts: 915
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Devlog of Leorik

Post by lithander »

algerbrex wrote: Wed Jun 01, 2022 6:15 pm I do see now I was using Leorik 2.0.2, so I'm not sure how much of a difference that makes? The time control I used for the game was 40 moves in 2 minutes, which I typically use when pitting Blunder against engines to get a feel for their style.
[...]
I wish I would've paid more attention now, so this may just be hindsight bias, but I believe Leorik's blunder may have occurred towards the end of the 2 minutes, so it was in a bit of a time crunch perhaps?
2 minutes per 40 moves is 3 seconds per move on average right? And when the blunder happened in move 47 the clock was just refreshed 7 moves ago so there should have been no time pressure. When you play the move f3e5 the followup is f6e5 f4e5 c5d5 <something> d5e6 until the pawn is lost for good and the position should now be evaluated around 100cp for white. It gets only worse from there and at depth ~30 even the promoted queen should appear on the radar. With quiescence search the pawn-loss should be detectable as early as depth 5! And Leorik 1.0 (which does no risky prunings) indeed finds it at depth 5!
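
The time-control arithmetic at the start of the paragraph above, spelled out as a small sketch. It assumes a plain "N moves in T" control with the clock topped up after every N moves; both function names are invented for this illustration and say nothing about how Leorik actually budgets its time.

```python
def average_budget_ms(period_ms: int, moves_per_period: int) -> float:
    """Average thinking time per move for an 'N moves in T' time control."""
    return period_ms / moves_per_period


def moves_to_next_control(move_number: int, moves_per_period: int = 40) -> int:
    """Moves still to be played in the current period, including this one."""
    played_in_period = (move_number - 1) % moves_per_period
    return moves_per_period - played_in_period


print(average_budget_ms(120_000, 40))  # 3000.0 ms, i.e. 3 seconds per move
print(moves_to_next_control(47))       # 34 moves left before the next top-up
```

At move 47 the engine is only seven moves into a fresh two-minute period, which is why time pressure is an unlikely explanation.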

Leorik 2.x does null-move and all kinds of prunings but it also is fast enough to reach depth 30 on such a simple position within just a second. I think there's no excuse and I need to go hunt for a bug... :/
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Devlog of Leorik

Post by algerbrex »

lithander wrote: Wed Jun 01, 2022 7:16 pm 2 minutes per 40 moves is 3 seconds per move on average right? And when the blunder happened in move 47 the clock was just refreshed 7 moves ago so there should have been no time pressure.
Ah, that's silly of me, of course, there was no time pressure at move 47 :oops:
lithander wrote: Wed Jun 01, 2022 7:16 pm Leorik 2.x does null-move and all kinds of prunings but it also is fast enough to reach depth 30 on such a simple position within a second. I think there's no excuse and I need to go hunt for a bug... :/
I might take a look through your code as well, as this has made me curious too. Given everything you said, I can't think of a good reason why Leorik didn't see the right move in the game.

If it helps at all (probably not), the game was played from the start position as normal, with no opening used. So maybe it's reproducible to a degree?
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Devlog of Leorik - *New* Version 2.1

Post by Mike Sherwin »

lithander wrote: Tue May 31, 2022 2:06 am I've just released a "minor" new version that adds a pawn structure term to the evaluation.
https://github.com/lithander/Leorik/releases/tag/2.1

The pawn structure evaluation (including pawn hash table) turned out surprisingly simple yet effective! Or at least it feels like it's working well... I'm always struggling to judge when a feature is done enough that it's time to move on. I wish there was an easier way to assess how much the current implementation exhausts the theoretical potential. So (@all) what's your experience with pawn structure eval terms? How much Elo did you gain from adding it in your engines?
Mike Sherwin wrote: Sun Apr 17, 2022 11:20 pm This is probably the last version of Leorik that I'll be able to win against.
Leorik 2.1 is only about 50 Elo stronger so maybe you can still win against it? ;)
Big improvement in playing style! Much more human like. Needs pawn storm code. This was a very interesting game!! :)
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Devlog of Leorik

Post by dangi12012 »

lithander wrote: Wed Jun 01, 2022 7:16 pm
Hey, do you have a system set up yet that grades your commits to get the relative Elo change?
So you can know exactly how much stronger or weaker your engine is.

There are many tinkered solutions out there - but I think there is no open source yaml file that works for any git repo natively.
https://docs.github.com/en/actions/host ... ed-runners

If you are interested we can set this up for your repo!
The goal can be a normal workflow of commit etc. and asynchronously your CI pipeline will message the strength change data for that commit.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
lithander
Posts: 915
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Devlog of Leorik

Post by lithander »

dangi12012 wrote: Thu Jun 02, 2022 12:22 am Hey, do you have a system set up yet that grades your commits to get the relative Elo change?
So you can know exactly how much stronger or weaker your engine is.

There are many tinkered solutions out there - but I think there is no open source yaml file that works for any git repo natively.
https://docs.github.com/en/actions/host ... ed-runners

If you are interested we can set this up for your repo!
The goal can be a normal workflow of commit etc. and asynchronously your CI pipeline will message the strength change data for that commit.
Do you mean something like fishtest or openbench?

I always assumed my engine wouldn't be compliant with openbench's way of building engines from source because it needs the .NET toolchain. And to set up something like that myself I lack the dedicated hardware that would just wait for these kinds of tasks and supply a result in no time. So at the moment I'm setting the tests up on my personal computer when I'm not working or gaming.

Falsifying a small patch with the expected gain of just a few Elo takes a lot of compute, sadly.
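
To put a rough number on "a lot of compute": the sketch below estimates how many games are needed before the 95% error bar on a measured Elo difference shrinks to a given margin. It assumes two engines of nearly equal strength, a fixed draw ratio, independent games and a normal approximation; `games_needed` is a made-up helper for illustration, not something fishtest or OpenBench exposes.

```python
import math

# Slope of the logistic Elo curve at a 50% score, in Elo per unit of score.
ELO_PER_SCORE_AT_EVEN = 400.0 / (math.log(10) * 0.25)


def games_needed(elo_margin: float, draw_ratio: float = 0.5, z: float = 1.96) -> int:
    """Approximate games required for a +/- elo_margin interval near equality."""
    # Per-game score variance when wins and losses are equally likely.
    variance = (1.0 - draw_ratio) / 4.0
    # Translate the Elo margin into a margin on the average score.
    score_margin = elo_margin / ELO_PER_SCORE_AT_EVEN
    return math.ceil(variance * (z / score_margin) ** 2)


for margin in (20, 10, 5):
    print(margin, games_needed(margin))
```

Halving the margin roughly quadruples the number of games, which is why patches worth only a few Elo are so expensive to confirm or reject.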
lithander
Posts: 915
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Devlog of Leorik - *New* Version 2.1

Post by lithander »

Mike Sherwin wrote: Wed Jun 01, 2022 10:27 pm Big improvement in playing style! Much more human like. Needs pawn storm code. This was a very interesting game!! :)
Thanks for playing the new version and glad to hear I'm making some progress in the direction of style! :)

A pawn storm.... is that what you did on the King Side? Moving a phalanx of pawns together? Leorik should have countered that better, you mean?
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Devlog of Leorik

Post by dangi12012 »

lithander wrote: Thu Jun 02, 2022 9:41 am Do you mean something like fishtest or openbench?

I always assumed my engine wouldn't be compliant with openbench's way of building engines from source because it needs the .NET toolchain. And to set up something like that myself I lack the dedicated hardware that would just wait for these kinds of tasks and supply a result in no time. So at the moment I'm setting the tests up on my personal computer when I'm not working or gaming.
Much simpler. YAML files provide an easy way to set that up. You just check in a single file and GitHub Actions will start executing your continuous integration steps against workers. These workers can be self-hosted and I even have some spare machines.

Whether that's enough to get a small standard deviation of Elo per run remains to be seen, but running a tournament against itself in an 8x8 grid with 4x master and 4x commit should eliminate noise.

There has to be math already done somewhere that will give a mathematically sound confidence interval for tournament results.
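
One standard version of that math, as a minimal sketch: convert the match score to an Elo difference with the logistic model and put a normal-approximation interval around the score. This assumes independent games and is not what fishtest or OpenBench run (they use sequential tests, SPRT); the function names are invented for this example.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by an average score, logistic model."""
    score = min(max(score, 1e-6), 1.0 - 1e-6)  # avoid log of zero at 0% or 100%
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_confidence_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, low, high) for a match result; z=1.96 gives about 95%."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    margin = z * math.sqrt(variance / games)
    return (elo_from_score(score),
            elo_from_score(score - margin),
            elo_from_score(score + margin))


elo, low, high = elo_confidence_interval(wins=120, draws=200, losses=80)
print(f"{elo:+.1f} Elo, 95% interval [{low:+.1f}, {high:+.1f}]")
```

If the interval still contains zero after a run, the commit has not been shown to be stronger or weaker than master and more games are needed.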