Devlog of Leorik

Post by **lithander** » Sun Mar 26, 2023 12:44 pm

Modern Times wrote: ↑Sun Mar 26, 2023 9:01 am It is on the CCRL Blitz list already, just made the cut-off, however judging by the average opponent Elo it is much stronger than the tester anticipated! It will require more games against stronger opponents.

2950 Elo?! That's much, much stronger than all my tests indicated. The opponents were around 2800 Elo, which is what the results of the gauntlet I posted with the announcement above indicated. Also, in self-play 2.4 was only 70 Elo stronger than 2.3... I have no explanation! Let's see where it settles in the end.

Post by **Peperoni** » Sun Mar 26, 2023 4:40 pm

lithander wrote: ↑Sun Mar 26, 2023 12:44 pm
Modern Times wrote: ↑Sun Mar 26, 2023 9:01 am It is on the CCRL Blitz list already, just made the cut-off, however judging by the average opponent Elo it is much stronger than the tester anticipated! It will require more games against stronger opponents.
2950 Elo?! That's much, much stronger than all my tests indicated. The opponents were around 2800 Elo, which is what the results of the gauntlet I posted with the announcement above indicated. Also, in self-play 2.4 was only 70 Elo stronger than 2.3... I have no explanation! Let's see where it settles in the end.

I am curious, did you measure Elo differences based on the hardware Leorik runs on?

Post by **lithander** » Sun Mar 26, 2023 8:26 pm

Peperoni wrote: ↑Sun Mar 26, 2023 4:40 pm I am curious, did you measure Elo differences based on the hardware Leorik runs on?

Well, of course the engine is less strong on a Raspberry Pi than on my Ryzen 5900X where it searches 10x more nodes per second.

But when I set up a gauntlet to estimate Elo, all engines run on the same hardware. In my case that was an Intel 9700K with 8 cores. I ran 7 games in parallel and the PC wasn’t used for anything else during that time. And Leorik 2.4 did not surprise me in any way: it basically confirmed the +70 Elo that my self-play matches between 2.4 and 2.3 predicted. Also, I ran 15k games to limit the influence of pure luck. I was really pretty confident about expecting ~2800… I wonder if there’s anything wrong with my methodology.

But to be honest, +230 Elo sounds way too good to be true!

I can’t wait to see where it places after more games or in the 4040 list and on the CEGT. This has suddenly become super exciting, haha
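As a rough check on how far 15k games limit the influence of luck, the sketch below estimates an approximate 95% error margin for an evenly matched pairing under the usual logistic Elo model. The 40% draw ratio is an assumption chosen for illustration, not a figure from this thread, and `EloDiff` and `EloMargin95` are hypothetical helper names.

```csharp
using System;

// Logistic Elo model: a score fraction s corresponds to -400 * log10(1/s - 1) Elo.
static double EloDiff(double score) => -400.0 * Math.Log10(1.0 / score - 1.0);

// Approximate 95% error margin (in Elo) for an even match of `games` games.
// drawRatio is an ASSUMED input, not a value reported in the thread.
static double EloMargin95(int games, double drawRatio)
{
    // Around a 50% score every decisive game deviates by 0.5 and every draw by 0.
    double variancePerGame = 0.25 * (1.0 - drawRatio);
    double stdErr = Math.Sqrt(variancePerGame / games);
    return EloDiff(0.5 + 1.96 * stdErr);
}

foreach (int n in new[] { 100, 1000, 15000 })
    Console.WriteLine($"{n,6} games: +/- {EloMargin95(n, drawRatio: 0.4):F1} Elo");
```

Under these assumptions the margin for 15,000 games is only a few Elo, so sampling noise alone would not account for a gap of about 150 Elo between two estimates. The ratings assigned to the opponents carry their own uncertainty, which this sketch ignores.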

Post by **dangi12012** » Sun Mar 26, 2023 11:09 pm

It's really cool to see someone pushing C#.
I wonder if you would get a big performance gain by updating to .NET 8.0?

I see you are on net6.0 https://github.com/lithander/Leorik/blo ... ore.csproj

Post by **emadsen** » Mon Mar 27, 2023 5:45 am

lithander wrote: ↑Sat Mar 25, 2023 4:18 pm The playing strength of Version 2.4 should be around 2800 Elo. This estimate is based on a gauntlet against a few other engines at 40/30 time control. I hope it will not do much worse on the rating lists!

lithander wrote: ↑Sun Mar 26, 2023 12:44 pm 2950 Elo?! That's much much stronger than all my tests indicated.

That sounds great... and a bit confusing. How can one gauntlet yield 2800 Elo and another 2950 Elo? Either way, great progress!

Post by **Modern Times** » Mon Mar 27, 2023 10:00 am

Maybe it is a good blitzer, 2+1 is quite different to 40/30 (assuming that is 40 moves in 30 minutes and not 30 seconds)

Post by **lithander** » Mon Mar 27, 2023 11:15 am

Modern Times wrote: ↑Mon Mar 27, 2023 10:00 am Maybe it is a good blitzer, 2+1 is quite different to 40/30 (assuming that is 40 moves in 30 minutes and not 30 seconds)

In cutechess-cli, tc=40/30 means 40 moves in 30 seconds! So definitely blitz games. (Otherwise that 15k-game gauntlet would have taken several months on my hardware ^^)
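The "several months" remark can be checked with back-of-the-envelope arithmetic. The sketch below computes an upper bound on wall-clock time, assuming the `40/30` control repeats every 40 moves, every game lasts 60 moves per side (a made-up average), and 7 games run concurrently as in the gauntlet setup described earlier in the thread. `MaxGauntletDuration` and its parameters are hypothetical names, not part of cutechess-cli.

```csharp
using System;

// Upper bound on wall-clock time for a match under a repeating "moves/time" control.
// movesPerSide is an ASSUMED average game length, not a measured value.
static TimeSpan MaxGauntletDuration(int totalGames, int concurrency, int movesPerPeriod,
                                    double secondsPerPeriod, int movesPerSide)
{
    // Number of time periods each side enters (ceiling division).
    int periods = (movesPerSide + movesPerPeriod - 1) / movesPerPeriod;
    // Both sides may use their full allotment in every period.
    double secondsPerGame = 2.0 * periods * secondsPerPeriod;
    // Games run in batches of `concurrency` at a time.
    double batches = Math.Ceiling(totalGames / (double)concurrency);
    return TimeSpan.FromSeconds(batches * secondsPerGame);
}

TimeSpan blitz = MaxGauntletDuration(15000, 7, 40, 30, 60);         // 40 moves in 30 seconds
TimeSpan classical = MaxGauntletDuration(15000, 7, 40, 30 * 60, 60); // 40 moves in 30 minutes
Console.WriteLine($"40/30 (seconds): up to {blitz.TotalDays:F1} days");
Console.WriteLine($"40/30 (minutes): up to {classical.TotalDays:F1} days");
```

Under these assumptions the seconds-based control finishes in about three days, while the same gauntlet at 40 moves in 30 minutes would need roughly half a year. Real games usually end before both clocks are exhausted, so both numbers are upper bounds.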

Post by **Modern Times** » Mon Mar 27, 2023 1:00 pm

OK - I'm running some 40/15 (minutes) games with it. At that time control it is nowhere near 2950, it does seem to be around 2800 but early days.

Post by **Mike Sherwin** » Tue Mar 28, 2023 8:51 am

Have you tried LMP yet?

Leorik.Engine - Leorik-2.4 : 12.5/20 6-1-13 (=01=1=====1==1=11===) 63% +92

    // moves after the PV are searched with a null window around alpha, expecting the move to fail low
    if (remaining > 1 && playState.PlayedMoves > 1)
    {
        // LMP (late move pruning): if eval is at least 300 below alpha and PlayedMoves
        // exceeds 10 + remaining, stop searching the rest of this node's moves
        if (eval + 300 <= alpha && playState.PlayedMoves > 10 + remaining)
            break;

        int R = 0;
        // non-tactical late moves are searched at a reduced depth to make this test even faster!
        if (!inCheck && playState.Stage >= Stage.Quiets && !next.InCheck())
            R += 2;
        // when not in check, moves with a negative SEE score are reduced further
        if (!inCheck && _see.IsBad(current, ref move))
            R += 2;

        if (EvaluateNext(ply, remaining - R, alpha, alpha + 1, moveGen) <= alpha)
            continue;

        // ...but if, against expectations, the move does NOT fail low, we re-search it at full depth!
    }

Not enough games, I know, I know, but hey, that is your job.
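To make the pruning condition in the snippet above easier to reason about, here is a minimal sketch that isolates it as a pure function and prints when it would first trigger at various remaining depths. The function name and parameter names are hypothetical; the constants 300 and 10 are the ones used in the snippet, and the example assumes `eval` stays at least 300 below `alpha` for the whole node.

```csharp
using System;

// Pure-function version of the LMP test from the snippet.
// margin (300) and baseMoves (10) mirror its constants; all names here are hypothetical.
static bool ShouldPruneLateMoves(int eval, int alpha, int playedMoves, int remainingDepth,
                                 int margin = 300, int baseMoves = 10)
    => eval + margin <= alpha && playedMoves > baseMoves + remainingDepth;

// For each remaining depth, find the first move count at which the rest of the moves are cut.
for (int remaining = 2; remaining <= 6; remaining++)
{
    int played = 1;
    while (!ShouldPruneLateMoves(0, 300, played, remaining)) // eval = 0, alpha = 300
        played++;
    Console.WriteLine($"remaining={remaining}: pruning starts at PlayedMoves={played}");
}
```

Because the move-count threshold grows with the remaining depth, nodes close to the horizon are pruned earlier than nodes near the root. Both the margin and the base move count are natural candidates for tuning.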

Post by **Mike Sherwin** » Tue Mar 28, 2023 4:58 pm

I let it run for 80 more games. The final result was still good. There should be room for improvement.

Leorik.Engine - Leorik-2.4 : 53.5/100 25-18-57 (=01=1=====1==1=11===0010=1=1===01===01==0==110=01=11=0=1=1==000=1=1=1==1=0=0=0=======110=====1===0==) 54% +28
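For context on how much weight these two results can carry, the sketch below converts a win-loss-draw record into an Elo difference with an approximate 95% interval, using the standard logistic Elo model and a normal approximation for the score. The record order (wins, losses, draws) is inferred from the totals in the result lines, and `EloDiff`, `ScoreStats`, and `Report` are hypothetical helper names.

```csharp
using System;

// Logistic Elo model: a score fraction s corresponds to -400 * log10(1/s - 1) Elo.
static double EloDiff(double score) => -400.0 * Math.Log10(1.0 / score - 1.0);

// Score fraction and its standard error from a win/loss/draw record.
static (double Score, double StdErr) ScoreStats(int wins, int losses, int draws)
{
    int n = wins + losses + draws;
    double score = (wins + 0.5 * draws) / n;
    // Per-game variance of the result (1, 0.5 or 0) around the mean score.
    double variance = (wins * Math.Pow(1.0 - score, 2)
                     + draws * Math.Pow(0.5 - score, 2)
                     + losses * Math.Pow(0.0 - score, 2)) / n;
    return (score, Math.Sqrt(variance / n));
}

static void Report(string label, int wins, int losses, int draws)
{
    var (score, stdErr) = ScoreStats(wins, losses, draws);
    // Approximate 95% interval: score +/- 1.96 standard errors, mapped to Elo.
    double lo = EloDiff(score - 1.96 * stdErr);
    double hi = EloDiff(score + 1.96 * stdErr);
    Console.WriteLine($"{label}: {EloDiff(score):F0} Elo (approx. 95% interval {lo:F0} to {hi:F0})");
}

Report("20 games", 6, 1, 13);
Report("100 games", 25, 18, 57);
```

Plugging the rounded percentages from the result lines (63% and 54%) into the same formula gives about +92 and +28, matching the reported figures. The interval for the 100-game run still extends below zero, which is consistent with the remark that more games are needed.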
