Devlog of Leorik

mvanthoor · Post by **mvanthoor** » Sun Feb 06, 2022 10:58 am

lithander wrote: ↑Sat Feb 05, 2022 11:22 pm ...

I've run my own test during the night at the 10s+0.1s I normally use. It seems Rustic 4 and Leorik are now in the same ballpark:

Score of Rustic Alpha 3.43.100 vs Leorik 0.2.3: 715 - 778 - 507 [0.484]
... Rustic Alpha 3.43.100 playing White: 396 - 360 - 244 [0.518] 1000
... Rustic Alpha 3.43.100 playing Black: 319 - 418 - 263 [0.451] 1000
... White vs Black: 814 - 679 - 507 [0.534] 2000
Elo difference: -10.9 +/- 13.1, LOS: 5.2 %, DrawRatio: 25.4 %
2000 of 2000 games finished.

With regard to speed, they also seem to be similar, but I'm seeing a few things I can't explain. See the time-to-depth test below:

Rustic:

position fen r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - 0 1
go depth 10
info score cp 72 depth 1 seldepth 11 time 0 nodes 1032 nps 0 pv e2a6 b4c3
info score cp 72 depth 2 seldepth 11 time 0 nodes 2420 nps 0 pv e2a6 b4c3
info score cp 44 depth 3 seldepth 17 time 1 nodes 10792 nps 10792000 pv d5e6 e7e6 e2a6 e6e5
info score cp 44 depth 4 seldepth 19 time 5 nodes 37884 nps 7576800 pv d5e6 e7e6 e2a6 e6e5
info score cp 20 depth 5 seldepth 19 time 23 nodes 185680 nps 8073043 hashfull 2 pv e2a6 e6d5 c3d5 b6d5 e4d5 e7e5
info score cp 2 depth 6 seldepth 20 time 98 nodes 645425 nps 6585969 hashfull 15 pv d5e6 e7e6 e2a6 h3g2 f3g2 e6e5
info score cp -9 depth 7 seldepth 22 time 486 nodes 3139224 nps 6459309 hashfull 71 pv e2a6 e6d5 c3d5 f6d5 e5d3 f7f5 e1g1 f5e4
info score cp -39 depth 8 seldepth 23 time 1822 nodes 10474182 nps 5748728 hashfull 279 pv e2a6 e6d5 e1g1 h3g2 f3g2 e7e5 c3d5 f6d5 e4d5 e5d5 d2b4 d5g2 g1g2 g7b2
info score cp -26 depth 9 seldepth 25 time 5927 nodes 34466920 nps 5815239 hashfull 716 pv e2a6 e6d5 c3d5 b6d5 a6b7 h3g2 f3g2 a8b8 e4d5 b8b7
info score cp -37 depth 10 seldepth 27 time 24585 nodes 135415089 nps 5508037 hashfull 999 pv e2a6 e6d5 c3b5 d5e4 b5c7 e8d8 f3g3 h3g2 h1g1 f6h5 d2b4 h5g3 b4e7 d8c7 h2g3 g7e5 g1g2 e5b2
bestmove e2a6
quit

Leorik:

position fen r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - 0 1
go depth 10
info string Search scheduled to take 715827857ms!
info depth 1 score cp 39 nodes 8370 nps 139500 time 60 pv d5e6
info depth 2 score cp 39 nodes 12941 nps 142208 time 91 pv d5e6 e7e6
info depth 3 score cp -1 nodes 25405 nps 267421 time 95 pv d5e6 e7e6 e2a6
info depth 4 score cp -1 nodes 57046 nps 538169 time 106 pv d5e6 e7e6 e2a6 e6e5
info depth 5 score cp -31 nodes 299820 nps 1774082 time 169 pv e2a6 b4c3 d2c3 e6d5 e5g4
info depth 6 score cp -31 nodes 848063 nps 2546735 time 333 pv e2a6 b4c3 d2c3 e6d5 e5g4 h3g2
info depth 7 score cp -46 nodes 4096299 nps 5520618 time 742 pv d5e6 e7e6 e2a6 h3g2 f3g2 e6e5 c3b5
info depth 8 score cp -47 nodes 15748792 nps 6483652 time 2429 pv e2a6 e6d5 c3b5 h3g2 b5c7 e8d8
info depth 9 score cp -34 nodes 45256103 nps 7416601 time 6102 pv e2a6 e6d5 c3b5 h3g2 b5c7 e8d8
info depth 10 score cp -47 nodes 159081047 nps 6210949 time 25613 pv e2a6 e6d5
bestmove e2a6
quit

Some observations:
- It seems Rustic is faster with regard to time-to-depth.
- It looks like this is caused by the fact that Rustic searches less nodes.
- Still, Leorik often outcalculates Rustic by 1-2 play in the late middle-game, and sometimes even 3 ply in the end-game
- If an engine searches 1-3 ply deeper than another engine this should give you MUCH more strength than +10 Elo. I'll try to run a test with depth settings of 6 and 8 or something like that with Rustic, and see what the difference is. (The higher the depth, the less the Elo difference will be, obviously.)
- What's up with Leorik's strange NPS that needs to "rev up" first? It starts out really strange and at some point exceeds Rustic's nps, which seems to stabilize at 5.5M nodes/sec at higher depths. (Longer time to search, so the average per position is more reliable.)
- What abouv Leorik's super-short PV?

You did help me to discover a massive bug though. Refactoring the printing routines (adding the Display trait in many places) left me with an extra space after "pv". I'll have to fix that ASAP. Printing that extra space costs time you know.

Does Leorik 0.2.3 which you have sent me have the staged move generation that you posted earlier in the thread?

lithander · Post by **lithander** » Sun Feb 06, 2022 10:48 pm

mvanthoor wrote: ↑Sun Feb 06, 2022 10:58 am Some observations:
- It seems Rustic is faster with regard to time-to-depth.
- It looks like this is caused by the fact that Rustic searches less nodes.

Does not surprise me. You have PVS and Killer moves which should allow you to search less nodes than Leorik to reach the same depth.

mvanthoor wrote: ↑Sun Feb 06, 2022 10:58 am - Still, Leorik often outcalculates Rustic by 1-2 play in the late middle-game, and sometimes even 3 ply in the end-game
- If an engine searches 1-3 ply deeper than another engine this should give you MUCH more strength than +10 Elo. I'll try to run a test with depth settings of 6 and 8 or something like that with Rustic, and see what the difference is. (The higher the depth, the less the Elo difference will be, obviously.)

If your engine generally reaches a depth a little faster then Leroik (as you said above) and then Leorik has a depth advantage in the endgame it might be that the game is already decided. Maybe it's just too late to capitalize from that advantage. Frankly, you have more features (Killer & PVS) and that there's even +10 Elo in favor for Leoric comes at a surprise to me.

mvanthoor wrote: ↑Sun Feb 06, 2022 10:58 am - What's up with Leorik's strange NPS that needs to "rev up" first? It starts out really strange and at some point exceeds Rustic's nps, which seems to stabilize at 5.5M nodes/sec at higher depths. (Longer time to search, so the average per position is more reliable.)

Good catch. Rustics first iterations take a 0ms, 0ms, 1ms, 4ms, 18ms, 75ms. And you reach depth 6 in ~100ms. Leorik reports the iterations to take 60ms, 31ms, 4ms, 10ms, 63ms... so there seems to be a huge overhead slowing down the first two iterations. It's probably the JIT compiler of the .NET runtime. If you type "ucinewgame" to clear the TT and then go again you get much more reasonable results for Leorik too:

Code: Select all

position fen r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - 0 1
ucinewgame
go
info string Search scheduled to take 715827857ms!
info depth 1 score cp 39 nodes 8370 nps 8370000 time 1 pv d5e6
info depth 2 score cp 39 nodes 12941 nps 6470500 time 2 pv d5e6 e7e6
info depth 3 score cp -1 nodes 25405 nps 8468333 time 3 pv d5e6 e7e6 e2a6
info depth 4 score cp -1 nodes 57046 nps 7130750 time 8 pv d5e6 e7e6 e2a6 e6e5
info depth 5 score cp -31 nodes 299820 nps 8328333 time 36 pv e2a6 b4c3 d2c3 e6d5 e5g4
info depth 6 score cp -31 nodes 848063 nps 7126579 time 119 pv e2a6 b4c3 d2c3 e6d5 e5g4 h3g2
info depth 7 score cp -46 nodes 4096299 nps 8079485 time 507 pv d5e6 e7e6 e2a6 h3g2 f3g2 e6e5 c3b5
info depth 8 score cp -47 nodes 15748792 nps 7184667 time 2192 pv e2a6 e6d5 c3b5 h3g2 b5c7 e8d8
info depth 9 score cp -34 nodes 45256103 nps 7792028 time 5808 pv e2a6 e6d5 c3b5 h3g2 b5c7 e8d8
info depth 10 score cp -47 nodes 159081047 nps 6172870 time 25771 pv e2a6 e6d5

mvanthoor wrote: ↑Sun Feb 06, 2022 10:58 am - What abouv Leorik's super-short PV?

I don't keep track of the PV yet and just really know the best move and try to restore the rest of the sequence from the TT. Sometimes it fails.

...did I mention the engine is heavy WIP?

Just kidding. I'm really grateful for your critical eye.

And I'll have to think about what I do about the first search wasting 100ms of the clock on behind-the-scenes things. If I want to use fast time controls for testing ELO improvements even such a small amount matters.

lithander · Post by **lithander** » Sun Feb 06, 2022 10:57 pm

I forgot to answer the last question and was to stupid to "edit" and made a new post instead...

mvanthoor wrote: ↑Sun Feb 06, 2022 10:58 am Does Leorik 0.2.3 which you have sent me have the staged move generation that you posted earlier in the thread?

Yes! My move generator generates captures and quiets seperately and so implementing those two stages was really a natural choice and I did it right from the start.

mvanthoor · Post by **mvanthoor** » Mon Feb 07, 2022 1:02 am

lithander wrote: ↑Sun Feb 06, 2022 10:48 pm ...did I mention the engine is heavy WIP? Just kidding. I'm really grateful for your critical eye.

I hope it is useful. I'm just mentioning things that seem off to me, because you may not have noticed them (yet) and could be a bug or a problem that's going to be hard to track and fix later. If it is as you expected, everything's fine

I still marvel at the 8M NPS. I just can't believe, everything else being equal, that an engine in C# could be as fast as an engine in Rust. But I'm also keep forgetting that you have CPU that is a few years newer than mine. On my computer where I can test both engines side by side, Rustic seems to still be faster, but only by around 10-12%, and for some reason it also depends on the position.

lithander wrote: ↑Sun Feb 06, 2022 10:57 pm Yes! My move generator generates captures and quiets seperately and so implementing those two stages was really a natural choice and I did it right from the start.

Do you also have a stage for the TT move already? I think that this feature can compensate for Killers and PVS. Both of those features together add around 89 Elo in self-play (60 in gauntlets). These features reduce the number of nodes having to be searched, but staged move generation just simply saves on even generating the nodes, let alone cutting them out through searching. I would not be surprised if your +10 Elo comes from this feature. You could put PVS and Killers in, then disable staged move generation and test again... or just disable it now, test again, and you'll know how much Elo this feature gives you.

I'm amazed by the numbers you stated for the TT. 370 Elo in total, for TT-move ordering and TT-cuts? That's nuts. I'm seeing 180, which is more in line with other engines. (We had some threads about this a year ago.)

PS: you don't clear the board on ucinewgame? If you set up the position from the start as I do and then play the move, instead of extracting the last move and playing only that on the board, you don't even need ucinewgame; but AFAIK, that command should reset the baord. (Jon Dart said in Rustic's thread that some GUI's omit this command, which is really bad in Rustic's case, because then the TT will never be cleared on new games. It will thus be filled up with high-depth moves and lose 2 out of 3 slots. That will cost the engine some Elo in GUI's that don't send this command. Maybe I should clear the TT as soon as the game ends or something.)

lithander · Post by **lithander** » Mon Feb 07, 2022 12:54 pm

mvanthoor wrote: ↑Mon Feb 07, 2022 1:02 am I just can't believe, everything else being equal, that an engine in C# could be as fast as an engine in Rust.

Well, all else is not equal and never will be, I fear. I actually did the QBB-perft port to measure the speed potential of C# in an "all-else-being-qual" kinda test and C# reached 80% of the performance of C/C++. It was giving me faith enough to use C# again for Leorik. But Leorik is not a Rustic port - drawing conclusions about the speed of the programming language seems impossible.

mvanthoor wrote: ↑Mon Feb 07, 2022 1:02 am Do you also have a stage for the TT move already? [...] You could put PVS and Killers in, then disable staged move generation and test again... or just disable it now, test again, and you'll know how much Elo this feature gives you.

It makes no sense for me to disable the feature because it's not a feature. It's a fundamental decision like whether you use Make/Unmake or Copy/Make... how much Elo I lose by artificially making my engine slower is not something I'm really interested in, sorry. At what point are you going to *add* staged move generation to Rustic, though?

mvanthoor wrote: ↑Mon Feb 07, 2022 1:02 am I'm amazed by the numbers you stated for the TT. 370 Elo in total, for TT-move ordering and TT-cuts? That's nuts. I'm seeing 180, which is more in line with other engines. (We had some threads about this a year ago.)

Before having the TT my engine couldn't reason about repetitions. If you have other mechanisms for that in place already the TT will probably give much less Elo! That's the most likely explanation I think.

mvanthoor wrote: ↑Mon Feb 07, 2022 1:02 am PS: you don't clear the board on ucinewgame?

I just clear the TT. If the user wants to set the start position I expect him to enter "position startpos"... is that wrong? I find the spec not clear on that.
My TT is written in a way that it's never really filling up. You have two bucket options and the position you store will always be stored in either one of the two. So omitting the ucinewgame isn't really a problem. But I think it's usefull to clear the TT so you can reproduce a search from scratch without restarting the engine.

mvanthoor · Post by **mvanthoor** » Mon Feb 07, 2022 5:10 pm

lithander wrote: ↑Mon Feb 07, 2022 12:54 pm Well, all else is not equal and never will be, I fear. I actually did the QBB-perft port to measure the speed potential of C# in an "all-else-being-qual" kinda test and C# reached 80% of the performance of C/C++. It was giving me faith enough to use C# again for Leorik. But Leorik is not a Rustic port - drawing conclusions about the speed of the programming language seems impossible.

80% of C/C++ is very decent speed. C# has sped up a lot over the years I see. It's fast enough to use where-ever you want, except systems programming and drivers.

It makes no sense for me to disable the feature because it's not a feature. It's a fundamental decision like whether you use Make/Unmake or Copy/Make... how much Elo I lose by artificially making my engine slower is not something I'm really interested in, sorry. At what point are you going to *add* staged move generation to Rustic, though?

Rustic has had move generation stages since Alpha 1 because you have to, for QSearch. Staged move generation will be in version 5 or 6. Maybe I'll put it in with Null move pruning and static null move pruning in version 5; otherwise it'll be the main feature for version 6.

Before having the TT my engine couldn't reason about repetitions. If you have other mechanisms for that in place already the TT will probably give much less Elo! That's the most likely explanation I think.

Yes; not recognizing repetitions will draw you many games you should have won, and lose you many games you could have drawn. I don't know how much Elo you'd lose with not having repetition detection. (Rustic uses the old-fashioned way of looking through the game history backwards to find repetitions.)

I just clear the TT. If the user wants to set the start position I expect him to enter "position startpos"... is that wrong? I find the spec not clear on that.

To be honest, I don't really know. I've always understood "ucinewgame" to start a completely new game, i.e. reset the engine to the state in which it is after just starting up. Therefore I clear the TT, throw away the board, and create a new one with the starting position, because that is the situation the engine is in after starting up. If the user wants something else after that, I expect to receive "position startpos moves ..." or "position fen ... moves ...".

My TT is written in a way that it's never really filling up. You have two bucket options and the position you store will always be stored in either one of the two. So omitting the ucinewgame isn't really a problem. But I think it's usefull to clear the TT so you can reproduce a search from scratch without restarting the engine.

OK; two-bucket system with the move going into either one whatever happens. At some point I'll have to take a look into my TT implementation again. I have 3 buckets now, because it is slightly faster than 4. The move is always stored in the bucket having the lowest depth; but if the other buckets are all high depth, you basically have an always-replace TT. (Same is true for your implementation if one of your buckets is a really high depth.)

lithander · Post by **lithander** » Mon Feb 07, 2022 5:39 pm

mvanthoor wrote: ↑Mon Feb 07, 2022 5:10 pm The move is always stored in the bucket having the lowest depth; but if the other buckets are all high depth, you basically have an always-replace TT. Same is true for your implementation if one of your buckets is a really high depth.

No, I use a combination of age and depth to determine the slot. Table hits reset the age to 0. But if the position is from an old game eventually it's age will overrule the depth and it will get replaced.

Code: Select all

static int Index(in ulong hash)
{
	int index = (int)(hash % (ulong)_table.Length);
	ref HashEntry e0 = ref _table[index];
	ref HashEntry e1 = ref _table[index ^ 1];

	if (e0.Hash == hash)
		return index;

	if (e1.Hash == hash)
		return index ^ 1;

	//raise age of both and choose the older, shallower entry!
	return (++e0.Age - e0.Depth) > (++e1.Age - e1.Depth) ? index : index ^ 1;
}

lithander · Post by **lithander** » Thu Feb 17, 2022 2:01 pm

lithander wrote: ↑Tue Jan 04, 2022 3:39 pm So I think I'm ready to leave the perft-only state behind and reimplement the features of MinimalChess step by step. Hopefully for version 1.0 of Leorik I'll have an engine that plays like it's predecessor but several times faster.

This would be where I plan the first public release but it's still a long way to go.

I want to announce a change of plans. There will be a first public release of Leorik in the next couple of days even though it doesn't yet play as strong as MMC because a lot of features are still missing. And here's why:

1.) Adding Killers and PVS since the last progress update was initially done in an evening. And then I lost myself in the details. I would like to make sure that the current version doesn't suffer from any major issues or bugs before I start to pile up more features. I really hope that the CCRL guys will pick it up so it can receive a ranking in the blitz list so I can see if any of the engines around it achieved their result with less features. In that case I might have a problem.

2.) Another reason is that due to my daily work on the engine I'm participating in the forum much more again. And it just sucks if you can't link the source code of your engine when talking about implementation details.

3.) Lastly I think what I currently have is a well rounded package. It includes what I think can be considered the essential features of a chess engine but not much more.
Currently there is no pruning that would change the search result from that of a plain min/max search. Every mate puzzle you give Leorik now will receive the correct result and line as the first found mate.

You might have to wait till the end of the universe though...

But I think that's quite a cool state for an engine to be in and as soon as I implement the next obvious feature which is null-move pruning that property is lost.

Algorithms & Features:

Written from scratch in C#
Board representation based on bitboards with Copy/Make
Pseudo-legal move generator (with a somewhat novel slider generation approach)
Simple evaluation based on tapered, tuned PSTs, only
Incremental updates of evaluation and zobrist hash
Two-bucket Transposition Table.
Staged move generation
MVV-LVA sorted captures
Killers
PVS search
Quiescence search
Supports the same subset of the UCI protocol as MinimalChess

I have no idea about the Elo level of the upcoming version. Can you refer me to engines that have roughly the same feature set as mine, so that I can setup a gauntlet?

eligolf · Post by **eligolf** » Thu Feb 17, 2022 2:22 pm

Very cool! I agree that the state you are in now is my favourite because you can somewhat understand why the engine is doing as it should. With other prunings such as null moves it is so hard to find bugs and understand if things get better or just breaks. Looking forward to try it out!

mvanthoor · Post by **mvanthoor** » Thu Feb 17, 2022 4:30 pm

lithander wrote: ↑Thu Feb 17, 2022 2:01 pm I want to announce a change of plans. There will be a first public release of Leorik in the next couple of days even though it doesn't yet play as strong as MMC because a lot of features are still missing. And here's why:

1.) Adding Killers and PVS since the last progress update was initially done in an evening. And then I lost myself in the details. I would like to make sure that the current version doesn't suffer from any major issues or bugs before I start to pile up more features. I really hope that the CCRL guys will pick it up so it can receive a ranking in the blitz list so I can see if any of the engines around it achieved their result with less features. In that case I might have a problem.

Good idea. That's the reason why I decided to release even the lowest of the low baseline version to see how strong I could get it to play with a minimal feature-set.

2.) Another reason is that due to my daily work on the engine I'm participating in the forum much more again. And it just sucks if you can't link the source code of your engine when talking about implementation details.

Sure is; if people are talking about "their engine" and are asking questions, I always automatically look in their sig to see if some code and/or documentation is linked there. If it isn't, there's a high chance I'm ignoring the topic because I can't comment on code that isn't available.

3.) Lastly I think what I currently have is a well rounded package. It includes what I think can be considered the essential features of a chess engine but not much more.
Currently there is no pruning that would change the search result from that of a plain min/max search. Every mate puzzle you give Leorik now will receive the correct result and line as the first found mate.

Nice. So you're basically at the point of Rustic Alpha 3 + staged move generation + your own tuned tables. If so, I'd expect the engine to score in CCRL:

2160 (+/- 30) + 40 (staged move generation) + 20 (max? better tuned tables) = ~2220 +/- 30. Take a bit of variance into account due to playing a gauntlet against 20-30 different opponents (with probably more features for most of them), so my expectation would be 2170 - 2230 for this version. I'd be surprised to see it lower or higher than this.

You might have to wait till the end of the universe though...

Hey! I'm already on that track.

I have no idea about the Elo level of the upcoming version. Can you refer me to engines that have roughly the same feature set as mine, so that I can setup a gauntlet?

See above

I'm surprised you're not slowed down by the copy/make approach instead of make/unmake. How can copying the entire board and accompanying state be as fast as just making and unmaking moves?

Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik