Devlog of Leorik

lithander · Post by **lithander** » Tue Feb 01, 2022 1:52 pm

Not really a progress update but I did an experiment that I wanted to share here.

My move-gen is pseudo legal which means that each move is technically possible to be played but some are creating a position where the moving side is in check afterwards. To detect those moves I just play each move first and then see if the king is attacked and if it is I'll just skip calling Evaluate on the resulting position. And I'm also not counting the position as a node visited even though a lot time has already been spent on it.

So half-jokingly I thought if I would count them too my nodes per second would appear much higher! You know... for bragging rights. But this gave me an idea... wouldn't it be fair to count the moves if I then also played them normally? What would happen? In Evaluation - regardless of the depth - captures would be generated for each position (Qsearch) and for an illegal position one capture is going to be targeting the King. The move ordering would pick this move to be played first. Now in the resulting position a king would be missing which is trivial to check by just logical-anding together two bitboards and if there's no bit remaining (KINGS & WHITE == 0) the side has no king. In that case I was returning the checkmate score and the whole branch would fail beta for the side capturing the king. And... everything still works!

It's a lot of extra work to play out these captures. But only after an illegal move was made, in all other cases it's saving the effort of doing the legality check. I could have imagined this to be winning bargain.

But alas, it wasn't.

Then I thought about a small optimization: Why even play the king capture? It's cheap to know that the move is targeting a king without playing it. So I was just returning the appropriate checkmate-score immediately. Now the tradeoff is, that I save the legality check on *all* moves but on some moves (that are illegal) you waste time: you play it, pass it on to the next Eval() step where you have to generate all the captures, look once at the movelist (pick the best move) and only then you realize that this position was in fact the result of an illegal move.

Code: Select all

Searching IterativeSearch(7)
Searched 300 positions with IterativeSearch(7)
3868798K nodes visited. Took 303,228 seconds!
12758K NPS.
Best move found in 255 / 300 positions!

Searching IterativeSearchNext(7)
Searched 300 positions with IterativeSearchNext(7)
6053496K nodes visited. Took 366,23 seconds!
16529K NPS.
Best move found in 254 / 300 positions!

The second result is the version playing and then recovering from illegal moves. The nps went up but other than that it's sadly not better!

Maybe for other engines it could turn out differently? I just thought it was interesting enough to share it here.

Mike Sherwin · Post by **Mike Sherwin** » Wed Feb 02, 2022 5:50 am

Too many words for me to keep in memory. The way I do this in RomiChess and Bricabrac is while generating moves all captures are accumulated in one u64. After all moves have been generated it becomes as easy as if (captures & king[1 - stm]) return ILLEGAL;. Anyway:

There is a hundred game match going on between Leo and Bric. Bric has a slight lead.

Bricabrac - Leorik.Engine : 18.5/34 13-10-11 (==10110=01==0110===1==111=01001100) 54% +28

A two piece sacrifice.

[Event "Leorik"] 
[Site "DESKTOP-HFVHK2B"] 
[Date "2022.02.01"] 
[Round "31"] 
[White "Bricabrac"] 
[Black "Leorik.Engine"] 
[Result "1-0"] 
[BlackElo "2200"] 
[ECO "C45"] 
[Opening "Scotch"] 
[Time "21:08:41"] 
[Variation "Steinitz Variation"] 
[WhiteElo "2200"] 
[TimeControl "10+4"] 
[Termination "normal"] 
[PlyCount "55"] 
[WhiteType "human"] 
[BlackType "human"] 
 
1. e4 e5 2. Nf3 Nc6 3. d4 exd4 4. Nxd4 Qh4 {(Qd8-h4) +0.21/8 3} 5. Be3 
{(c1e3 g8f6 d4c6 d7c6 b1d2 f6e4 g2g3 h4e7 f1d3 e4d2 e1d2) +0.12/11 6} Qxe4 
{(Qh4xe4) +0.48/8 4} 6. Nd2 {(b1d2 e4e5 c2c3 f8c5 d4c6 d7c6 d2c4 e5e4 c4d2 
e4e5) 0.00/10 5} Qe7 {(Qe4-e7) -0.11/8 4} 7. Nxc6 {(d4c6 d7c6 f1d3 e7b4 
e1g1 b4b2 d2c4 b2f6 d1h5 c8e6) +0.31/10 5} dxc6 {(d7xc6) +3.48/8 3} 8. Bc4 
{(f1c4 e7e5 d1b1 g8f6 e1g1 c8g4 h2h3 g4d7 h3h4 f8e7) -0.49/10 5} Be6 
{(Bc8-e6) +0.25/7 4} 9. O-O {(e1g1 e6c4 d2c4 e7e4 c4a5 e4b4 d1e1 b4e1 a1e1 
f8b4) -0.07/10 4} O-O-O {+0.25/8 2 (O-O-O)} 10. Bxe6+ {(c4e6 e7e6 e3a7 g8f6 
f1e1 e6d5 a7e3 f8d6 c2c4 d5e5) +0.11/10 4} Qxe6 {(Qe7xe6) +3.48/8 2} 11. 
Bxa7 {(e3a7 g8f6 f1e1 e6f5 a7e3 c6c5 d1b1 d8d5 d2c4 f6e4) -0.05/10 4} b6 
{(b7-b6) +1.98/8 8} 12. Re1 {(f1e1 e6h6 a7b6 h6d2 d1g4 d2d7 g4d7 d8d7 b6e3 
g8f6 e3g5) -0.99/11 4} Qg6 {(Qe6-g6) +2.00/8 2} 13. Qe2 {(d1e2 c8b7 a7b6 
b7b6 d2c4 b6b7 c4e5 g6e6 e2h5 g8h6) -0.92/10 4} Kb7 {(Kc8-b7) +1.87/8 3} 
14. Bxb6 {(a7b6 b7b6 d2c4 b6b7 c4e5 g6e6 e2h5 g8h6 h5f3 d8d2) -0.93/10 4} 
cxb6 {(c7xb6) +3.27/8 4} 15. Nf3 {(d2f3 f8c5 f3e5 g6f5 g2g4 f5f4 a1d1 d8d1 
e1d1 g8e7) -1.16/10 4} Bd6 {(Bf8-d6) +0.31/8 13} 16. Rad1 {(a1d1 g6f5 f3e5 
g8e7 e2e4 h8f8 d1d3 f8e8 d3g3 f5e4) -1.31/10 4} Nf6 {(Ng8-f6) +0.51/7 10} 
17. Rd4 {(d1d4 h8e8 e2d1 e8e1 f3e1 f6e4 e1f3 d8e8 f3d2 e4d2) -2.08/10 4} 
Rhe8 {(Rh8-e8) +0.67/7 2} 18. Qd1 {(e2d1 e8e1 f3e1 f6e4 e1f3 d8e8 f3d2 e4d2 
d1d2 d6e5) -2.04/10 4} Rxe1+ {(Re8xe1+) +0.15/8 2} 19. Nxe1 {(f3e1 b7c7 
e1f3 c6c5 d4d2 b6b5 d1e2 c5c4 e2e3 c7b7 f3e5 g6h5 e3e2 h5e2) -2.23/14 4} 
Ne4 {(Nf6-e4) -0.12/8 5} 20. Qd3 {(d1d3 d8e8 e1f3 b6b5 f3d2 f7f5 d3f3 e4g5 
f3d1 f5f4) -2.03/10 4} Re8 {(Rd8-e8) +0.03/7 6} 21. Nf3 {(e1f3 b6b5 f3h4 
g6g5 g2g3 g5e5 d4e4 e5e4 d3d6 e4c2) -2.08/10 4} Bc5 {(Bd6-c5) +1.00/7 1} 
22. Ne5 {(f3e5 g6f6 d4d7 b7b8 e5g4 c5f2 g1f1 f6f5 g4f2 f5f2) -2.68/10 4} 
Rxe5 {(Re8xe5) +2.80/9 4} 23. Rd7+ {(d4d7 b7b8 d7d8 b8a7 d8d7 a7b8 d3a6 
a8b8 a6b7) 0.00/11 4} Kc8 {(Kb7-c8) -0.05/8 2} 24. Rd8+ {(d7d8 c8b7 d3d7 
b7a6 d8a8 a6b5 d7d3 b5b4 a2a3) +M5/11 4} Kb7 {(Kc8-b7) -M4/9 3} 25. Qd7+ 
{(d3d7 b7a6 d8a8 a6b5 d7d3 b5b4 a2a3 a8a7) +M4/10 4} Ka6 {(Kb7-a6) -M3/9 7} 
26. Ra8+ {(d8a8 a6b5 d7d3 b5b4 a2a3) +M3/11 3} Kb5 {(Ka6-b5) -M2/8 3} 27. 
Qd3+ {(d7d3 b5b4 a2a3) +M2/11 4} Kb4 {(Kb5-b4) -M1/9 2} 28. Qb3# {(d3b3 
b4c4) +M1/11 3} 1-0

lithander · Post by **lithander** » Wed Feb 02, 2022 10:57 am

Mike Sherwin wrote: ↑Wed Feb 02, 2022 5:50 am Too many words for me to keep in memory. The way I do this in RomiChess and Bricabrac is while generating moves all captures are accumulated in one u64. After all moves have been generated it becomes as easy as if (captures & king[1 - stm]) return ILLEGAL;

Sorry for being so verbose. I'll try to keep my posts shorter in the future.

Do you use the 'captures' bitboard for anything else?

Mike Sherwin wrote: ↑Wed Feb 02, 2022 5:50 am A two piece sacrifice.

It's astonishing to me how well an engine plays already that has no features other than mvv-lva move sorting, alpha/beta (q)search and PSTs. After adding a TT it would probably see something like the forced mate sequence it was getting caught in, too.

Mike Sherwin · Post by **Mike Sherwin** » Wed Feb 02, 2022 1:02 pm

lithander wrote: ↑Wed Feb 02, 2022 10:57 am
Mike Sherwin wrote: ↑Wed Feb 02, 2022 5:50 am Too many words for me to keep in memory. The way I do this in RomiChess and Bricabrac is while generating moves all captures are accumulated in one u64. After all moves have been generated it becomes as easy as if (captures & king[1 - stm]) return ILLEGAL;
Sorry for being so verbose. I'll try to keep my posts shorter in the future.

Do you use the 'captures' bitboard for anything else?

Mike Sherwin wrote: ↑Wed Feb 02, 2022 5:50 am A two piece sacrifice.
It's astonishing to me how well an engine plays already that has no features other than mvv-lva move sorting, alpha/beta (q)search and PSTs. After adding a TT it would probably see something like the forced mate sequence it was getting caught in, too.

On a brighter note, Leorik did win the match. Congrats!

Leorik.Engine - Bricabrac : 51.5/100 37-34-29 (==01001=10==1001===0==000=10110011100=111101010=11=000=00=11=0111=1111=01==100==1=1=0111=0==00=10100) 52% +14

mvanthoor · Post by **mvanthoor** » Wed Feb 02, 2022 4:49 pm

lithander wrote: ↑Wed Feb 02, 2022 10:57 am It's astonishing to me how well an engine plays already that has no features other than mvv-lva move sorting, alpha/beta (q)search and PSTs. After adding a TT it would probably see something like the forced mate sequence it was getting caught in, too.

Tapered evaluation was a game changer, IMHO. In Rustic (as in MMC, and probably any engine with a similar feature set) the addition of a tapered PST adds 250 Elo off the bat. In the 90's, with the slower computers of the time, it was really hard to create an engine that played over 2000 Elo; you really needed all the tricks and optimizations you could get. Nowadays computers are fast enough to go 8-10 moves deep even with a very basic setup such as the one Rustic Alpha 3 now uses, in combination with a tapered PST, and they already reach >=2100 Elo on the CCRL-list. These engines are already very hard to defeat tactically, for most amateur players. (Though they can quite easily be outplayed positionally.)

I really look forward to getting out of the XBoard quagmire (and even get through with writing the tuner) so I can add some strength to my engine again.

lithander · Post by **lithander** » Sat Feb 05, 2022 2:15 am

mvanthoor wrote: ↑Wed Feb 02, 2022 4:49 pm Tapered evaluation was a game changer, IMHO. In Rustic (as in MMC, and probably any engine with a similar feature set) the addition of a tapered PST adds 250 Elo off the bat.

I've now added a TT to Leorik. And I thought that would make my feature set comparable with Rustic Alpha 2 with only two differences:
- Leorik has the tapered eval that Minimal Chess used in version 0.4.
- Leorik does not yet do any move ordering based on the TT's best-move.

For minimal chess playing the best move from a previous iteration first was such a huge win that I thought this would maybe balance out but in fact Leorik enjoys a comfortable lead so the benefit of the tapered eval must indeed be huge!

Code: Select all

Score of Leorik 0.2 vs Rustic 2: 823 - 205 - 170  [0.758] 1198
...      Leorik 0.2 playing White: 412 - 109 - 78  [0.753] 599
...      Leorik 0.2 playing Black: 411 - 96 - 92  [0.763] 599
...      White vs Black: 508 - 520 - 170  [0.495] 1198
Elo difference: 198.3 +/- 20.6, LOS: 100.0 %, DrawRatio: 14.2 %

The next features I'm going to add are: TT move first, PVS search and killer moves.

Each of them should only make the engine faster but shouldn't cause it to miss anything going on within it's horizon. This opens up another way of testing the engine for correctness: It should be able to solve all mate puzzles in the minimal amount of iterative deepenings. So a mate in 5 should be found at depth 9. A mate in 6 at depth 11 and a mate in 7 exactly at depth 13 and so on...

mvanthoor · Post by **mvanthoor** » Sat Feb 05, 2022 6:27 pm

lithander wrote: ↑Sat Feb 05, 2022 2:15 am For minimal chess playing the best move from a previous iteration first was such a huge win that I thought this would maybe balance out but in fact Leorik enjoys a comfortable lead so the benefit of the tapered eval must indeed be huge!
Code: Select all
Score of Leorik 0.2 vs Rustic 2: 823 - 205 - 170  [0.758] 1198
...      Leorik 0.2 playing White: 412 - 109 - 78  [0.753] 599
...      Leorik 0.2 playing Black: 411 - 96 - 92  [0.763] 599
...      White vs Black: 508 - 520 - 170  [0.495] 1198
Elo difference: 198.3 +/- 20.6, LOS: 100.0 %, DrawRatio: 14.2 %
The next features I'm going to add are: TT move first, PVS search and killer moves.

Yes, current Rustic 4-dev (which implements only tapered evaluation on top of Alpha 3 with regard to new chess playing capability) is ~2160 CCLR blitz, which is ~290 points stronger than Rustic Alpha 3. I know that about 40-50 Elo comes from some tweaks in time management and TT-handling, especially on faster time controls. Thus ~240-250 Elo has to come from the tapered evaluation.

If you have MMV-LVA, TT, TT-move ordering, killer moves and a tapered evaluation, you should be exactly on par with Rustic 4-dev, feature-wise.

Yesterday I have finally finished the (massive...) refactoring process and started the implementation of the last two XBoard things: handling of official FIDE-rule draws by insufficient material (which are _different_ from draws with insufficient material that can't force mate... I never knew that, TBH), and GameTime time management.

After this is done I can _finally_ start on the tuner, and that would be the last part of Rustic 4. Then I can _finally_ drop Alpha from the name, and implement Null Move and Static Null move for version 5.

mvanthoor · Post by **mvanthoor** » Sat Feb 05, 2022 7:24 pm

I can't delete the last two sentences anymore. I've just written a more extensive progress post / rant.

Update on Rustic

lithander · Post by **lithander** » Sat Feb 05, 2022 11:22 pm

I thought I could update here whenever I add another feature and post how much ELO in self play that brought. That was information I was really missing when writing MinimalChess... I was never sure if a feature was working correctly because it wasn't clear how much benefit to expect.

So adding the TT to avoid researching know positions and avoid stupid draws was worth 220 ELO.

Code: Select all

Score of Leorik 0.2 vs Leorik 0.1: 1133 - 126 - 444  [0.796] 1703
Elo difference: 236.1 +/- 15.8, LOS: 100.0 %, DrawRatio: 26.1 %

(This test also included a minor change of the PSTs worth 16 ELO in isolation. 236 - 16 = 220)

Adding TT-move ordering, in so far, as I'm now storing the best (known) move for a position in the TT and play that first before generating any captures or quiets was worth another 150 ELO.

Code: Select all

Score of Leorik 0.2.3 vs Leorik 0.2: 712 - 220 - 269  [0.705] 1201
Elo difference: 151.2 +/- 18.4, LOS: 100.0 %, DrawRatio: 22.4 %

To roughly(!!) guess the CCRL Elo of the current version I played against MinimalChess 0.5 which is rated 2267. MinimalChess 0.5 is still 112 ELO (+-21) better and so a very dumb guess would put Leorik 0.2.3 at 2150 CCRL ELO.

Code: Select all

Score of Leorik 0.2.3 vs MinimalChess 0.5: 192 - 453 - 188  [0.343] 833
Elo difference: -112.6 +/- 21.5, LOS: 0.0 %, DrawRatio: 22.6 %

mvanthoor wrote: ↑Sat Feb 05, 2022 6:27 pm If you have MMV-LVA, TT, TT-move ordering, killer moves and a tapered evaluation, you should be exactly on par with Rustic 4-dev, feature-wise.

I don't have killer moves yet and I don't know if you use PVS in Rustic 4-dev? Because that's what I'm going to implement next, even before killer moves.

Of course with every feature the raw NPS suffers slightly. Writing and reading the TT isn't free! But to the best of my knowledge I'm counting nodes-per-second the same as in MinimalChess and the speed difference is very satisfying. MMC 0.5 reports around 1.3M nps while Leorik does around 9M nps in a midgame position.

mvanthoor · Post by **mvanthoor** » Sun Feb 06, 2022 2:25 am

lithander wrote: ↑Sat Feb 05, 2022 11:22 pm ...

The feature set of my current dev version is MVV-LVA, TT, TT-move ordering, killers, PVS, and tapered evaluation (using MinimalChess' tables; first version. I didn't update to any retuned tables).

I find your numbers hard to believe. Did you add the TT after the tapered evaluation? Maybe you are seeing strength gains that you would normally have seen in the tapered evaluation, if you had added the TT first and THEN the tapered eval.

I added the TT right after the basic version, and the TT-cuts + TT-move sorting gained ~150 Elo, which was in line with other engines at the same time control. I just started a match with my dev version against itself, with one of them having 0 MB for the TT, which will obviously disable it. It seems to hover roughly around -185 for the version without TT:

Code: Select all

Score of Rustic Alpha 3.43.100 vs Rustic Alpha NoTT 3.43.100: 581 - 135 - 204 [0.742]
...      Rustic Alpha 3.43.100 playing White: 319 - 60 - 81  [0.782] 460
...      Rustic Alpha 3.43.100 playing Black: 262 - 75 - 123  [0.703] 460
...      White vs Black: 394 - 322 - 204  [0.539] 920
Elo difference: 183.9 +/- 21.6, LOS: 100.0 %, DrawRatio: 22.2 %
920 of 1500 games finished.

There are about 40 points of refactoring and tweaking in this version, that weren't in Alpha 1 and 2, so these could account for the difference between ~145 and ~185 Elo.

Also, you're searching 9M moves/sec? I assume that is due to your faster CPU. I'll have to run the dev-version you sent me, because while I can believe that Leorik is _much_ faster than MinimalChess, I can't really believe that it is faster than Rustic, on the same hardware, especially because you seem to have chosen a bitboard implementation that is different from, and slower than Fancy Magic. (According to dangi's tests where he runs all the different bitboard algorithms in a speed test).

Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik