Breaking News, KiSS Faster Than PEXT In Leorik

Discussion of chess software programming and technical issues.

Moderator: Ras

Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Breaking News, KiSS Faster Than PEXT In Leorik

Post by Mike Sherwin »

lithander wrote: Sat Apr 01, 2023 3:28 pm But what's really nice is that it seems to perform on par with PEXT in a selfplay match. My theory is that in a real engine the cache is under higher pressure than when you just compute perft. And then the smaller lookup tables of KiSS give it an edge over PEXT and it catches up.

Here are 10K games at 5s + 500ms time control against Dangi's binary.

Code: Select all

Score of Leorik-2.4-KiSS-Net8 vs Leorik-2.4-Dangi: 2664 - 2417 - 4919  [0.512] 10000
...      Leorik-2.4-KiSS-Net8 playing White: 1300 - 995 - 2705  [0.530] 5000
...      Leorik-2.4-KiSS-Net8 playing Black: 1364 - 1422 - 2214  [0.494] 5000
...      White vs Black: 2722 - 2359 - 4919  [0.518] 10000
Elo difference: 8.6 +/- 4.8, LOS: 100.0 %, DrawRatio: 49.2 %
And my own PEXT implementation against Dangi's with the same time control but fewer games:

Code: Select all

Score of Leorik-2.4-Pext-Net8 vs Leorik-2.4-D: 478 - 441 - 1039  [0.509] 1958
...      Leorik-2.4-Pext-Net8 playing White: 246 - 211 - 524  [0.518] 981
...      Leorik-2.4-Pext-Net8 playing Black: 232 - 230 - 515  [0.501] 977
...      White vs Black: 476 - 443 - 1039  [0.508] 1958
Elo difference: 6.6 +/- 10.5, LOS: 88.9 %, DrawRatio: 53.1 %
Thanks to Thomas Jahn for this demonstration. :D
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Breaking News, KiSS Faster Than PEXT In Leorik

Post by dangi12012 »

This is awesome news! - lithander even proved that higher nps in perft does not garuantee more engine nps (the hidden cost of cache contention) and engines do indeed need some form of TT.
Why? - pext will lookup a value and evict TT lines from cache. Now TT look ups need to go all the way to memory again.

Your movegen and the ones with lower or even 0 table size do bypass this - making engine nps higher!
This makes me want to add a random dram access in concurrently in my movegen comparison repo highlighting this difference and correcting this 'misranking' of nps in isolation vs cache contention environment. (Making me want to create a radar plot for all algos)

It is great to see that researchers and developers in the chess engine community continue to push the boundaries of what is possible! Especially replacing algorithms that were standard for the past 20 years or so.

Your algorithm is encircling pext and fancy magic in the radar plot so to speak.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Breaking News, KiSS Faster Than PEXT In Leorik

Post by Mike Sherwin »

dangi12012 wrote: Sat Apr 01, 2023 7:41 pm This is awesome news! - lithander even proved that higher nps in perft does not garuantee more engine nps (the hidden cost of cache contention) and engines do indeed need some form of TT.
Why? - pext will lookup a value and evict TT lines from cache. Now TT look ups need to go all the way to memory again.

Your movegen and the ones with lower or even 0 table size do bypass this - making engine nps higher!
This makes me want to add a random dram access in concurrently in my movegen comparison repo highlighting this difference and correcting this 'misranking' of nps in isolation vs cache contention environment. (Making me want to create a radar plot for all algos)

It is great to see that researchers and developers in the chess engine community continue to push the boundaries of what is possible! Especially replacing algorithms that were standard for the past 20 years or so.

Your algorithm is encircling pext and fancy magic in the radar plot so to speak.
Hi Daniel,
I just noticed your post. Thanks for the kind words. And I agree strongly that some kind of simulated TT needs to be included to arrive at the truth. However, there are hobby and tutorial engines that do not have a TT. So I think it needs to be done both ways.