Hard-Talkchess-2020 set, final release

AlexChess
Posts: 1524
Joined: Sat Feb 06, 2021 8:06 am
Full name: Alex Morales

Re: Hard-Talkchess-2020 set, final release

Post by AlexChess »

Vinvin wrote: Wed Dec 15, 2021 1:49 am Added Crystal 3.1
+ one run with Crystal 3.2
+ some corrections in the sheet. Some formulas at the far right were wrong.

Results with the new computer.

Conditions :
- CPU : 5950X running around 3.9 GHz
- 16 threads used (to avoid hyper-threading)
- 5 minutes per position
- 6 man Syzygy + KRPPKRP
- 64 GB HashTable (still more than 50 GB for Syzygy cache)
- Several runs are needed because multithreading is used (results vary from run to run). I fill in the sheet and use formulas to get average numbers.
- AVX2 version used everywhere when possible

List sorted from best to worst :
Name / #found-average / (Time with penalty in seconds) / number of runs
Blue Marlin 14.4a : 101,2 (84) on 11 runs
SugaR AI ICCF 1.90 : 101,0 (82) on 6 runs
ShashChess18.2 : 100,5 (92) on 11 runs
ShashChess20.1 : 98,7 (99) on 10 runs
ShashChess20 : 98,4 (101) on 10 runs
Crystal 3.2 : 98,0 (99) on 6 runs
Blue Marlin 14.5 : 98,0 (102) on 13 runs
ShashChess17.1 : 98,0 (111) on 6 runs
SugaR AI ICCF 2.40 : 96,8 (103) on 14 runs
Crystal 3.1 : 96,4 (108) on 8 runs
Honey-v14 : 96,3 (116) on 8 runs
SugaR AI ICCF 2.50 : 93,9 (117) on 10 runs
Stockfish_21.11.23.21 : 92,4 (141) on 5 runs
Stockfish_21.08.05.16 : 91,2 (149) on 5 runs
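
For readers who want to reproduce the two averaged columns above, here is a minimal Python sketch. It is not the sheet's actual formula, which is not given in the post. It assumes (hypothetically) that an unsolved position is charged the full 5-minute limit as its penalty; `summarize_runs` and the sample data are made up for illustration.

```python
from statistics import mean

TIME_LIMIT_S = 300  # 5 minutes per position, as stated in the conditions


def summarize_runs(runs, penalty_s=TIME_LIMIT_S):
    """Average several runs of one engine over the same position set.

    runs: list of runs; each run is a list of solve times in seconds,
          with None for a position that was not solved.
    penalty_s: time charged for an unsolved position (ASSUMPTION: the
               full time limit; the real sheet may use another rule).
    Returns (average number of positions found, average time per position).
    """
    found_counts = []
    avg_times = []
    for run in runs:
        found_counts.append(sum(t is not None for t in run))
        avg_times.append(mean(penalty_s if t is None else t for t in run))
    return mean(found_counts), mean(avg_times)


# Made-up data: 2 runs over 4 positions.
runs = [
    [12.0, None, 250.0, 3.5],
    [10.0, 280.0, None, 4.5],
]
found_avg, time_avg = summarize_runs(runs)
print(f"{found_avg:.1f} ({time_avg:.0f})")
```

Averaging per run first and then across runs keeps each run equally weighted, which matches the "on N runs" presentation of the list.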

Sheet with all results here (you can download the file if it doesn't display very well in the browser) : https://www.dropbox.com/s/ckc1fb1gfjm1h ... 5.ods?dl=0

Still to do : remove some doubtful positions posted in this thread.
After so many games in tournaments, I think that test solving is the most accurate and impartial method to know the real strength of an engine :D
Chess engines and dedicated chess computers fan since 1981 :D Mac mini M1 8GB-256GB, Windows 11 & Ubuntu ARM64.
ProteusSF Dev Forum TROLLS KINDERGARTEN
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: Hard-Talkchess-2020 set, final release

Post by Vinvin »

As requested in this thread, here is the sheet with the positions (FEN) in the last column : https://www.dropbox.com/s/ee8snilqtuu2b ... 9.ods?dl=0
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: Hard-Talkchess-2020 set, final release

Post by Vinvin »

I removed 6 positions considered as incorrect or not good enough.
So 108 positions remain.

Position #038 not good : several moves are winning
Position #069 not good : several moves are winning
Position #144 is not good enough because 3 moves are clearly winning.
Position #169 is not good enough because 2 moves are clearly winning : Ng5 and Bxh7+
Position #177 is not good enough because 2 moves are clearly winning : N3h4 and Qxe1
Position #187 is not good enough because 2 moves are clearly winning : Ng4+ and Kg7
Only 6 positions were removed, but that's enough to change the ranking a bit :

List sorted from best to worst :

Name / #found-average / (Avg Time with penalty in seconds) / number of runs
ShashChess18.2 :         98,1  (109) on 11 runs
SugaR AI ICCF 1.90 :     98,0  (103) on 6 runs
Blue Marlin 14.4a :      97,5  (109) on 11 runs
ShashChess20.1 :         96,4  (116) on 10 runs
ShashChess20 :           95,9  (119) on 10 runs
ShashChess17.1 :         95,2  (131) on 6 runs
Blue Marlin 14.5 :       94,9  (125) on 13 runs
Crystal 3.2 :            94,0  (127) on 6 runs
SugaR AI ICCF 2.40 :     93,6  (126) on 14 runs
Honey-v14 :              93,4  (137) on 8 runs
Crystal 3.1 :            93,3  (131) on 8 runs
Stockfish_21.11.23.21 :  91,0  (156) on 5 runs
SugaR AI ICCF 2.50 :     90,3  (142) on 10 runs
Stockfish_21.08.05.16 :  88,8  (169) on 5 runs 
Sheet with the 108 positions and timings : https://www.dropbox.com/s/8dqjc2t5pvfrw ... 9.ods?dl=0
Paloma
Posts: 1167
Joined: Thu Dec 25, 2008 9:07 pm
Full name: Herbert L

Re: Hard-Talkchess-2020 set, final release

Post by Paloma »

Vinvin wrote: Sun Dec 19, 2021 10:50 pm

Position #038 not good : several moves are winning
Position #069 not good : several moves are winning
Position #144 is not good enough because 3 moves are clearly winning.
Position #169 is not good enough because 2 moves are clearly winning : Ng5 and Bxh7+
Position #177 is not good enough because 2 moves are clearly winning : N3h4 and Qxe1
Position #187 is not good enough because 2 moves are clearly winning : Ng4+ and Kg7
Thank you Vincent, excellent find with your new PC.

Small mistake: it is not Pos 187 but Pos 185 (1...Kg7 and 1...Ng4+)
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: Hard-Talkchess-2020 set, final release

Post by Vinvin »

Yes ! Thanks for the correction.
It was a typo, files are OK.
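
With that correction, the six removed positions are #038, #069, #144, #169, #177 and #185. A minimal Python sketch of dropping them from an EPD file follows. It assumes (hypothetically) that each EPD line carries an `id "..."` opcode ending in the position number; the real file may label positions differently, and the sample lines are placeholders rather than positions from the set.

```python
import re

# Position numbers removed from the 114-position set (with #185, not #187).
REMOVED_IDS = {38, 69, 144, 169, 177, 185}

# ASSUMPTION: the id opcode ends with the position number, e.g. id "pos.038".
ID_PATTERN = re.compile(r'id\s+"[^"]*?(\d+)"')


def filter_epd(lines, removed=REMOVED_IDS):
    """Return the EPD lines whose id number is not in `removed`.

    Lines without a recognizable id opcode are kept unchanged.
    """
    kept = []
    for line in lines:
        match = ID_PATTERN.search(line)
        if match and int(match.group(1)) in removed:
            continue
        kept.append(line)
    return kept


# Placeholder lines (start position, dummy best move), for illustration only.
sample = [
    'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - bm e4; id "pos.037";',
    'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - bm e4; id "pos.038";',
    'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - bm e4; id "pos.185";',
]
kept = filter_epd(sample)
print(len(kept), "of", len(sample), "positions kept")
print(114 - len(REMOVED_IDS), "positions expected in the full reduced set")
```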
Jouni
Posts: 3293
Joined: Wed Mar 08, 2006 8:15 pm

Re: Hard-Talkchess-2020 set, final release

Post by Jouni »

An Lc0 result with a fast GPU would be interesting. Even with a GTX card it's better than SF14 on my PC. In the CSS forum there is one result: Lc0 0.28.2 with Netz 610889 got 80/114 with an RTX3060 and a 30s limit!
Jouni
AlexChess
Posts: 1524
Joined: Sat Feb 06, 2021 8:06 am
Full name: Alex Morales

Re: Hard-Talkchess-2020 set, final release

Post by AlexChess »

Hi to all!

I have downloaded the EPD 2021 test with 65 positions. Is it the right one? If not, do you have an updated download link?
How many seconds are allowed for each position?
How do you balance hardware speed differences? (On my hardware, using 4 CPUs, I get ~2 Mn/s with Stockfish NNUE on the starting position.)
What is the meaning of "runs"?

Thanks in advance,

Alex
peter
Posts: 3186
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: Hard-Talkchess-2020 set, final release

Post by peter »

AlexChess wrote: Fri Mar 04, 2022 2:21 pm I have downloaded EPD 2021 test with 65 positions. Is it the right one? If not, have you an updated download link?
How many seconds are allowed for each position?
How do you balance hardware speed differences? (with my hardware using 4 CPUs I get ~2 Mn / s with Stockfish NNUE on starting position)
What is the meaning of "runs"?
Here

forum3/viewtopic.php?p=884039&sid=631ec ... bf#p884039

Vincent posted the 114-position subset of the earlier, bigger collections.
Then he removed 6 more of those:

forum3/viewtopic.php?p=915515#p915515

so 108 are left.
His regular TC is 30 minutes/position, single-threaded to avoid SMP spreading.
To get even more statistical reliability you can have more than one run of each position and engine and average the results.
Regards,
Peter.
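
To illustrate why repeated runs help, here is a minimal Python sketch that reports the mean solved count together with its standard error, which shrinks as runs are added. The counts are made up for illustration and `run_statistics` is a hypothetical helper, not something used for the sheet.

```python
from math import sqrt
from statistics import mean, stdev


def run_statistics(solved_counts):
    """Return (mean, sample standard deviation, standard error of the mean)
    for the number of positions solved in each run. Needs at least 2 runs."""
    m = mean(solved_counts)
    s = stdev(solved_counts)
    sem = s / sqrt(len(solved_counts))
    return m, s, sem


# Made-up solved counts from 4 runs of one engine on the 108-position set.
counts = [98, 96, 99, 97]
m, s, sem = run_statistics(counts)
print(f"mean {m:.1f}, stdev {s:.2f}, standard error {sem:.2f}")
```

Two engines whose means differ by less than a couple of standard errors cannot really be ranked from these runs alone.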
AlexChess
Posts: 1524
Joined: Sat Feb 06, 2021 8:06 am
Full name: Alex Morales

Re: Hard-Talkchess-2020 set, final release

Post by AlexChess »

Thank you Peter for your answer :-) Luckily, I used the right TC on the ERET 111 positions, considering that my hardware is very slow compared to top Ryzen computers. I'll also try this subset :-)

forum3/viewtopic.php?f=6&t=79238&sid=22 ... 80#p922202

Kind regards, Alex
AlexChess
Posts: 1524
Joined: Sat Feb 06, 2021 8:06 am
Full name: Alex Morales

Re: Hard-Talkchess-2020 set, final release

Post by AlexChess »

Hi!

I would like to know the results for my ProteusSF RBE 008a, a Stockfish 15-dev derivative, but my hardware is too weak. Could somebody with a powerful PC kindly test it for me? https://banksiagui.com/forums/viewtopic.php?p=132#p132

On my Windows 11 ARM64 running on a Mac M1 using 1 CPU I can reach only 500 kN/s (with all 4 CPUs it becomes too hot).
How many seconds are allowed for solving each position in this case? A good Ryzen 9 PC calculates 60 Mn/s, so I think 1 hour per position would be fair in my case. (For now I am limiting the time to 6 minutes: Analyzing... 26 of 44 matching moves, Rated time: 2:14:29.)

Thanks in advance!
Alex
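
As a rough check on the time scaling, here is a minimal Python sketch. It assumes that solving time scales linearly with nodes per second, which is only a crude approximation (thread count, hash size and tablebases also matter). The 5-minute reference limit and the 60 Mn/s and 500 kN/s figures are the ones quoted in this thread; `scaled_time_s` is a hypothetical helper.

```python
def scaled_time_s(ref_time_s, ref_nps, my_nps):
    """Time limit on a slower machine that gives the same node budget
    as ref_time_s on the reference machine.

    ASSUMPTION: strength on test positions depends only on nodes searched.
    """
    if my_nps <= 0:
        raise ValueError("my_nps must be positive")
    return ref_time_s * ref_nps / my_nps


REF_TIME_S = 5 * 60   # 5 minutes per position on the reference PC
REF_NPS = 60e6        # 60 Mn/s quoted for a good Ryzen 9
MY_NPS = 500e3        # 500 kN/s on one CPU

equivalent = scaled_time_s(REF_TIME_S, REF_NPS, MY_NPS)
print(f"speed ratio: {REF_NPS / MY_NPS:.0f}x")
print(f"equivalent limit: {equivalent / 3600:.1f} hours per position")

# Conversely, what 1 hour on the slow machine corresponds to on the fast one:
print(f"1 hour here ~ {scaled_time_s(3600, MY_NPS, REF_NPS):.0f} s there")
```

Under this assumption the speed ratio is 120, so 1 hour per position on the slow machine gives far fewer nodes than 5 minutes on the fast one.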