I think this shows that the ability to solve these trick positions has little bearing on an engine's overall playing strength. Xiphos and Ethereal are much stronger than Houdini 1.5 and Critter 1.6, yet they solve fewer of these positions.
At the same time, these positions show us how complex chess is, and that an immense number of winning sequences are completely missed during search. The potential for improvement is enormous.
Hard-Talkchess-2020 set, final release
-
- Posts: 550
- Joined: Tue Nov 19, 2019 8:48 pm
- Full name: Alayan Feh
-
- Posts: 3291
- Joined: Wed Mar 08, 2006 8:15 pm
Re: Hard-Talkchess-2020 set, final release
Two things: 1. Clearly Houdini 1.5 and Critter 1.6 are the same engine. 2. And Crystal is best in the test, as it uses the least time. Why? Imagine a situation where engine A solves all 213 positions in 15 minutes each, but engine B solves "only" 212 positions in 15 seconds each. Which is better by far?
Jouni
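To put rough numbers on the A-versus-B example above, here is a minimal sketch that reports both the solve count and the total time used. The helper name `summarize` and the rule of charging an unsolved position the full 30-minute limit are assumptions made for illustration, not something the test itself defines.

```python
from typing import Optional, Sequence, Tuple


def summarize(solve_times: Sequence[Optional[float]], limit_s: float) -> Tuple[int, float]:
    """Return (positions solved, total seconds used).

    A time of None means "not solved"; such a position is charged the
    full time limit (an assumption made for this sketch).
    """
    solved = sum(1 for t in solve_times if t is not None)
    total = sum(limit_s if t is None else t for t in solve_times)
    return solved, total


LIMIT_S = 30 * 60  # 30 minutes per position, as in the test settings

# Hypothetical engines from the example: A solves all 213 in 15 minutes each,
# B solves 212 in 15 seconds each and misses one position.
engine_a = [15 * 60.0] * 213
engine_b = [15.0] * 212 + [None]

for name, times in (("A", engine_a), ("B", engine_b)):
    solved, total = summarize(times, LIMIT_S)
    print(f"Engine {name}: {solved}/213 solved, {total / 3600:.1f} h total")
```

Looking at both figures side by side makes the trade-off visible: one extra solution for A costs it far more total time than B spends on the whole set.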
-
- Posts: 4556
- Joined: Tue Jul 03, 2007 4:30 am
Re: Hard-Talkchess-2020 set, final release
Who knows? Let them play a match and you'll figure it out under some error bars. If solving a test set could tell us what engine was best, TCEC and the CCRL would fire all their volunteers and dedicate their time to solve test sets instead of playing games.
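For what "under some error bars" could look like in practice, here is a rough sketch that turns a match result into an Elo difference with an approximate 95% interval. It uses the usual logistic Elo formula and a normal approximation on the per-game score; the function names and the match numbers in the usage line are made up for illustration.

```python
import math
from typing import Tuple


def elo_from_score(score: float) -> float:
    """Convert an expected score in (0, 1) to an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_error_bars(wins: int, draws: int, losses: int,
                        z: float = 1.96) -> Tuple[float, float, float]:
    """Return (elo, lower bound, upper bound) for a match result.

    Uses a normal approximation on the mean score per game;
    z = 1.96 corresponds to a ~95% interval.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    # Variance of a single game result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    # Clamp the interval so the Elo conversion stays defined.
    lo = max(score - z * se, 1e-9)
    hi = min(score + z * se, 1.0 - 1e-9)
    return elo_from_score(score), elo_from_score(lo), elo_from_score(hi)


# Made-up match result, only to show the call.
elo, low, high = elo_with_error_bars(wins=120, draws=300, losses=80)
print(f"Elo difference: {elo:+.1f} (95% interval {low:+.1f} to {high:+.1f})")
```

The interval narrows only with the square root of the number of games, which is why rating lists need so many games to separate engines of similar strength.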
-
- Posts: 5228
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: Hard-Talkchess-2020 set, final release
Mainlines added for #1 to #60 : https://lichess.org/study/y6sM5eAf
Vinvin wrote: ↑Thu Jan 30, 2020 10:34 pm
I imported all the positions in Lichess Studies (but not all the mainlines yet) :
#1 to #60 : https://lichess.org/study/y6sM5eAf
#61 to #120 : https://lichess.org/study/Vfe59GOH
#121 to #180 : https://lichess.org/study/0FQa6riN
#181 to #213 : https://lichess.org/study/s2mTErEZ
-
- Posts: 5228
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: Hard-Talkchess-2020 set, final release
Restarted the live test at
with Black-Diamond-XI, McCain-v2 and Sting-sf-16
Vinvin wrote: ↑Mon Feb 03, 2020 1:45 am Current summary of results (30 minutes, 1 core per engine) :
Stockfish 11 : 141/213
Stockfish 10 : 137/213
Black-Diamond-XR7 : 136/213
Crystal 1.1 : 135/213
Houdini 1.5 x64 : 97/213
Critter 1.6a 64bit : 96/213
Xiphos 0.6 : 81/213
Ethereal 11.75 : 77/213
Replays of solving are here :
-
- Posts: 385
- Joined: Sat Feb 04, 2017 11:57 pm
- Location: USA
Re: Hard-Talkchess-2020 set, final release
What are your criteria for entry into the list?
i7-6700K @ 4.00Ghz 32Gb, Win 10 Home, EGTBs on PCI SSD
Benchmark: Stockfish15.1 NNUE x64 bmi2 (nps): 1277K
-
- Posts: 5228
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: Hard-Talkchess-2020 set, final release
New results :
Sting-SF-16 : 122/213
McCain-v2 : 130/213
Black-Diamond-XI : 136/213
Honey-XI : 139/213
Now with SF 7, SF 8 and Bluefish-XI :
-
- Posts: 5228
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: Hard-Talkchess-2020 set, final release
2 more results :
Bluefish-XI : 159/213 NEW LEADER !
Stockfish 7 : 76/213
Current standing :
Bluefish-XI : 159/213
Stockfish 11 : 141/213
Honey-XI : 139/213
Stockfish 10 : 137/213
Black-Diamond-XI : 136/213
Black-Diamond-XR7 : 136/213
Crystal 1.1 : 135/213
McCain-v2 : 130/213
Sting-SF-16 : 122/213
Houdini 1.5 x64 : 97/213
Critter 1.6a 64bit : 96/213
Xiphos 0.6 : 81/213
Ethereal 11.75 : 77/213
Stockfish 7 : 76/213
Still running : Stockfish 8 ... 93/210
-
- Posts: 1871
- Joined: Sat Nov 25, 2017 2:28 pm
- Location: France
Re: Hard-Talkchess-2020 set, final release
Could you remind us how much time was given for each position? How many threads? What TT size?
-
- Posts: 5228
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: Hard-Talkchess-2020 set, final release
30 minutes with 1 core for each position. 2 GB of hash table (I set it to 2200 MB in Arena).
I'll publish a sheet with all the timings in the next few days.
Vincent
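For anyone who wants to approximate these settings outside Arena, here is a rough sketch of the UCI commands a driver script might send for one position. Arena handles this internally, so this is only an illustration: the function name `uci_commands` is hypothetical, the starting-position FEN is a stand-in for a test position, and it assumes the engine exposes the standard `Threads` and `Hash` options.

```python
from typing import List

# Stand-in position (the standard starting position), not one of the test positions.
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"


def uci_commands(fen: str, minutes: int = 30, threads: int = 1,
                 hash_mb: int = 2200) -> List[str]:
    """Build the UCI command sequence for analysing one position.

    Defaults mirror the settings described in the thread:
    30 minutes, 1 thread, 2200 MB of hash.
    """
    movetime_ms = minutes * 60 * 1000  # "go movetime" expects milliseconds
    return [
        "uci",
        f"setoption name Threads value {threads}",
        f"setoption name Hash value {hash_mb}",
        "isready",
        "ucinewgame",
        f"position fen {fen}",
        f"go movetime {movetime_ms}",
    ]


for command in uci_commands(START_FEN):
    print(command)
```

A real driver would also wait for `uciok` after `uci` and `readyok` after `isready` before sending the next command, and would read the `bestmove` line to check it against the expected solution.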