Discussion of computer chess matches and engine tournaments.
Vinvin
Posts: 5228 Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune
by Vinvin » Sun Jun 07, 2020 1:22 am
Two more engines added to the tournament: stockfish_2020_06_0117 and Eman 5.60.
 #  PLAYER                            :  RATING  POINTS  PLAYED  (%)
 1  Stockfish_2020_06_0117_x64_modern :  3371.4   534.5    1000   53
 2  Honey-XI                          :  3359.7  1398.0    2400   58
 3  Stockfish_2020_04_15_x64_modern   :  3357.5  1294.5    2200   59
 4  Eman 5.40 Default                 :  3356.9   610.0    1200   51
 5  Eman 5.60                         :  3356.7   509.5    1000   51
 6  corchess_V6_x64_modern            :  3356.4  1287.0    2200   59
 7  Eman 5.40 FOD=12                  :  3314.4   633.0    1200   53
 8  Bluefish-XI                       :  3303.8  1204.0    2400   50
 9  matefinder_2020_04_15             :  3303.0   789.0    1600   49
10  Black-Diamond-XI                  :  3286.6   656.0    1400   47
11  Crystal-2020-04-08-pop            :  3271.8   623.5    1400   45
12  Houdini-6-modern                  :  3212.2   496.0    1400   35
13  Bluefish-FD-XI (Tac=2,Def=N)      :  3049.7   165.0    1000   17

White advantage = 42.24
Draw rate (equal opponents) = 50.00 %
MMarco
Posts: 195 Joined: Sun Apr 12, 2020 1:09 am
Full name: Marc-O Moisan-Plante
by MMarco » Sun Jun 07, 2020 8:19 am
How would vanilla Stockfish 11 do? Something like 3350?
Xavier Bouchat
Posts: 1 Joined: Tue Aug 01, 2017 10:15 pm
Location: Belgium
by Xavier Bouchat » Sun Jun 07, 2020 10:28 am
I don't know how badly this influenced the final result, but it looks like there is a bug in Stockfish's time management when using MultiPV. Note that Bluefish-XI with Tactical = 2 is effectively a hidden MultiPV 4.
The variable bestValue contains the score of the last searched PV and shouldn't be used to decide whether or not to stop the search (code from Honey-XI / Bluefish-XI; the defined (Stockfish) branch is still present in current dev Stockfish):
#if defined (Stockfish) || (Weakfish)
    double fallingEval = (332 + 6 * (mainThread->previousScore - bestValue)
                              + 6 * (mainThread->iterValue[iterIdx] - bestValue)) / 704.0;
#else
    double fallingEval = (354 + 10 * (mainThread->previousScore - bestValue)) / 692.0;
#endif
    fallingEval = clamp(fallingEval, 0.5, 1.5);
rootMoves[0].score (the score of the best PV line) should be used in place of bestValue.
Only with MultiPV 1 are the two variables always equal. With MultiPV > 1, they are equal only when the last searched PV turns out to be the best root move or when several root moves share the same score. In all other cases the difference ranges from slight (for example in closed positions) to huge (for example when taking a piece is the only reasonable move).
by MMarco » Sun Jun 07, 2020 3:13 pm
I ran games with Bluefish tactical=2 at 40/2min repeating, and I can confirm it had very bad time management, often leaving itself a second or less for the last 10 moves before the time control, with an obvious negative impact on the result. I too found it almost 300 Elo weaker than regular Stockfish, which was surprisingly low.
MikeB
Posts: 4889 Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania
by MikeB » Sun Jun 07, 2020 3:48 pm
That's about right. Tactical 2 is meant for analysis only; it is very bad at time management and plays much weaker. Last night I also noticed that it breaks "go mate x".
by Vinvin » Tue Sep 22, 2020 1:30 pm
Added stockfish_12, which played against 4 opponents: stockfish_20060117, Eman 5.60, corchess_V6 and Bluefish-XI.
 #  PLAYER                          :  RATING  POINTS  PLAYED  (%)
 1  stockfish_12_x64_modern         :  3467.9   543.0     800   68
 2  stockfish_20060117_x64_modern   :  3356.9   601.5    1200   50
 3  Honey-XI                        :  3346.8  1398.0    2400   58
 4  Eman 5.60 64-bit POPCNT         :  3346.4   580.5    1200   48
 5  stockfish_2020_04_15_x64_modern :  3344.7  1294.5    2200   59
 6  Eman 5.40 Default 64-bit POPCNT :  3344.1   610.0    1200   51
 7  corchess_V6_x64_modern          :  3342.2  1349.5    2400   56
 8  Eman 5.40 FOD=12 64-bit POPCNT  :  3301.4   633.0    1200   53
 9  Bluefish-XI                     :  3291.6  1260.5    2600   48
10  matefinder_2020_04_15           :  3290.0   789.0    1600   49
11  Black-Diamond-XI                :  3273.6   656.0    1400   47
12  Crystal-2020-04-08-pop          :  3258.8   623.5    1400   45
13  Houdini-6-modern                :  3199.1   496.0    1400   35
14  Bluefish-FD-XI                  :  3036.6   165.0    1000   17

White advantage = 42.61
Draw rate (equal opponents) = 50.00 %
by Vinvin » Sun Jan 10, 2021 8:32 am
Following up on the test on the hard test suite:
http://talkchess.com/forum3/viewtopic.p ... 01#p877501
New tournament with: CF EXT 281220 x64 (C-Fish from 28/12/2020), Bluefish v12-R1, ShashChess 15.1, Stockfish 12, Crystal 291220 and SugaR AI ICCF 1.00.
NN file used for all the engines: nn-62ef826d1a6d.nnue
Conclusions: Crystal and SugaR AI ICCF are clearly weaker than the others. Bluefish, which solved a lot of the tests, is also quite strong in game play. Note that Bluefish v12-R1 is quite old (beginning of October?); it could be improved by around 10 rating points simply by importing the latest SF improvements.
Result:
---------------------------------------------------------------------------------------
#   name                        games  wins%  draws%  losses%  score   los%  elo+/-
1.  CF EXT 281220 x64 SSE41 N    1000   12.5    85.2      2.3  551.0  100.0    35.6
2.  Bluefish v12-R1              1000    8.3    87.7      4.0  521.5  100.0    14.9
3.  ShashChess 15.1              1000    8.2    87.3      4.5  518.5   99.9    12.9
4.  Stockfish 12                 1000    7.6    86.1      6.3  506.5   86.5     4.5
5.  Crystal 291220               1000    3.7    83.2     13.1  453.0    0.0   -32.8
6.  SugaR AI ICCF 1.00           1000    3.4    83.1     13.5  449.5    0.0   -35.2

Cross table:
---------------------------------------------------------------------------------------
#   name                        score  games      1      2      3      4      5      6
1.  CF EXT 281220 x64 SSE41 N   551.0   1000      x  101.5  103.5  107.5  118.5  120.0
2.  Bluefish v12-R1             521.5   1000   98.5      x  102.0  101.0  112.0  108.0
3.  ShashChess 15.1             518.5   1000   96.5   98.0      x  101.0  109.5  113.5
4.  Stockfish 12                506.5   1000   92.5   99.0   99.0      x  107.5  108.5
5.  Crystal 291220              453.0   1000   81.5   88.0   90.5   92.5      x  100.5
6.  SugaR AI ICCF 1.00          449.5   1000   80.0   92.0   86.5   91.5   99.5      x

Tech:
---------------------------------------------------------------------------------------
Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
#   name                       nodes/m      NPS  depth/m  time/m  moves   time  #fails
1.  CF EXT 281220 x64 SSE41 N    6629K  1700367     40.2     3.9   79.5  309.9
2.  Bluefish v12-R1              5482K  1356142     30.0     4.0   75.0  303.2       1
3.  ShashChess 15.1              4310K  1102948     35.1     3.9   78.7  307.4
4.  Stockfish 12                 4938K  1269195     36.7     3.9   79.7  310.0
5.  Crystal 291220               4574K  1087830     33.1     4.2   75.0  315.6
6.  SugaR AI ICCF 1.00           4395K  1020266     32.2     4.3   71.9  309.9
    all ---                      4948K  1255466     34.6     4.0   76.6  309.3       1
by MikeB » Sun Jan 10, 2021 2:10 pm
latest Bluefish
https://github.com/MichaelB7/Stockfish/ ... 9-Eval.exe
by Vinvin » Sun Jan 10, 2021 7:41 pm
Thanks, but it doesn't work on my computer.
My computer can run SSE41+POPCNT builds. Intel SSE4.2 and Intel AVX are OK, but not AVX2.
by Vinvin » Thu Jan 14, 2021 7:42 pm
Vinvin wrote: ↑ Sun Jan 10, 2021 8:32 am
Result:
---------------------------------------------------------------------------------------
#   name                        games  wins%  draws%  losses%  score   los%  elo+/-
1.  CF EXT 281220 x64 SSE41 N    1000   12.5    85.2      2.3  551.0  100.0    35.6
2.  Bluefish v12-R1              1000    8.3    87.7      4.0  521.5  100.0    14.9
3.  ShashChess 15.1              1000    8.2    87.3      4.5  518.5   99.9    12.9
4.  Stockfish 12                 1000    7.6    86.1      6.3  506.5   86.5     4.5
5.  Crystal 291220               1000    3.7    83.2     13.1  453.0    0.0   -32.8
6.  SugaR AI ICCF 1.00           1000    3.4    83.1     13.5  449.5    0.0   -35.2
...
I added Eman 6.80 and SugaR AI 1.30 to the tournament and removed the 2 weakest engines.
Time control is 3m+2s.
Rating list with Ordo:
 #  PLAYER                        :  RATING  POINTS  PLAYED    (%)
 1  CF EXT 281220 x64 SSE41 N     :  3617.7   513.0    1000  51.3%
 2  Eman 6.80 64-bit SSE41 POPCNT :  3617.4   512.5    1000  51.3%
 3  SugaR AI 1.30                 :  3611.5   502.5    1000  50.3%
 4  Bluefish v12-R1               :  3609.7   499.5    1000  50.0%
 5  ShashChess 15.1               :  3603.8   489.5    1000  49.0%
 6  Stockfish 12                  :  3600.0   483.0    1000  48.3%

White advantage = 32.13
Draw rate (equal opponents) = 50.00 %