BoCC -- Beauty of Computer Chess

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

Rebel
Posts: 7339
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: BoCC -- Beauty of Computer Chess

Post by Rebel »

chrisw wrote: Thu Aug 07, 2025 5:34 pm
Rebel wrote: Thu Aug 07, 2025 5:25 pm
pohl4711 wrote: Thu Aug 07, 2025 8:13 am By the way: The stats of your new Rebel Extreme beta are looking very promising!
A clear improvement in all single-stats and besides Patricia 5 the only engine with more than 50% sacs (of all won games) in your gamebase. Really impressive. Cant wait to test this one!
I will mail you the version, please keep it private. Meanwhile I will try to lower the similarity, I am not so sure if that is possible.
presumably the similarity is because of unchanged search?
funny how these things are sticky in the sim testing
Yes.

I am trying v20 with king-space-32, maybe it will help, although I doubt it.
90% of coding is debugging, the other 10% is writing bugs.
pohl4711
Posts: 2759
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: BoCC -- Beauty of Computer Chess

Post by pohl4711 »

Rebel wrote: Thu Aug 07, 2025 5:22 pm
pohl4711 wrote: Thu Aug 07, 2025 7:03 am I think, I understand whats happend. Its not a bug in the tools, you mismatched results, using Gauntlet-EAS tool and/or EAS-tool with the hardcoded movelimit for short-wins bonus and the normal EAS-tool (here this movelimit is calculated out of the average length of all won games of the source.pgn). This leads (of course) to completely different values for the shorts-stat. But not only Patricia, this affects of course, all engines in a source.pgn.
LOL, now it's my fault :D

I am glad you tagged the problem, just tell me step by step which version to download and how to run it.
This depends on what you want to do with the EAS-Tool. The hardcoded-movelimit versions in the for_engine_developers folder should only be used if you really need this hardcoded number.
This is from the ReadMe:
Normally the EAS-Tool calculates the shortwin movelimit (the limit below which an engine gets bonus points for a won game, because it is a short win) based on the average length of all won games in the input.pgn. But this can lead to unstable results when engine developers play gauntlets with their new engine version vs. a bunch of opponents (compared to the EAS results when a RoundRobin is evaluated).

Example: Rebel Extreme dev is tested by you in a gauntlet vs. some opponents (the normal case). Because Rebel has many more short wins than a "normal" engine, the average length of won games (normally calculated by the EAS-Tool) in such a gauntlet gamebase would be a very low number, lower than the average length of won games in a full RoundRobin gamebase. So Rebel Extreme gauntlets would lead to higher EAS scores than expected, because the average length of won games is too small in the gauntlet calculation.
The hardcoded EAS-Tools offer the opportunity to fix this by hardcoding the movelimit for short wins. You can open the .bat file with an editor and set the hardcoded movelimit to any number.
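
The gauntlet effect described in the ReadMe can be illustrated with a small Python sketch. All game lengths below are invented for illustration only and do not come from any real gamebase; the variable names are hypothetical as well. The point is simply that a gauntlet PGN lacks the opponent-vs-opponent games, so the wins of an aggressive engine weigh more heavily in the average.

```python
from statistics import mean

# Hypothetical won-game lengths in moves, invented purely for illustration.
aggressive_engine_wins = [35, 40, 45, 50]             # wins by the tested engine
opponent_wins_vs_engine = [70, 80]                    # opponents' wins against it
opponent_vs_opponent_wins = [70, 75, 80, 85, 90, 80]  # present only in a round robin

# A gauntlet PGN contains only games in which the tested engine took part.
gauntlet_pgn = aggressive_engine_wins + opponent_wins_vs_engine
# A round-robin PGN also contains the games the opponents played among themselves.
round_robin_pgn = gauntlet_pgn + opponent_vs_opponent_wins

gauntlet_avg = mean(gauntlet_pgn)
round_robin_avg = mean(round_robin_pgn)

print(f"gauntlet average:    {gauntlet_avg:.1f} moves")
print(f"round-robin average: {round_robin_avg:.1f} moves")
```

With a movelimit derived from these averages, the same engine would be measured against different limits in the two gamebases, which is the instability the hardcoded limit is meant to avoid.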

These are the first lines of the tools:
REM **************************************************************************************
REM *** special hardcoded shortwin_movelimit here, change to other values, if you want ***
REM *** set it to 0, to deactivate the hardcode-override
set /A hardlimit=60
REM **************************************************************************************


Example 1: Average won game length in the source.pgn is 78 moves: rounded down to 75, and 75 - 15 = 60 is the upper limit, followed by 55, 50, 45, 40.
Example 2: Average won game length is 58 moves: rounded down to 55, and 55 - 15 = 40 is the upper limit, followed by 35, 30, 25, 20.
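
The arithmetic in these two examples can be sketched in a few lines of Python. This is only an illustration and not the EAS-Tool's batch code: the function name `shortwin_limits` and its parameters are made up here, and treating a non-zero hardlimit as a direct replacement for the computed upper limit is an assumption based on the ReadMe text above.

```python
def shortwin_limits(avg_won_length, hardlimit=0, step=5, offset=15, tiers=5):
    """Return the short-win move limits, highest first.

    Assumption: a hardlimit above 0 replaces the computed upper limit,
    and 0 deactivates the override (as the REM comment suggests).
    """
    if hardlimit > 0:
        upper = hardlimit
    else:
        # Round the average down to a multiple of 5, then subtract 15.
        rounded = (avg_won_length // step) * step
        upper = rounded - offset
    # The upper limit is followed by four further limits in steps of 5.
    return [upper - i * step for i in range(tiers)]


print(shortwin_limits(78))                # [60, 55, 50, 45, 40]
print(shortwin_limits(58))                # [40, 35, 30, 25, 20]
print(shortwin_limits(52, hardlimit=60))  # [60, 55, 50, 45, 40]
```

The last call shows why a fixed limit gives comparable numbers: the result no longer depends on the average won-game length of the particular PGN.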

In all other cases, I would always recommend using only the "normal" EAS-Tools in the main folder.
pohl4711
Posts: 2759
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: BoCC -- Beauty of Computer Chess

Post by pohl4711 »

Rebel wrote: Thu Aug 07, 2025 5:25 pm
pohl4711 wrote: Thu Aug 07, 2025 8:13 am By the way: The stats of your new Rebel Extreme beta are looking very promising!
A clear improvement in all single-stats and besides Patricia 5 the only engine with more than 50% sacs (of all won games) in your gamebase. Really impressive. Cant wait to test this one!
I will mail you the version, please keep it private. Meanwhile I will try to lower the similarity, I am not so sure if that is possible.
That would be great! But please note: a test of a private version of an engine by me stays private. I do not include private engines in my rating lists. So I would love to do a test run, but the results are only for me and you. Of course, you can publish the results on your website; that is fine with me.
Rebel
Posts: 7339
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: BoCC -- Beauty of Computer Chess

Post by Rebel »

pohl4711 wrote: Fri Aug 08, 2025 6:14 am
Rebel wrote: Thu Aug 07, 2025 5:22 pm
pohl4711 wrote: Thu Aug 07, 2025 7:03 am I think, I understand whats happend. Its not a bug in the tools, you mismatched results, using Gauntlet-EAS tool and/or EAS-tool with the hardcoded movelimit for short-wins bonus and the normal EAS-tool (here this movelimit is calculated out of the average length of all won games of the source.pgn). This leads (of course) to completely different values for the shorts-stat. But not only Patricia, this affects of course, all engines in a source.pgn.
LOL, now it's my fault :D

I am glad you tagged the problem, just tell me step by step which version to download and how to run it.
This depends on what you want to do with the EAS-Tool. The hardcoded movelimit versions in the for_engine_developers-folder should be only used, if you really need this hardcoded number.
I get the same numbers you posted with the two special versions, so that's okay.

But IMO you should make these 2 versions the standard, out of the box.

Your tool should always produce reliable numbers on any PGN.
Rebel
Posts: 7339
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: BoCC -- Beauty of Computer Chess

Post by Rebel »

pohl4711 wrote: Fri Aug 08, 2025 6:16 am
Rebel wrote: Thu Aug 07, 2025 5:25 pm
pohl4711 wrote: Thu Aug 07, 2025 8:13 am By the way: The stats of your new Rebel Extreme beta are looking very promising!
A clear improvement in all single-stats and besides Patricia 5 the only engine with more than 50% sacs (of all won games) in your gamebase. Really impressive. Cant wait to test this one!
I will mail you the version, please keep it private. Meanwhile I will try to lower the similarity, I am not so sure if that is possible.
That would be great! But please mention, a test of a private version of an engine by me, stays private: I do not include private engines in my ratinglists. So, I would love to do a testrun, but the results are only for me and you. Of course, you can offer the results on your website - fine for me.
Okay thanks, I need some time to make up my mind.
Rebel
Posts: 7339
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: BoCC -- Beauty of Computer Chess

Post by Rebel »

I have been too harsh on myself (actually wrong) regarding similarity. For a moment I forgot that on SIMEX there are no similarity checks on new versions of a particular engine, because a newer version normally has a high overlap with the previous one. There are no oranges and reds in the html. Take the history of Stockfish, for example:

Positions  8238 Stock Stock Stock Stock Stock Stock Stock Stock
Stockfish-14.1  ----- 54.56 49.90 50.50 48.31 49.76 47.76 47.99
Stockfish-15.1  49.90 49.18 ----- 63.04 61.20 62.86 59.26 59.15
Stockfish-15    50.50 49.25 63.04 ----- 59.61 61.36 58.58 58.70
Stockfish-16.1  48.31 46.69 61.20 59.61 ----- 65.47 65.10 65.81
Stockfish-16    49.76 47.93 62.86 61.36 65.47 ----- 63.72 63.96
Stockfish-17.1  47.76 46.40 59.26 58.58 65.10 63.72 ----- 90.55
Stockfish-17    47.99 46.18 59.15 58.70 65.81 63.96 90.55 -----
The similarity between 17 and 17.1 is about 90%, and there is nothing wrong with that.

The history of Rebel:

Positions  8238    Rebel Rebel Rebel Rebel
Rebel-16.3         ----- 60.35 49.93 52.65
Rebel-EAS-2.0      60.35 ----- 45.89 49.48
Rebel-Extreme-Beta 49.93 45.89 ----- 48.63
Rebel-Extreme      52.65 49.48 48.63 -----
All are built from the same 2023 source code; the nets are all considerably different.
Rebel
Posts: 7339
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: BoCC -- Beauty of Computer Chess

Post by Rebel »

chrisw wrote: Thu Aug 07, 2025 5:34 pm
Rebel wrote: Thu Aug 07, 2025 5:25 pm
pohl4711 wrote: Thu Aug 07, 2025 8:13 am By the way: The stats of your new Rebel Extreme beta are looking very promising!
A clear improvement in all single-stats and besides Patricia 5 the only engine with more than 50% sacs (of all won games) in your gamebase. Really impressive. Cant wait to test this one!
I will mail you the version, please keep it private. Meanwhile I will try to lower the similarity, I am not so sure if that is possible.
presumably the similarity is because of unchanged search?
funny how these things are sticky in the sim testing
Surprise: I tried v20 instead of v21, and look what happened with the EAS.

Using my fixed pool of 10.000 games at ~35xx Elo.

                                 bad  avg.win 
Rank  EAS-Score  sacs   shorts  draws  moves  Engine/player 
-------------------------------------------------------------------
   1    558670  50.14%  73.48%  09.52%   52   Rebel-Extreme-v20-E122  
   2     91192  02.23%  43.57%  38.14%   65   RubiChess-20230410  
   3     72492  02.01%  38.02%  41.47%   73   SF12  
   4     68158  00.63%  37.03%  39.12%   68   Clover-5.0  
   5     66611  01.50%  36.64%  48.95%   68   Seer-2.6.0  
   6     47619  01.82%  29.18%  44.80%   71   Berserk-10  
   7     38292  02.09%  22.54%  36.36%   74   Koivisto-9.0  
   8     37584  02.16%  23.88%  37.68%   76   Revenge-3.0  
   9     34317  01.07%  21.35%  39.85%   74   Rebel-16.2  
  10     22869  02.20%  16.89%  39.07%   80   Ethereal-14.00-NNUE  
  11     21089  03.31%  15.95%  39.85%   80   Igel-3.5.0  
The similarity remained the same, speaking of funny.
Rebel
Posts: 7339
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: BoCC -- Beauty of Computer Chess

Post by Rebel »

pohl4711 wrote: Fri Aug 08, 2025 6:16 am
Rebel wrote: Thu Aug 07, 2025 5:25 pm
pohl4711 wrote: Thu Aug 07, 2025 8:13 am By the way: The stats of your new Rebel Extreme beta are looking very promising!
A clear improvement in all single-stats and besides Patricia 5 the only engine with more than 50% sacs (of all won games) in your gamebase. Really impressive. Cant wait to test this one!
I will mail you the version, please keep it private. Meanwhile I will try to lower the similarity, I am not so sure if that is possible.
That would be great! But please mention, a test of a private version of an engine by me, stays private: I do not include private engines in my ratinglists. So, I would love to do a testrun, but the results are only for me and you. Of course, you can offer the results on your website - fine for me.
As you can see above, I got some crazy EAS results with a newer version. I will redo my 3 x 15.000 games with this version, post the results, and then it will be decision time.

BTW, I noticed your shortwin=60 version has some new statistics.

********************************************************************************************* 
*** EAS single-statistics (6 categories, each with Top5 engines): 
********************************************************************************************* 
A: Most high-value sacrifices (3+ pawnunits)         : [1]:16.52% Rebel-Extreme-v20-E122  
                                                       [2]:00.19% Igel-3.5.0  
                                                       [3]:00.15% SF12  
                                                       [4]:00.15% Ethereal-14.00-NNUE  
                                                       [5]:00.00% Seer-2.6.0  
B: Most sacrifices overall                           : [1]:50.14% Rebel-Extreme-v20-E122  
                                                       [2]:03.31% Igel-3.5.0  
                                                       [3]:02.23% RubiChess-20230410  
                                                       [4]:02.20% Ethereal-14.00-NNUE  
                                                       [5]:02.16% Revenge-3.0  
C: Very short wins (45 moves or less)                : [1]:48.12% Rebel-Extreme-v20-E122  
                                                       [2]:10.05% SF12  
                                                       [3]:07.72% RubiChess-20230410  
                                                       [4]:07.10% Seer-2.6.0  
                                                       [5]:06.01% Clover-5.0  
D: Most short wins overall                           : [1]:73.48% Rebel-Extreme-v20-E122  
                                                       [2]:43.57% RubiChess-20230410  
                                                       [3]:38.02% SF12  
                                                       [4]:37.03% Clover-5.0  
                                                       [5]:36.64% Seer-2.6.0  
E: Average length of all won games                   : [1]:052 Rebel-Extreme-v20-E122  
                                                       [2]:065 RubiChess-20230410  
                                                       [3]:068 Clover-5.0  
                                                       [4]:068 Seer-2.6.0  
                                                       [5]:071 Berserk-10  
F: Smallest number of bad draws                      : [1]:09.52% Rebel-Extreme-v20-E122  
                                                       [2]:36.36% Koivisto-9.0  
                                                       [3]:37.68% Revenge-3.0  
                                                       [4]:38.14% RubiChess-20230410  
                                                       [5]:39.07% Ethereal-14.00-NNUE  
Pretty cool....

One more wish, if possible: a statistic showing how the EAS points are divided among sacs, shorties and bad draws.
pohl4711
Posts: 2759
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: BoCC -- Beauty of Computer Chess

Post by pohl4711 »

Rebel wrote: Fri Aug 08, 2025 8:32 pm As you can see in the above I got some crazy EAS result with a newer version. I will redo my 3 x 15.000 games with this version, post the results and then it will be decision time.

BTW, I noticed your shortwin=60 version has some new statistics.

Pretty cool....

One more wish, if possible. A statistic how the EAS points are divided, on sacs, shorties and bad draws.
I added this feature in V5.8. This is the reason the Gauntlet version is still at 5.7: when computing just one engine, these place 1-5 stats make no sense at all, of course.
I already use them on my website:
https://www.sp-cc.de/eas-ratinglist.htm

Stats on how the points are divided would be possible. But the output of the EAS-Tool is already very large, and I made these place 1-5 stats to give a better understanding of where an engine gets more or fewer points from.
Example: If you look at my website (https://www.sp-cc.de/eas-ratinglist.htm), you see these stats, too. Looking into them, you see for example that Torch plays more directly for the win compared to Stockfish (more short wins and a lower average length of all won games).

But if you really need these divided points, I will add this to the EAS-Tool, but it needs a 3rd ranking list in the output...
j.t.
Posts: 268
Joined: Wed Jun 16, 2021 2:08 am
Location: Berlin
Full name: Jost Triller

Re: BoCC -- Beauty of Computer Chess

Post by j.t. »

[pgn]
Score: 2.29 - CCRL - Chess System Tal 2.00 Elo 64-bit
White: Rebel 16.2 64-bit, Black: Chess System Tal 2.00 Elo 64-bit
[BlackElo "3541"]
[WhiteElo "3505"]
[Black "Chess System Tal 2.00 Elo 64-bit"]
[Date "2023.07.23"]
[Opening "Ponziani"]
[Result "0-1"]
[ECO "C44"]
[PlyCount "58"]
[White "Rebel 16.2 64-bit"]
[Event "CCRL 40/15"]
[Site "CCRL"]
[Round "888.2.541"]
[Variation "Fraser defence"]

1. e4 e5 2. Nf3 Nc6 3. c3 Nf6 4. d4 Nxe4 5. d5 Bc5 6. dxc6 Bxf2+ 7. Ke2 Bb6 8. Qd5 Nf2
9. Rg1 O-O 10. cxb7 Bxb7 11. Qxb7 Qf6 12. Qd5 c6 13. Qd2 e4 14. Nd4 e3 15. Qc2 Bxd4 16. cxd4 c5
17. d5 Ng4 18. Rh1 Rfe8 19. Nc3 Nf2 20. Rg1 c4 21. g3 Nd3 22. Rg2 Qd4 23. h3 Rab8 24. a4 Rb6
25. Nb5 Qxd5 26. Nc7 Qh5+ 27. g4 Qh4 28. Qxc4 Qe1+ 29. Kxd3 Qxf1+ 0-1
[/pgn]

Looks like a very aggressive game to me.