Testrun Reckless-0.10-dev, 27.000 games

Rebel
Posts: 7539
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Testrun Reckless-0.10-dev, 27.000 games

Post by Rebel »

Code: Select all

Results from file Reckless-0.90-final.pgn: |  Results from file Reckless-0.10-dev.pgn:
                                           |  
No. Name                 Score Games   %   |  No. Name                  Score  Games  %
------------------------------------------ |---------------------------------------------
  1 Reckless-0.90-final 6349.5 27000 60.6% |    1 Reckless-0.10.0-dev  6501.5 27000 61.1%
  2 Stockfish-18        1565.5  3000 52.2% |    2 Stockfish-18         1532.5  3000 51.1%
  3 PlentyChess-7.0.22  1362.5  3000 45.4% |    3 PlentyChess-7.0.22   1366.0  3000 45.5%
  4 Alexandria-9.0      1252.0  3000 41.7% |    4 Obsidian-16          1223.0  3000 40.8%
  5 Obsidian-16         1235.0  3000 41.2% |    5 Alexandria-9.0       1210.0  3000 40.3%
  6 Stockfish-15        1195.5  3000 39.9% |    6 Stockfish-15         1170.5  3000 39.0%
  7 Viridithas-19.0.1   1063.5  3000 35.5% |    7 Viridithas-19.0.1    1068.0  3000 35.6%
  8 Caissa-1.24         1059.5  3000 35.3% |    8 Caissa-1.24          1042.5  3000 34.8%
  9 Clover-9.1          1005.5  3000 33.5% |    9 Clover-9.1            978.5  3000 32.6%
 10 Berserk-13           911.5  3000 30.4% |   10 Berserk-13            907.5  3000 30.2%
                                           |   
Total Games:   27000                       |   Total Games:   27000
White Wins:     7922 (29.3%)               |   White Wins:     7984 (29.6%)
Black Wins:     1713 (6.3%)                |   Black Wins:     1643 (6.1%)
Draws:         17365 (64.3%)               |   Draws:         17373 (64.3%)
Left 0.90 vs right 0.10.0
27.000 games
TC 40m/10s

Overall gain: 3-4 Elo
Closing in on SF by 7-8 Elo
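
As a rough cross-check of these two figures, the score percentages can be converted to Elo differences with the standard logistic formula. This is only a sketch: the function name is made up for illustration, the Reckless totals are taken with the leading 1 (16349.5 and 16501.5, which is what matches the printed 60.6% and 61.1%), and a rating tool that fits all games jointly may give slightly different numbers.

```python
import math

def elo_from_score(score: float, games: int) -> float:
    """Elo difference implied by a score fraction under the logistic model."""
    p = score / games
    return -400.0 * math.log10(1.0 / p - 1.0)

# Overall results of the two runs, 27000 games each.
old_total = elo_from_score(16349.5, 27000)
new_total = elo_from_score(16501.5, 27000)
overall_gain = new_total - old_total
print(f"overall gain: {overall_gain:+.1f} Elo")

# Reckless's score against Stockfish-18 is 3000 minus Stockfish's score.
old_vs_sf = elo_from_score(3000 - 1565.5, 3000)
new_vs_sf = elo_from_score(3000 - 1532.5, 3000)
sf_gain = new_vs_sf - old_vs_sf
print(f"gain vs SF18: {sf_gain:+.1f} Elo")
```

With these inputs the script reports roughly +4 Elo overall and between +7 and +8 Elo against Stockfish-18, in line with the figures quoted above.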

Top-10 rating list now:

Code: Select all

   # PLAYER                   :  RATING  ERROR   POINTS  PLAYED     W      D      L  D(%)
   1 Leela-0.32.1-BT4         :  3806.6    3.6   3154.5    6000   996   4317    687    72
   2 Stockfish-18             :  3801.9    1.3  21833.5   36000 10177  23313   2510    65
   3 Reckless-0.10.0-dev      :  3788.8    2.0  16501.5   27000  7815  17373   1812    64
   4 PlentyChess-7.0.37       :  3758.9    1.7  26308.5   44874 11691  29235   3948    65
   5 Alexandria-9.0           :  3720.8    2.0  16424.0   31865  5679  21490   4696    67
   6 Obsidian-16              :  3718.1    1.6  40501.5   80652 13384  54235  13033    67
   7 Viridithas-19.0.1        :  3676.3    1.6  17338.5   38874  4851  24975   9048    64
   8 Caissa-1.24              :  3666.8    1.9  27149.5   62652  6835  40629  15188    65
   9 Clover-9.1               :  3651.7    1.7  33272.0   80649  7181  52182  21286    65
  10 Berserk-13               :  3637.6    1.4  31782.5   80652  7293  48979  24380    61
90% of coding is debugging, the other 10% is writing bugs.
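
The columns of the rating list are redundant, so the win count in any row can be recomputed from the others: W = PLAYED - D - L, and POINTS should equal W + D/2. A minimal sketch of that check, using a few rows copied from the list above (the helper name `derived_wins` is made up for illustration):

```python
# (name, points, played, draws, losses) copied from the rating list above
rows = [
    ("Leela-0.32.1-BT4", 3154.5, 6000, 4317, 687),
    ("Stockfish-18", 21833.5, 36000, 23313, 2510),
    ("Reckless-0.10.0-dev", 16501.5, 27000, 17373, 1812),
    ("PlentyChess-7.0.37", 26308.5, 44874, 29235, 3948),
    ("Obsidian-16", 40501.5, 80652, 54235, 13033),
]

def derived_wins(played: int, draws: int, losses: int) -> int:
    """Wins are whatever is left after draws and losses."""
    return played - draws - losses

wins = {}
for name, points, played, draws, losses in rows:
    w = derived_wins(played, draws, losses)
    # A win counts 1 point, a draw half a point.
    assert w + draws / 2 == points, name
    wins[name] = w
    print(f"{name:<22} W={w}")
```

For engines with more than 9999 wins this gives five-digit values, which is worth keeping in mind if the W column looks narrower than expected.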
Jouni
Posts: 3874
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: Testrun Reckless-0.10-dev, 27.000 games

Post by Jouni »

Overall gain 3-4 elo. But Stockfish has also gained 3-4 after version 18.
Jouni
Uri Blass
Posts: 11201
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Testrun Reckless-0.10-dev, 27.000 games

Post by Uri Blass »

Rebel wrote: Thu Apr 02, 2026 11:36 am

No. Name                 Score Games   %   |  No. Name                  Score  Games  %
------------------------------------------ |---------------------------------------------
  1 Reckless-0.90-final 6349.5 27000 60.6% |    1 Reckless-0.10.0-dev  6501.5 27000 61.1%
6501.5/27000 is illogical: it is about 24.1%, not 61.1%.
It seems that the score of Reckless-0.10.0-dev is not 6501.5 but 16501.5 (and likewise 16349.5 rather than 6349.5 for Reckless-0.90-final).
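
The same point can be checked in two independent ways: by recomputing the percentage, and by subtracting the opponents' scores from the number of games. A small sketch, with the opponent scores copied from the right-hand table of the first post (the helper name `percent` is made up for illustration):

```python
GAMES = 27000

# Opponents' scores against Reckless-0.10.0-dev (right-hand table).
opponent_scores = [1532.5, 1366.0, 1223.0, 1210.0, 1170.5,
                   1068.0, 1042.5, 978.5, 907.5]

def percent(score: float, games: int) -> float:
    """Score as a percentage of games, rounded to one decimal."""
    return round(100.0 * score / games, 1)

printed = percent(6501.5, GAMES)      # score as printed in the table
corrected = percent(16501.5, GAMES)   # score with the leading 1 restored

# Every point the opponents did not score went to Reckless.
implied = GAMES - sum(opponent_scores)

print(printed, corrected, implied)
```

Only the value with the leading 1 reproduces the printed 61.1%, and it also equals the score implied by the opponents' results.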
Rebel
Posts: 7539
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Testrun Reckless-0.10-dev, 27.000 games

Post by Rebel »

Uri Blass wrote: Fri Apr 03, 2026 10:09 am
Rebel wrote: Thu Apr 02, 2026 11:36 am

No. Name                 Score Games   %   |  No. Name                  Score  Games  %
------------------------------------------ |---------------------------------------------
  1 Reckless-0.90-final 16349.5 27000 60.6%|    1 Reckless-0.10.0-dev 16501.5 27000 61.1%
6501.5/27000 is illogical: it is about 24.1%, not 61.1%.
It seems that the score of Reckless-0.10.0-dev is not 6501.5 but 16501.5 (and likewise 16349.5 rather than 6349.5 for Reckless-0.90-final).
You are right, the leading "1" fell off while I was editing the two tables to fit side by side in the code box. Fixed it.
Rebel
Posts: 7539
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Testrun Reckless-0.10-dev, 27.000 games

Post by Rebel »

Jouni wrote: Fri Apr 03, 2026 8:48 am Overall gain 3-4 elo. But Stockfish has also gained 3-4 after version 18.
Look again, SF18 lost 3-4 elo.

Code: Select all

# PLAYER                   :  RATING  ERROR   POINTS  PLAYED      W      D      L  D(%)
2 Stockfish-18             :  3805.2    2.0  20301.0   33000   9727  21148   2125    64   
See the page, which has not been updated yet: https://rebel7775.wixsite.com/rebel/stc-rating-list
Jouni
Posts: 3874
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: Testrun Reckless-0.10-dev, 27.000 games

Post by Jouni »

I hope you add the exact test conditions to the page: is it a 1-core test, and which book is used?
Jouni
cc2150dx
Posts: 461
Joined: Sat Nov 30, 2013 9:51 am
Full name: Jason Coombs

Re: Testrun Reckless-0.10-dev, 27.000 games

Post by cc2150dx »

Hey Ed, for Leela-0.32.1-BT4, what settings are you using, and which BT4 net?
Rebel
Posts: 7539
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Testrun Reckless-0.10-dev, 27.000 games

Post by Rebel »

cc2150dx wrote: Fri Apr 03, 2026 5:36 pm Hey Ed, for Leela-0.32.1-BT4, what settings are you using, and which BT4 net?
1. Default settings
2. BT4-it332 net
Rebel
Posts: 7539
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Testrun Reckless-0.10-dev, 27.000 games

Post by Rebel »

Jouni wrote: Fri Apr 03, 2026 12:41 pm I hope you add the exact test conditions to the page: is it a 1-core test, and which book is used?
Good points

1. Single thread
2. 1500 openings

Updated the page accordingly.

https://rebel7775.wixsite.com/rebel/stc-rating-list
cc2150dx
Posts: 461
Joined: Sat Nov 30, 2013 9:51 am
Full name: Jason Coombs

Re: Testrun Reckless-0.10-dev, 27.000 games

Post by cc2150dx »

Rebel wrote: Sun Apr 05, 2026 7:37 am
cc2150dx wrote: Fri Apr 03, 2026 5:36 pm Hey Ed, for Leela-0.32.1-BT4, what settings are you using, and which BT4 net?
1. Default settings
2. BT4-it332 net
Thanks :)