GRL - test runs

Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

GRL - test runs - result Seer 2.1.0

Post by Rebel »

Gambit Rating List
Running      : Gauntlet Seer 2.1.0
Time Control : Time control 40/120
Games        : 1800

Results from file gauntlet-seer.pgn:

No. Name           Win Draw Loss Unf.  Score Games       %
----------------------------------------------------------
  1 Seer 2.1.0    +573 =558 -669   *0  852.0  1800   47.3%
  2 Lzero v27     +128  =61  -11   *0  158.5   200   79.2%
  3 Booot 6.5      +90  =71  -39   *0  125.5   200   62.8%
  4 rofChade 2.3   +92  =58  -50   *0  121.0   200   60.5%
  5 Berserk 4.3.0  +75  =51  -74   *0  100.5   200   50.2%
  6 Weiss 1.4      +64  =72  -64   *0  100.0   200   50.0%
  7 Stash 31.0     +60  =61  -79   *0   90.5   200   45.2%
  8 Beef 0.3.6     +61  =53  -86   *0   87.5   200   43.8%
  9 Wasp 4.50      +55  =65  -80   *0   87.5   200   43.8%
 10 Halogen 10     +44  =66  -90   *0   77.0   200   38.5%

Total Games:    1800
White Wins:      629 (34.9%)
Black Wins:      613 (34.1%)
Draws:           558 (31.0%)
Unfinished:        0 (0.0%)

Estimated elo gain for Seer_2.1.0
Elo pool : 3184
Seer 2.0.1 : 3132.0
Seer_2.1.0 : 3164.1
Difference : 32.1
Seer 2.1.0 +32

Probably not the result you hoped for. Part of the blame lies with Lzero v27 (the Lc0 CPU version), which scored an incredible 79% even though its rating is somewhat lower than that of Seer 2.0.1. A perfect example of an "Angst-Gegner".
90% of coding is debugging, the other 10% is writing bugs.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: GRL - test runs - Gauntlet Clover 2.4

Post by Rebel »

Gauntlet Clover 2.4

http://rebel13.nl/a/grl.htm
connor_mcmonigle
Posts: 530
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: GRL - test runs - result Seer 2.1.0

Post by connor_mcmonigle »

Rebel wrote: Wed Jul 14, 2021 8:52 pm

Probably not the result you hoped for. Part of the blame lies with Lzero v27 (the Lc0 CPU version), which scored an incredible 79% even though its rating is somewhat lower than that of Seer 2.0.1. A perfect example of an "Angst-Gegner".
Thanks for testing. This is indeed a poor result, though it's more in line with my expectation if we exclude Lc0. The results against Halogen and Wasp, in particular, look promising.
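To put a number on "if we exclude Lc0", here is a minimal Python sketch that recomputes the gauntlet totals from the opponent rows posted above. It assumes each opponent row is written from the opponent's point of view (so an opponent win is a Seer loss), and the names `ROWS` and `gauntlet_totals` are made up for this illustration.

```python
# Opponent rows as posted, from the opponent's point of view:
# name -> (opponent wins, draws, opponent losses)
ROWS = {
    "Lzero v27": (128, 61, 11),
    "Booot 6.5": (90, 71, 39),
    "rofChade 2.3": (92, 58, 50),
    "Berserk 4.3.0": (75, 51, 74),
    "Weiss 1.4": (64, 72, 64),
    "Stash 31.0": (60, 61, 79),
    "Beef 0.3.6": (61, 53, 86),
    "Wasp 4.50": (55, 65, 80),
    "Halogen 10": (44, 66, 90),
}


def gauntlet_totals(rows, exclude=()):
    """Return (wins, draws, losses, score, percent) for the gauntlet engine."""
    wins = draws = losses = 0
    for name, (opp_wins, opp_draws, opp_losses) in rows.items():
        if name in exclude:
            continue
        wins += opp_losses    # an opponent loss is a win for the gauntlet engine
        draws += opp_draws
        losses += opp_wins    # an opponent win is a loss for the gauntlet engine
    games = wins + draws + losses
    score = wins + draws / 2  # a draw counts as half a point
    return wins, draws, losses, score, 100 * score / games


print(gauntlet_totals(ROWS))
print(gauntlet_totals(ROWS, exclude={"Lzero v27"}))
```

With all nine opponents this gives back the posted +573 =558 -669 (852.0 points); dropping the Lzero v27 row leaves +562 =497 -541, or 810.5 points from 1600 games, which is a score just above 50%.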
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: GRL - test runs

Post by Rebel »

Connor, I blundered, I was running the GPU version!

The new result -

Gambit Rating List
Running      : Gauntlet Seer 2.1.0
Time Control : Time control 40/120
Games        : 1600

Results from file gauntlet-seer.pgn:

No. Name           Win Draw Loss Unf.  Score Games       %
----------------------------------------------------------
  1 Seer 2.1.0    +562 =497 -541   *0  810.5  1600   50.7%
  2 Booot 6.5      +90  =71  -39   *0  125.5   200   62.8%
  3 rofChade 2.3   +92  =58  -50   *0  121.0   200   60.5%
  4 Berserk 4.3.0  +75  =51  -74   *0  100.5   200   50.2%
  5 Weiss 1.4      +64  =72  -64   *0  100.0   200   50.0%
  6 Stash 31.0     +60  =61  -79   *0   90.5   200   45.2%
  7 Beef 0.3.6     +61  =53  -86   *0   87.5   200   43.8%
  8 Wasp 4.50      +55  =65  -80   *0   87.5   200   43.8%
  9 Halogen 10     +44  =66  -90   *0   77.0   200   38.5%

Total Games:    1600
White Wins:      562 (35.1%)
Black Wins:      541 (33.8%)
Draws:           497 (31.1%)
Unfinished:        0 (0.0%)

Estimated elo gain for Seer_2.1.0
Elo pool : 3192
Seer 2.0.1 : 3132.0
Seer_2.1.0 : 3196.0
Difference : 64.0
+64
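As a rough cross-check of the "Estimated elo gain" figures, the sketch below applies the standard logistic Elo formula to the overall score against the pool average. This is an assumption: the thread does not say how the GRL tool computes its estimate, and the function names are hypothetical. The pool values (3184, 3192) and the Seer 2.0.1 baseline (3132.0) are taken from the posts.

```python
import math


def elo_diff(score_fraction):
    """Elo difference implied by a score fraction (standard logistic model)."""
    return 400 * math.log10(score_fraction / (1 - score_fraction))


def performance(pool_elo, score, games):
    """Performance rating against a pool with the given average Elo."""
    return pool_elo + elo_diff(score / games)


BASELINE = 3132.0  # Seer 2.0.1, as posted

with_lc0 = performance(3184, 852.0, 1800)      # first run, Lzero v27 included
without_lc0 = performance(3192, 810.5, 1600)   # corrected run

print(f"with Lzero:    {with_lc0:.1f} ({with_lc0 - BASELINE:+.1f})")
print(f"without Lzero: {without_lc0:.1f} ({without_lc0 - BASELINE:+.1f})")
```

Under this assumption both runs land within about two Elo of the posted 3164.1 and 3196.0, so the jump from roughly +32 to +64 comes almost entirely from removing the 200 games against Lzero v27, not from any change in the other results.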
connor_mcmonigle
Posts: 530
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: GRL - test runs

Post by connor_mcmonigle »

Rebel wrote: Wed Jul 14, 2021 9:44 pm Connor, I blundered, I was running the GPU version!
Haha, that's much better! You scared me Ed
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: GRL - test runs

Post by Rebel »

Gambit Rating List
Running      : Gauntlet Clover 2.4
Time Control : Time control 40/120
Games        : 1600

Results from file gauntlet-clover-240.pgn:

No. Name           Win Draw Loss Unf.  Score Games       %
----------------------------------------------------------
  1 Clover 2.4    +511 =657 -432   *0  839.5  1600   52.5%
  2 rofChade 2.3   +73 =100  -27   *0  123.0   200   61.5%
  3 Booot 6.5      +86  =73  -41   *0  122.5   200   61.2%
  4 Berserk 4.3.0  +69  =85  -46   *0  111.5   200   55.8%
  5 Halogen 10     +51  =75  -74   *0   88.5   200   44.2%
  6 Weiss 1.4      +48  =69  -83   *0   82.5   200   41.2%
  7 Wasp 4.50      +38  =83  -79   *0   79.5   200   39.8%
  8 Beef 0.3.6     +41  =74  -85   *0   78.0   200   39.0%
  9 Stash 31.0     +26  =98  -76   *0   75.0   200   37.5%

Total Games:    1600
White Wins:      476 (29.8%)
Black Wins:      467 (29.2%)
Draws:           657 (41.1%)
Unfinished:        0 (0.0%)

Estimated elo gain for Clover_2.4
Elo pool : 3192
Clover 2.3.1 : 3132.0
Clover_2.4 : 3207.6
Difference : 75.6
Clover 2.4 : + 75
lucametehau
Posts: 100
Joined: Thu Apr 22, 2021 3:56 pm
Location: Bucharest, Romania
Full name: Metehau Luca

Re: GRL - test runs

Post by lucametehau »

Rebel wrote: Thu Jul 15, 2021 8:18 am

Very nice, better than expected! :D
Thanks for testing!
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: GRL - test runs

Post by Rebel »

Gauntlet Berserk 4.5.0

1400 games.

Elo pool 3288.

http://rebel13.nl/a/grl.htm
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: GRL - test runs

Post by Rebel »

Berserk 4.5.0 +59 Elo

    Gambit Rating List
    Running      : gauntlet Berserk 4.5.0
    Time Control : Time control 40/120
    Games        : 1400

    Results from file gauntlet-berserk-450.pgn:

    No. Name            Win Draw Loss Unf.  Score Games       %
    -----------------------------------------------------------
      1 Berserk 4.5.0  +408 =596 -396   *0  706.0  1400   50.4%
      2 Komodo 11       +84  =85  -31   *0  126.5   200   63.2%
      3 Ethereal 12.00  +54  =97  -49   *0  102.5   200   51.2%
      4 Komodo 10       +61  =78  -61   *0  100.0   200   50.0%
      5 Pedone 3.0      +61  =77  -62   *0   99.5   200   49.8%
      6 Nemorino 6.00   +64  =70  -66   *0   99.0   200   49.5%
      7 rofChade 2.3    +38  =91  -71   *0   83.5   200   41.8%
      8 Booot 6.5       +34  =98  -68   *0   83.0   200   41.5%

    Total Games:    1400
    White Wins:      383 (27.4%)
    Black Wins:      421 (30.1%)
    Draws:           596 (42.6%)
    Unfinished:        0 (0.0%)

    Estimated ratings for this elo 3288 pool

       # PLAYER            :  RATING  POINTS  PLAYED   (%)
       1 Komodo 11         :  3385.6   126.5     200    63
       2 Ethereal 12.00    :  3299.2   102.5     200    51
       3 Berserk 4.5.0     :  3290.5   706.0    1400    50
       4 Komodo 10         :  3290.5   100.0     200    50
       5 Pedone 3.0        :  3288.7    99.5     200    50
       6 Nemorino 6.00     :  3287.0    99.0     200    50
       7 rofChade 2.3      :  3232.1    83.5     200    42
       8 Booot 6.5         :  3230.3    83.0     200    42
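For a feel of how much statistical noise a 1400-game gauntlet carries, here is a hedged Python sketch of a 95% interval on Berserk 4.5.0's performance relative to the pool average. It assumes the games are independent and uses the standard logistic Elo formula; neither assumption comes from the thread, the GRL tool may work differently, and the function names are hypothetical.

```python
import math


def elo_from_score(p):
    """Elo difference implied by a score fraction (standard logistic model)."""
    return 400 * math.log10(p / (1 - p))


def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, mid, high) Elo relative to the pool average.

    Uses the per-game variance of the win/draw/loss outcome (1, 0.5, 0)
    and a normal approximation for the mean score.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    var = (wins * (1 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * mean ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(mean - z * se),
            elo_from_score(mean),
            elo_from_score(mean + z * se))


# Berserk 4.5.0 totals as posted: +408 =596 -396
low, mid, high = elo_interval(408, 596, 396)
print(f"{mid:+.1f} Elo vs pool, 95% interval [{low:+.1f}, {high:+.1f}]")
```

Under these assumptions the interval is roughly 14 Elo either side of the estimate, so differences of a few Elo between neighbouring engines in the table should not be read as significant.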

Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: GRL - test runs

Post by Rebel »

Gauntlet Nalwald 1.1.1

1200 games.

http://rebel13.nl/a/grl.htm