Sergio Vieri second net is out

stavros
Posts: 160
Joined: Tue Dec 02, 2014 12:29 am

Re: Sergio Vieri second net is out

Post by stavros » Tue Jul 28, 2020 8:18 pm

Laskos, take a look at 28/7/20-2138

comparison https://www.comp.nus.edu.sg/~sergio-v/nnue/

cdani
Posts: 2175
Joined: Sat Jan 18, 2014 9:24 am
Location: Andorra

Re: Sergio Vieri second net is out

Post by cdani » Tue Jul 28, 2020 8:30 pm

Laskos wrote:
Tue Jul 28, 2020 5:01 pm
cdani wrote:
Tue Jul 28, 2020 4:22 pm
Laskos wrote:
Tue Jul 28, 2020 6:35 am
Now the latest one 20200728-0633.bin seems the strongest, almost outside error margins again in 1000 games. Now there are substantial Elo gains versus old 2141, 10-12 Elo points or so.
First I briefly tested the two new nets:

Code: Select all

20+0.1
   # PLAYER                 : RATING  ERROR   POINTS  PLAYED    (%)
   1 stnnue20200728-1442    : 2857.1    2.8   1916.5    3810   50.3%
   2 stnnue20200728-0633    : 2854.9    2.8   1893.5    3810   49.7%
And then I took the possible best against previous best:

Code: Select all

20+0.1
   # PLAYER                 : RATING  ERROR   POINTS  PLAYED    (%)
   1 stnnue20200728-1442    : 2860.3    3.2   1552.0    3030   51.2%
   2 stnnue20200728-0207    : 2851.7    3.2   1478.0    3030   48.8%
So we have a new best, 20200728-1442, and by a big margin!!
Wow, hard to keep pace with the improvements. Suddenly the rating increased by some 20 Elo points since the old best 2141 (which was already very strong).
And I think that the latest net
20200729-0335.bin
is 25-30 elo stronger than 20200728-1442!!!!!!!! Testing now.

cdani
Posts: 2175
Joined: Sat Jan 18, 2014 9:24 am
Location: Andorra

Re: Sergio Vieri second net is out

Post by cdani » Tue Jul 28, 2020 10:06 pm

cdani wrote:
Tue Jul 28, 2020 8:30 pm
And I think that the latest net
20200729-0335.bin
is 25-30 elo stronger than 20200728-1442!!!!!!!! Testing now.
I spoke too soon :oops: I'm testing 20200729-0109.bin and 20200729-0335.bin and they don't seem to be better than 20200728-1442...

MikeB
Posts: 4398
Joined: Thu Mar 09, 2006 5:34 am
Location: Pen Argyl, Pennsylvania

Re: Sergio Vieri second net is out

Post by MikeB » Wed Jul 29, 2020 1:09 am

cdani wrote:
Tue Jul 28, 2020 10:06 pm
cdani wrote:
Tue Jul 28, 2020 8:30 pm
And I think that the latest net
20200729-0335.bin
is 25-30 elo stronger than 20200728-1442!!!!!!!! Testing now.
I spoke too soon :oops: I'm testing 20200729-0109.bin and 20200729-0335.bin and they don't seem to be better than 20200728-1442...
I think you will find a new leader in 2138, replacing 0207. ;)

Also below is the new standard output when running cutechess-cli from a shell script, sans the .json file. I will post the actual script on GitHub soon. The script selected 750 positions at random from a set of 30,000. What is unique compared with games normally run on cutechess is that each selected position was a round, and all games in a round used the same starting position. So engine A plays engine B, white and black, then engine A plays engine C, white and black, and then engine B plays engine C, white and black, all using the same position. This is not the default behavior for cutechess-cli; in fact, it is not even an option in the cutechess GUI.

Code: Select all

#########################################################################################################
###                                              Summary                                              ###
#########################################################################################################

PGN File: c:/cluster.mfb/pgn/07281620.pgn
Time Control: base+inc: 20+0.50
Games: 4500
Threads: 1
Hash: 128

Current date : time (EDST)
Date: 07/28/20 : 18:30:58

Projected-> Time: 2h:9m:56s
Run      -> Time: 2h:10m:13s

4500 game(s) loaded
Rank Name     Rating   Δ     +    -     #     Σ    Σ%     W    L    D   W%    =%   OppR
---------------------------------------------------------------------------------------------------------

   1 Ho-2138   3503   0.0    5    5  3000 1521.5  50.7  322  279 2399  10.7  80.0  3498
   2 Ho-0109   3499   3.9    5    5  3000 1495.5  49.9  293  302 2405   9.8  80.2  3500
   3 Ho-0207   3497   2.0    6    6  3000 1483.0  49.4  297  331 2372   9.9  79.1  3501
---------------------------------------------------------------------------------------------------------

  Δ = delta from the next higher rated opponent
  # = number of games played
  Σ = total score, 1 point for win, 1/2 point for draw

LOS:
         Ho Ho Ho
Ho-2138     88 96
Ho-0109  11    73
Ho-0207   3 26

#########################################################################################################
###                                                End                                                ###
#########################################################################################################
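The pairing scheme described above (one opening per round, every pairing playing it with both colors) can be sketched roughly as follows. This is only an illustration in Python, not the actual shell script; `build_schedule`, the placeholder position strings, and the seed are made up for the example.

```python
import itertools
import random


def build_schedule(engines, positions, n_rounds, seed=None):
    """Return a list of (round_no, position, white, black) games.

    Each round uses one randomly chosen opening position, and every
    pair of engines plays that position twice with colors reversed.
    """
    rng = random.Random(seed)
    chosen = rng.sample(positions, n_rounds)  # sample without repeats
    games = []
    for round_no, pos in enumerate(chosen, start=1):
        for a, b in itertools.combinations(engines, 2):
            games.append((round_no, pos, a, b))  # a plays white
            games.append((round_no, pos, b, a))  # same position, colors reversed
    return games


engines = ["Ho-2138", "Ho-0109", "Ho-0207"]
# Placeholder strings standing in for real FEN/EPD opening positions.
positions = [f"pos{i}" for i in range(30000)]
schedule = build_schedule(engines, positions, n_rounds=750, seed=1)
print(len(schedule))  # 750 rounds * 3 pairings * 2 colors
```

With three engines this gives 750 x 3 x 2 = 4500 games in total and 3000 games per engine, which matches the "Games: 4500" line and the # column in the summary.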

Ovyron
Posts: 4393
Joined: Tue Jul 03, 2007 2:30 am

Re: Sergio Vieri second net is out

Post by Ovyron » Wed Jul 29, 2020 1:53 am

It's very funny that these nets are being named after the time they were released, which is very arbitrary.

Just imagine if Sergio had adopted a naming scheme akin to the one used by the eMule software; then the conversations would look like this:

"The Radical Squirrel net has finally been defeated by the just-released Ecstatic Panda net!"
"Yes, Ecstatic Panda is performing even better than Badass Seahorse! By 20 Elo or so!"
"I'm currently testing the Frantic Panther net, and it looks great, performing better than the previous ones!"

:lol:

MikeB
Posts: 4398
Joined: Thu Mar 09, 2006 5:34 am
Location: Pen Argyl, Pennsylvania

Re: Sergio Vieri second net is out

Post by MikeB » Wed Jul 29, 2020 2:52 am

Exact copies of the cutechess-cli scripts used to produce the output above were just uploaded here:

https://github.com/MichaelB7/cutechess/ ... ojects/cli

Requires a modified version of bayeselo, the source of which can be found here:
https://github.com/MichaelB7/bayeselo

cdani
Posts: 2175
Joined: Sat Jan 18, 2014 9:24 am
Location: Andorra

Re: Sergio Vieri second net is out

Post by cdani » Wed Jul 29, 2020 4:51 am

cdani wrote:
Tue Jul 28, 2020 10:06 pm
cdani wrote:
Tue Jul 28, 2020 8:30 pm
And I think that the latest net
20200729-0335.bin
is 25-30 elo stronger than 20200728-1442!!!!!!!! Testing now.
I spoke too soon :oops: I'm testing 20200729-0109.bin and 20200729-0335.bin and they don't seem to be better than 20200728-1442...
In the end these two were better, but not by much:

Code: Select all

20+0.1
   # PLAYER                 : RATING  ERROR   POINTS  PLAYED    (%)
   1 stnnue20200729-0109    : 2857.5    2.5   2101.5    4168   50.4%
   2 stnnue20200728-1442    : 2854.5    2.5   2066.5    4168   49.6%

20+0.1
   # PLAYER                 : RATING  ERROR   POINTS  PLAYED    (%)
   1 stnnue20200729-0335    : 2858.4    2.8   1868.5    3687   50.7%
   2 stnnue20200728-1442    : 2853.6    2.8   1818.5    3687   49.3%

Laskos
Posts: 10806
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Sergio Vieri second net is out

Post by Laskos » Wed Jul 29, 2020 4:56 am

cdani wrote:
Wed Jul 29, 2020 4:51 am
cdani wrote:
Tue Jul 28, 2020 10:06 pm
cdani wrote:
Tue Jul 28, 2020 8:30 pm
And I think that the latest net
20200729-0335.bin
is 25-30 elo stronger than 20200728-1442!!!!!!!! Testing now.
I spoke too soon :oops: I'm testing 20200729-0109.bin and 20200729-0335.bin and they don't seem to be better than 20200728-1442...
In the end these two were better, but not by much:

Code: Select all

20+0.1
   # PLAYER                 : RATING  ERROR   POINTS  PLAYED    (%)
   1 stnnue20200729-0109    : 2857.5    2.5   2101.5    4168   50.4%
   2 stnnue20200728-1442    : 2854.5    2.5   2066.5    4168   49.6%

20+0.1
   # PLAYER                 : RATING  ERROR   POINTS  PLAYED    (%)
   1 stnnue20200729-0335    : 2858.4    2.8   1868.5    3687   50.7%
   2 stnnue20200728-1442    : 2853.6    2.8   1818.5    3687   49.3%
Can confirm 0109 as the best for me so far, but with only 1000 games. I haven't checked all of them; it seems the latest, 0912, is not stronger.

MikeB
Posts: 4398
Joined: Thu Mar 09, 2006 5:34 am
Location: Pen Argyl, Pennsylvania

Re: Sergio Vieri second net is out

Post by MikeB » Wed Jul 29, 2020 6:24 am

Laskos wrote:
Wed Jul 29, 2020 4:56 am
cdani wrote:
Wed Jul 29, 2020 4:51 am
cdani wrote:
Tue Jul 28, 2020 10:06 pm
cdani wrote:
Tue Jul 28, 2020 8:30 pm
And I think that the latest net
20200729-0335.bin
is 25-30 elo stronger than 20200728-1442!!!!!!!! Testing now.
I spoke too soon :oops: I'm testing 20200729-0109.bin and 20200729-0335.bin and they don't seem to be better than 20200728-1442...
In the end these two were better, but not by much:

Code: Select all

20+0.1
   # PLAYER                 : RATING  ERROR   POINTS  PLAYED    (%)
   1 stnnue20200729-0109    : 2857.5    2.5   2101.5    4168   50.4%
   2 stnnue20200728-1442    : 2854.5    2.5   2066.5    4168   49.6%

20+0.1
   # PLAYER                 : RATING  ERROR   POINTS  PLAYED    (%)
   1 stnnue20200729-0335    : 2858.4    2.8   1868.5    3687   50.7%
   2 stnnue20200728-1442    : 2853.6    2.8   1818.5    3687   49.3%
Can confirm 0109 as the best for me so far, but with only 1000 games. I haven't checked all of them; it seems the latest, 0912, is not stronger.
20,000 game set kicked off...

Code: Select all

PGN File: c:/cluster.mfb/pgn/07290204.pgn
Time Control: Time Control-> base+inc: 20+0.50
Games: 20000
Threads: 1
Hash: 128

Current date : time (EDST)
Date: 07/29/20 : 02:13:51

Projected-> Time: 9h:36m:20s
     Run -> Time: 0h:8m:53s

Rank Name     Rating   Δ     +    -     #     Σ    Σ%     W    L    D   W%    =%   OppR 
---------------------------------------------------------------------------------------------------------

   1 Ho-2138   3514   0.0   46   46    96   50.5  52.6   25   20   51  26.0  53.1  3496 
   2 Ho-0109   3511   3.1   46   46    98   51.0  52.0   26   22   50  26.5  51.0  3497 
   3 Ho-0912   3506   5.8   46   46    98   50.0  51.0   24   22   52  24.5  53.1  3499 
   4 Ho-0335   3495  10.7   46   46    98   48.0  49.0   22   24   52  22.4  53.1  3501 
   5 Ho-0629   3474  21.4   47   47    98   44.5  45.4   22   31   45  22.4  45.9  3507 
---------------------------------------------------------------------------------------------------------

  Δ = delta from the next higher rated opponent
  # = number of games played
  Σ = total score, 1 point for win, 1/2 point for draw

LOS:
         Ho Ho Ho Ho Ho
Ho-2138     54 61 74 91
Ho-0109  45    57 71 89
Ho-0912  38 42    64 85
Ho-0335  25 28 35    76
Ho-0629   8 10 14 23   

Results are updated about every 3 minutes via watcher.sh
Results: https://www.dropbox.com/s/r5c6nwn5rgtmlnj/elo.txt?dl=0

PGN File: https://www.dropbox.com/s/bemhmhgrqrtrw ... 4.pgn?dl=0
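
As a rough check on the + and - columns, here is a minimal sketch of an error margin computed from a single engine's W/L/D counts with a normal approximation. This is a simplification and not how bayeselo arrives at its numbers; `elo_with_margin` and the z-value of 1.96 (about 95%) are choices made for the example.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference implied by a score fraction in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, minus, plus) from one engine's W/L/D totals.

    Assumes independent games and a normal approximation of the mean
    score; score +/- z*se must stay strictly inside (0, 1).
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    elo = elo_from_score(score)
    minus = elo - elo_from_score(score - z * se)
    plus = elo_from_score(score + z * se) - elo
    return elo, minus, plus


# Ho-2138 row after 96 games: 25 wins, 20 losses, 51 draws.
elo, minus, plus = elo_with_margin(25, 20, 51)
print(f"{elo:+.1f} Elo, -{minus:.0f} / +{plus:.0f}")
```

Fed the Ho-2138 row after 96 games, this gives a margin a little under 50 Elo either way, in the same ballpark as the 46 in the table, so nothing can be concluded this early in the run. With the 3000-game row from the earlier run (322 wins, 279 losses, 2399 draws) the margin shrinks to between 5 and 6 Elo.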

Rebel
Posts: 5670
Joined: Thu Aug 18, 2011 10:04 am

Re: Sergio Vieri second net is out

Post by Rebel » Wed Jul 29, 2020 7:43 am

Laskos wrote:
Wed Jul 29, 2020 4:56 am
Can confirm 0109 as the best for me so far, but with only 1000 games. I haven't checked all of them; it seems the latest, 0912, is not stronger.
I am running 0109 right now and have increased the number of games to 5000. After 500 games it has a 64% score, which is about +98 Elo :D

We will see.
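
For reference, the usual logistic conversion between score percentage and Elo difference can be sketched like this. The function names are just for illustration.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


def expected_score(diff):
    """Expected score fraction for a given Elo difference (inverse of elo_diff)."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


print(round(elo_diff(0.64), 1))
print(round(expected_score(98), 4))
```

Exactly 64% works out to about +100 Elo, while +98 corresponds to a score of roughly 63.7%, so the figure above presumably comes from the unrounded score.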
90% of coding is debugging, the other 10% is writing bugs.
