Automated tuning... finally... (Topple v0.3.0)


konsolas
Posts: 182
Joined: Sun Jun 12, 2016 5:44 pm
Location: London
Full name: Vincent

Re: Automated tuning... finally... (Topple v0.3.0)

Post by konsolas »

Thanks Guenther, you seem to be right.
Godel has apparently lost on time in every single one of its tournament matches, and Drosophila has crashed in every one of its tournament matches. I'll have to investigate this further. I've also removed Critter from my engine pool.
Thank you for the advice.
CMCanavessi
Posts: 1142
Joined: Thu Dec 28, 2017 4:06 pm
Location: Argentina

Re: Automated tuning... finally... (Topple v0.3.0)

Post by CMCanavessi »

konsolas wrote: Sat Jan 12, 2019 1:50 pm Thanks Daniel,

I've now built up a small collection of engines to run tournaments with, so hopefully I can get a more accurate picture of strength improvements in the future:

Rank Name                          Elo     +/-   Games   Score   Draws
   1 Critter_1.6a_32bit            744     nan     110   98.6%    2.7%
   2 gaviota-1.0-win32             261      82     110   81.8%    7.3%
   3 pawny_1.2.x64.SSE4.2          236      76     110   79.5%   10.0%
   4 ToppleDebug                   114      61     120   65.8%   13.3%
   5 GarboChess2-32                 61      60     110   58.6%   17.3%
   6 orion64-v0.5-bmi2              54      62     110   57.7%   11.8%
   7 Topple2                        29      59     120   54.2%   11.7%
   8 Topple2E                       26      59     120   53.8%   10.8%
   9 simplex-098-32-ja             -77      65     110   39.1%    7.3%
  10 chispa403-blend               -80      65     110   38.6%    6.4%
  11 bikjump                      -241      83     110   20.0%    1.8%
  12 drosophila-win64             -inf     nan     110    0.0%    0.0%
  13 Godel                        -inf     nan     110    0.0%    0.0%

730 of 780 games finished.
Topple2 = Topple v0.2.1, Topple2E = Topple v0.3.1, ToppleDebug = current dev build of Topple.

This reflects CMCanavessi's results (where v0.2.1 was very similar to v0.3.1), but I think there is sufficient evidence to suggest that the current development build is likely to be stronger.
ToppleDebug is looking NICE !!!
Follow my tournament and some Leela gauntlets live at http://twitch.tv/ccls
konsolas
Posts: 182
Joined: Sun Jun 12, 2016 5:44 pm
Location: London
Full name: Vincent

Re: Automated tuning... finally... (Topple v0.3.0)

Post by konsolas »

I'm now running a gauntlet instead, since the round-robin tournament takes far too long.

Results of a 200-game gauntlet with Topple v0.3.1 (drosophila working correctly this time):

Rank Name                          Elo     +/-   Games   Score   Draws
   0 Topple v0.3.1                   2      45     200   50.2%   13.5%
   1 gaviota-1.0-win32             436     nan      20   92.5%    5.0%
   2 pawny_1.2.x64.SSE4.2          191     173      20   75.0%   20.0%
   3 GarboChess2-32                168     175      20   72.5%   15.0%
   4 orion64-v0.5-bmi2             127     149      20   67.5%   25.0%
   5 danasah680                    108     165      20   65.0%   10.0%
   6 drosophila-win64              -17     139      20   47.5%   25.0%
   7 simplex-098-32-ja             -89     167      20   37.5%    5.0%
   8 Teki2_win64                  -191     194      20   25.0%   10.0%
   9 chispa403-blend              -382     nan      20   10.0%   10.0%
  10 bikjump                      -512     nan      20    5.0%   10.0%

200 of 200 games finished.
I've started a gauntlet for Topple v0.3.2_DEV, which should be finished in a few hours. If results are good, I'll make a new release today or tomorrow.
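
For anyone wondering how the Elo and +/- columns relate to the score percentages, here is a minimal sketch. The thread does not say which tool or formula produced the table, so this assumes the standard logistic Elo model and a normal-approximation 95% interval. The function names are my own, and the win/loss counts in the usage note are inferred from the games, score and draw columns rather than reported anywhere.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def elo_diff(score: float) -> float:
    """Elo difference implied by an average score in [0, 1] (logistic model)."""
    if score <= 0.0:
        return -math.inf
    if score >= 1.0:
        return math.inf
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins: int, losses: int, draws: int):
    """Return (elo, margin) for a win/loss/draw record.

    The margin is nan when the 95% score interval leaves the open range (0, 1),
    because no finite Elo value corresponds to a score outside that range.
    """
    games = wins + losses + draws
    w, l, d = wins / games, losses / games, draws / games
    mu = w + 0.5 * d  # average score per game
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = w * (1.0 - mu) ** 2 + d * (0.5 - mu) ** 2 + l * (0.0 - mu) ** 2
    stdev = math.sqrt(var / games)  # standard error of the average score
    lo, hi = mu - Z_95 * stdev, mu + Z_95 * stdev
    if lo <= 0.0 or hi >= 1.0:
        return elo_diff(mu), math.nan
    return elo_diff(mu), (elo_diff(hi) - elo_diff(lo)) / 2.0
```

Under these assumptions, the Topple v0.3.1 row (200 games, 13.5% draws, so 27 draws and presumably 87 wins and 86 losses) gives roughly 2 and 45 from `elo_with_margin(87, 86, 27)`. A 0% score gives -inf with a nan margin, and a lopsided result over only 20 games also gives a nan margin, which is consistent with the nan entries in the tables.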
konsolas
Posts: 182
Joined: Sun Jun 12, 2016 5:44 pm
Location: London
Full name: Vincent

Re: Automated tuning... finally... (Topple v0.3.0)

Post by konsolas »

Relative elo of Topple v0.3.1 was calculated to be 2 ± 45
Relative elo of Topple v0.3.2 was calculated to be 143 ± 48

Topple v0.3.2 has been released: https://github.com/konsolas/ToppleChess ... tag/v0.3.2
Perhaps it will show the strength gain that v0.3.0 was supposed to have :D
CMCanavessi
Posts: 1142
Joined: Thu Dec 28, 2017 4:06 pm
Location: Argentina

Re: Automated tuning... finally... (Topple v0.3.0)

Post by CMCanavessi »

konsolas wrote: Sun Jan 13, 2019 10:21 am Relative elo of Topple v0.3.1 was calculated to be 2 ± 45
Relative elo of Topple v0.3.2 was calculated to be 143 ± 48

Topple v0.3.2 has been released: https://github.com/konsolas/ToppleChess ... tag/v0.3.2
Perhaps it will show the strength gain that v0.3.0 was supposed to have :D
Looks very nice! I'll make a new gauntlet in a few days.
CMCanavessi
Posts: 1142
Joined: Thu Dec 28, 2017 4:06 pm
Location: Argentina

Re: Automated tuning... finally... (Topple v0.3.0)

Post by CMCanavessi »

Well, I just paused a tournament I'm running and started the same gauntlet I used for v0.2.1 and v0.3.0, and I can already tell you that things are looking much, much better this time. We'll have to wait a bit to see whether it holds, but it looks very promising. I expect around 2575-2600 Elo on my rating list, which would put Topple only about 50 Elo away from entering my lowest league :)
Maybe next update will bring those 50 elo! :mrgreen:
CMCanavessi
Posts: 1142
Joined: Thu Dec 28, 2017 4:06 pm
Location: Argentina

Re: Automated tuning... finally... (Topple v0.3.0)

Post by CMCanavessi »

Topple v0.2.1 x64 118.0/240 (49.2%)
Topple v0.3.0 x64 115.5/240 (48.1%)
Topple v0.3.2 x64 41.0/56 (73.2%)

I'm probably going to abort the current gauntlet because the opposition is evidently too weak for v0.3.2; the progress has been huge.

Topple v0.3.2 x64 : 2593.5
Topple v0.2.1 x64 : 2423.6
Topple v0.3.0 x64 : 2412.3
konsolas
Posts: 182
Joined: Sun Jun 12, 2016 5:44 pm
Location: London
Full name: Vincent

Re: Automated tuning... finally... (Topple v0.3.0)

Post by konsolas »

CMCanavessi wrote: Sun Jan 13, 2019 8:06 pm Topple v0.2.1 x64 118.0/240 (49.2%)
Topple v0.3.0 x64 115.5/240 (48.1%)
Topple v0.3.2 x64 41.0/56 (73.2%)

I'm probably going to abort the current gauntlet because the opposition is evidently too weak for v0.3.2; the progress has been huge.

Topple v0.3.2 x64 : 2593.5
Topple v0.2.1 x64 : 2423.6
Topple v0.3.0 x64 : 2412.3
Wow, that's really nice! Thanks for testing.

In case you were wondering, v0.3.2 fixed various issues with the middlegame/endgame tapering system which prevented the king safety evaluation from working correctly.
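
For readers unfamiliar with tapering, here is a minimal sketch of the general technique. It is not Topple's code: the phase weights, `MAX_PHASE` and the function names are illustrative assumptions based on a common convention. The idea is that each evaluation term has a middlegame and an endgame value, and the two are blended according to how much material is left.

```python
# Phase contribution per piece type. These weights are a common convention,
# assumed here for illustration; they are not taken from Topple.
PHASE_WEIGHTS = {"N": 1, "B": 1, "R": 2, "Q": 4}
MAX_PHASE = 24  # 4 knights + 4 bishops + 4 rooks * 2 + 2 queens * 4


def game_phase(piece_counts: dict) -> int:
    """Return a phase from 0 (bare endgame) to MAX_PHASE (full middlegame).

    piece_counts maps a piece letter to the number of such pieces on the board
    for both sides combined. Pieces without a phase weight are ignored.
    """
    phase = sum(PHASE_WEIGHTS[p] * n for p, n in piece_counts.items()
                if p in PHASE_WEIGHTS)
    return min(phase, MAX_PHASE)  # promotions could otherwise exceed the cap


def tapered(mg: int, eg: int, phase: int) -> int:
    """Linearly interpolate between a middlegame and an endgame score."""
    # Floor division: negative results round toward negative infinity.
    return (mg * phase + eg * (MAX_PHASE - phase)) // MAX_PHASE
```

A king safety term is typically weighted towards the middlegame. With a hypothetical penalty of -60 in the middlegame and 0 in the endgame, `tapered(-60, 0, 24)` is -60, `tapered(-60, 0, 12)` is -30, and `tapered(-60, 0, 0)` is 0, so a mistake in the phase or in the interpolation directly distorts how much king safety counts.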
CMCanavessi
Posts: 1142
Joined: Thu Dec 28, 2017 4:06 pm
Location: Argentina

Re: Automated tuning... finally... (Topple v0.3.0)

Post by CMCanavessi »

konsolas wrote: Sun Jan 13, 2019 8:18 pm
Wow, that's really nice! Thanks for testing.

In case you were wondering, v0.3.2 fixed various issues with the middlegame/endgame tapering system which prevented the king safety evaluation from working correctly.
Any plans for ponder support?
konsolas
Posts: 182
Joined: Sun Jun 12, 2016 5:44 pm
Location: London
Full name: Vincent

Re: Automated tuning... finally... (Topple v0.3.0)

Post by konsolas »

CMCanavessi wrote: Sun Jan 13, 2019 11:28 pm
Any plans for ponder support?
It's definitely on my list to implement in the future, but right now I'm focusing on further search improvements. The protocol for pondering also seems a bit confusing.
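
For reference, here is a rough sketch of the pondering message flow, assuming the protocol in question is UCI (the thread does not say). It models only the GUI side: the commands a GUI would send after the engine answers with `bestmove <best> ponder <ponder>`. The function name and the clock values are made up for illustration.

```python
def gui_ponder_commands(moves_so_far, best, ponder, opponent_move,
                        clock="wtime 60000 btime 60000"):
    """Commands a UCI GUI would send after 'bestmove <best> ponder <ponder>'.

    moves_so_far:  moves played before the engine's reply, in UCI notation
    opponent_move: the move the opponent actually plays afterwards
    clock:         illustrative time-control arguments (an assumption)
    """
    # The last move in the position string is the predicted (ponder) move.
    predicted_line = moves_so_far + [best, ponder]
    cmds = [
        "position startpos moves " + " ".join(predicted_line),
        "go ponder " + clock,
    ]
    if opponent_move == ponder:
        # Prediction was right: the engine keeps searching the same position
        # but switches from pondering to normal, clock-limited search.
        cmds.append("ponderhit")
    else:
        # Prediction was wrong: stop the ponder search. The engine still
        # answers with a bestmove, which the GUI discards.
        cmds.append("stop")
        actual_line = moves_so_far + [best, opponent_move]
        cmds.append("position startpos moves " + " ".join(actual_line))
        cmds.append("go " + clock)
    return cmds
```

For example, `gui_ponder_commands([], "e2e4", "e7e5", "e7e5")` ends with `ponderhit`, while passing `"c7c5"` as the opponent's move ends with `stop`, a fresh `position` command and a normal `go`. On the engine side, the main thing to get right is never sending `bestmove` while in ponder mode until `ponderhit` or `stop` arrives.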