Thanks Guenther, you seem to be right.
Godel has apparently lost on time in every single one of its tournament matches, and Drosophila has crashed in every one of its tournament matches. I'll have to investigate this further. I've also removed Critter from my engine pool.
Thank you for the advice.
Automated tuning... finally... (Topple v0.3.0)
-
- Posts: 182
- Joined: Sun Jun 12, 2016 5:44 pm
- Location: London
- Full name: Vincent
-
- Posts: 1142
- Joined: Thu Dec 28, 2017 4:06 pm
- Location: Argentina
Re: Automated tuning... finally... (Topple v0.3.0)
konsolas wrote: ↑Sat Jan 12, 2019 1:50 pm
Thanks Daniel,
I've built up a small collection of engines now to run tournaments with, so hopefully I can have a more accurate picture of strength improvements in the future:
Topple2 = Topple v0.2.1, Topple2E = Topple v0.3.1, ToppleDebug = current dev build of Topple.
Code: Select all
Rank Name                   Elo  +/-  Games  Score  Draws
   1 Critter_1.6a_32bit     744  nan    110  98.6%   2.7%
   2 gaviota-1.0-win32      261   82    110  81.8%   7.3%
   3 pawny_1.2.x64.SSE4.2   236   76    110  79.5%  10.0%
   4 ToppleDebug            114   61    120  65.8%  13.3%
   5 GarboChess2-32          61   60    110  58.6%  17.3%
   6 orion64-v0.5-bmi2       54   62    110  57.7%  11.8%
   7 Topple2                 29   59    120  54.2%  11.7%
   8 Topple2E                26   59    120  53.8%  10.8%
   9 simplex-098-32-ja      -77   65    110  39.1%   7.3%
  10 chispa403-blend        -80   65    110  38.6%   6.4%
  11 bikjump               -241   83    110  20.0%   1.8%
  12 drosophila-win64      -inf  nan    110   0.0%   0.0%
  13 Godel                 -inf  nan    110   0.0%   0.0%
730 of 780 games finished.
This reflects CMCanavessi's results (where v0.2.1 was very similar to v0.3.1), but I think there is sufficient evidence to suggest that the current development build is likely to be stronger.

ToppleDebug is looking NICE !!!
Follow my tournament and some Leela gauntlets live at http://twitch.tv/ccls
-
- Posts: 182
- Joined: Sun Jun 12, 2016 5:44 pm
- Location: London
- Full name: Vincent
Re: Automated tuning... finally... (Topple v0.3.0)
I'm now trying a gauntlet since the round robin tournament takes far too long.
I've started a gauntlet for Topple v0.3.2_DEV, which should be finished in a few hours. If results are good, I'll make a new release today or tomorrow.
Results of a 200 game gauntlet with Topple v0.3.1 (drosophila working correctly this time):
Code: Select all
Rank Name                   Elo  +/-  Games  Score  Draws
   0 Topple v0.3.1            2   45    200  50.2%  13.5%
   1 gaviota-1.0-win32      436  nan     20  92.5%   5.0%
   2 pawny_1.2.x64.SSE4.2   191  173     20  75.0%  20.0%
   3 GarboChess2-32         168  175     20  72.5%  15.0%
   4 orion64-v0.5-bmi2      127  149     20  67.5%  25.0%
   5 danasah680             108  165     20  65.0%  10.0%
   6 drosophila-win64       -17  139     20  47.5%  25.0%
   7 simplex-098-32-ja      -89  167     20  37.5%   5.0%
   8 Teki2_win64           -191  194     20  25.0%  10.0%
   9 chispa403-blend       -382  nan     20  10.0%  10.0%
  10 bikjump               -512  nan     20   5.0%  10.0%
200 of 200 games finished.
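The difference in running time follows from the game counts. Here is a rough sketch in Python. The helper names are illustrative, and the 10 and 20 games per pairing are inferred from the 780 and 200 totals rather than stated in the thread.

```python
from math import comb

def round_robin_games(engines, games_per_pairing):
    """Every engine meets every other engine, so the count grows quadratically."""
    return comb(engines, 2) * games_per_pairing

def gauntlet_games(opponents, games_per_opponent):
    """Only the engine under test plays, once against each opponent."""
    return opponents * games_per_opponent

print(round_robin_games(13, 10))  # 780
print(gauntlet_games(10, 20))     # 200
```

Most round-robin games are between engines other than the one being tested, which is why a gauntlet gives a usable estimate sooner.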
-
- Posts: 182
- Joined: Sun Jun 12, 2016 5:44 pm
- Location: London
- Full name: Vincent
Re: Automated tuning... finally... (Topple v0.3.0)
Relative elo of Topple v0.3.1 was calculated to be 2 ± 45
Relative elo of Topple v0.3.2 was calculated to be 143 ± 48
Topple v0.3.2 has been released: https://github.com/konsolas/ToppleChess ... tag/v0.3.2
Perhaps it will show the strength gain that v0.3.0 was supposed to have
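For anyone wondering where figures like "2 ± 45" come from, here is a minimal Python sketch. It assumes the usual logistic Elo formula and a normal approximation at roughly 95% confidence; the thread does not say which tool produced the numbers. The function names and the 87/86/27 win/loss/draw split are assumptions, the split being one that is consistent with 200 games, a 50.2% score and 13.5% draws.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, margin) using a normal approximation of the mean score."""
    games = wins + losses + draws
    mean = (wins + 0.5 * draws) / games
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (wins * (1.0 - mean) ** 2
                + draws * (0.5 - mean) ** 2
                + losses * (0.0 - mean) ** 2) / games
    stderr = math.sqrt(variance / games)
    low, high = mean - z * stderr, mean + z * stderr
    # Half the width of the confidence interval, converted to Elo.
    margin = (elo_diff(high) - elo_diff(low)) / 2.0
    return elo_diff(mean), margin

# Assumed split: 87 wins, 86 losses, 27 draws out of 200 games.
elo, margin = elo_with_margin(wins=87, losses=86, draws=27)
print(f"{elo:.0f} ± {margin:.0f}")  # 2 ± 45
```

A score of 0% or 100% has no finite Elo under this formula, which would explain the "-inf" and "nan" entries in the tables.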
-
- Posts: 1142
- Joined: Thu Dec 28, 2017 4:06 pm
- Location: Argentina
Re: Automated tuning... finally... (Topple v0.3.0)
konsolas wrote: ↑Sun Jan 13, 2019 10:21 am
Relative elo of Topple v0.3.1 was calculated to be 2 ± 45
Relative elo of Topple v0.3.2 was calculated to be 143 ± 48
Topple v0.3.2 has been released: https://github.com/konsolas/ToppleChess ... tag/v0.3.2
Perhaps it will show the strength gain that v0.3.0 was supposed to have

Looks very nice! I'll make a new gauntlet in a few days.
-
- Posts: 1142
- Joined: Thu Dec 28, 2017 4:06 pm
- Location: Argentina
Re: Automated tuning... finally... (Topple v0.3.0)
Well, I just paused a tournament I'm running and started the same gauntlet as with v0.2.1 and v0.3.0, and I can already tell you that this time things are looking much, much better... We'll have to wait a bit to see if things stay this way, but it looks very promising. I expect around 2575-2600 elo on my rating list, which would put Topple just about 50 elo away from entering my lowest league.
Maybe next update will bring those 50 elo!
-
- Posts: 1142
- Joined: Thu Dec 28, 2017 4:06 pm
- Location: Argentina
Re: Automated tuning... finally... (Topple v0.3.0)
Topple v0.2.1 x64 118.0/240 (49.2%)
Topple v0.3.0 x64 115.5/240 (48.1%)
Topple v0.3.2 x64 41.0/56 (73.2%)
I'm probably going to abort the current gauntlet because it's evidently too weak for v0.3.2, the progress has been huge.
Topple v0.3.2 x64 : 2593.5
Topple v0.2.1 x64 : 2423.6
Topple v0.3.0 x64 : 2412.3
-
- Posts: 182
- Joined: Sun Jun 12, 2016 5:44 pm
- Location: London
- Full name: Vincent
Re: Automated tuning... finally... (Topple v0.3.0)
CMCanavessi wrote: ↑Sun Jan 13, 2019 8:06 pm
Topple v0.2.1 x64 118.0/240 (49.2%)
Topple v0.3.0 x64 115.5/240 (48.1%)
Topple v0.3.2 x64 41.0/56 (73.2%)
I'm probably going to abort the current gauntlet because it's evidently too weak for v0.3.2, the progress has been huge.
Topple v0.3.2 x64 : 2593.5
Topple v0.2.1 x64 : 2423.6
Topple v0.3.0 x64 : 2412.3

Wow, that's really nice! Thanks for testing.
In case you were wondering, v0.3.2 fixed various issues with the middlegame/endgame tapering system which prevented the king safety evaluation from working correctly.
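For readers unfamiliar with tapering, here is a minimal Python sketch of the general technique. It is not Topple's code: the phase weights, `MAX_PHASE` and the function names are assumptions based on a common convention. Each evaluation term has a middlegame and an endgame value, and the two are blended according to how much material is left.

```python
# Phase weights per piece type (a common convention, not Topple's values).
PHASE_WEIGHTS = {"N": 1, "B": 1, "R": 2, "Q": 4}
MAX_PHASE = 24  # 4 knights + 4 bishops + 4 rooks * 2 + 2 queens * 4

def game_phase(piece_counts):
    """Return a phase from 0 (bare endgame) to MAX_PHASE (full material)."""
    phase = sum(PHASE_WEIGHTS[p] * n
                for p, n in piece_counts.items() if p in PHASE_WEIGHTS)
    return min(phase, MAX_PHASE)

def tapered(mg_score, eg_score, phase):
    """Blend middlegame and endgame scores linearly by phase."""
    return (mg_score * phase + eg_score * (MAX_PHASE - phase)) // MAX_PHASE

# A king-safety term that matters in the middlegame and vanishes in the endgame.
full = game_phase({"N": 4, "B": 4, "R": 4, "Q": 2})
print(tapered(60, 0, full))  # 60
print(tapered(60, 0, 12))    # 30
print(tapered(60, 0, 0))     # 0
```

If the phase or the blend is computed incorrectly, a term such as king safety can be scaled down or switched off at the wrong time, which is the kind of problem described above.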
-
- Posts: 1142
- Joined: Thu Dec 28, 2017 4:06 pm
- Location: Argentina
Re: Automated tuning... finally... (Topple v0.3.0)
konsolas wrote: ↑Sun Jan 13, 2019 8:18 pm
Wow, that's really nice! Thanks for testing.
In case you were wondering, v0.3.2 fixed various issues with the middlegame/endgame tapering system which prevented the king safety evaluation from working correctly.

Any plans for ponder support?
-
- Posts: 182
- Joined: Sun Jun 12, 2016 5:44 pm
- Location: London
- Full name: Vincent
Re: Automated tuning... finally... (Topple v0.3.0)
CMCanavessi wrote: ↑Sun Jan 13, 2019 11:28 pm
Any plans for ponder support?

It's definitely on the bucket list to implement in the future, but right now I'm focusing on more search improvements. The protocol for pondering also seems a bit confusing.
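On the protocol side, the pondering flow can be sketched as a small state machine. This assumes the engine speaks UCI, which the thread does not state. The class and its return strings are purely illustrative; only the `go ponder`, `ponderhit` and `stop` commands come from the UCI specification.

```python
class PonderState:
    """Tracks whether the engine is idle, pondering, or searching normally."""

    def __init__(self):
        self.mode = "idle"

    def handle(self, line):
        """Apply one GUI command and describe the engine's expected reaction."""
        tokens = line.split()
        if not tokens:
            return "ignore"
        cmd = tokens[0]
        if cmd == "go":
            if "ponder" in tokens[1:]:
                # Search the position after the predicted reply, on the
                # opponent's clock; no bestmove until ponderhit or stop.
                self.mode = "pondering"
                return "ponder; withhold bestmove"
            self.mode = "searching"
            return "search normally; send bestmove when done"
        if cmd == "ponderhit" and self.mode == "pondering":
            # The opponent played the predicted move: same search, own clock.
            self.mode = "searching"
            return "keep searching on own clock"
        if cmd == "stop" and self.mode != "idle":
            # On a ponder miss the GUI sends stop, then a new position and go.
            self.mode = "idle"
            return "stop searching and send bestmove"
        return "ignore"

state = PonderState()
state.handle("go ponder wtime 60000 btime 60000")
print(state.mode)  # pondering
state.handle("ponderhit")
print(state.mode)  # searching
```

The usual source of confusion is that the engine must not send `bestmove` while pondering, even if its search finishes, until it receives `ponderhit` or `stop`.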