2900 Elo points progress, 10 million games, 330 nets

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

2900 Elo points progress, 10 million games, 330 nets

Post by Laskos »

All wasted in the 30xxx run.


Score of lc0_v19_31214 vs SF_10: 26 - 97 - 77 [0.323] 200
Elo difference: -128.95 +/- 38.69
Finished match

Score of lc0_v19_31542 vs SF_10: 27 - 85 - 88 [0.355] 200
Elo difference: -103.73 +/- 36.38
Finished match

According to my rough order-of-magnitude estimate, and if I am not wrong, ~10 MWh were consumed, or about $3,000 in an average European country.

No other run of Leela had such waste.

And compared to the run 10xxx, not yet very close to it:

Score of lc0_v19_11261 vs SF_10: 99 - 82 - 219 [0.521] 400
Elo difference: 14.77 +/- 22.90
Finished match
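
For readers checking the figures above, here is a minimal sketch of how an Elo difference can be recovered from a win/loss/draw line, assuming the standard logistic Elo model. The function names `elo_from_score`, `elo_diff` and `elo_margin` are made up for illustration, and the margin formula (normal approximation on the per-game score, z = 1.96) is an assumption about how the match tool derives its "+/-" figure, not something stated in the thread.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    # Undefined for score == 0 or score == 1 (one side won every game).
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_diff(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a match result, from the first engine's side."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return elo_from_score(score)


def elo_margin(wins: int, losses: int, draws: int, z: float = 1.96) -> float:
    """Approximate 95% error margin, assuming independent games (hypothetical helper)."""
    games = wins + losses + draws
    w, l, d = wins / games, losses / games, draws / games
    mu = w + 0.5 * d
    # Per-game variance of the score around its mean.
    var = w * (1.0 - mu) ** 2 + d * (0.5 - mu) ** 2 + l * (0.0 - mu) ** 2
    half_width = z * math.sqrt(var / games)
    return (elo_from_score(mu + half_width) - elo_from_score(mu - half_width)) / 2.0


# The three match lines quoted above, as (wins, losses, draws).
results = {
    "lc0_v19_31214": (26, 97, 77),
    "lc0_v19_31542": (27, 85, 88),
    "lc0_v19_11261": (99, 82, 219),
}

for net, (w, l, d) in results.items():
    print(f"{net}: {elo_diff(w, l, d):+.2f} +/- {elo_margin(w, l, d):.2f}")
```

Under these assumptions the Elo differences agree with the reported ones to within rounding, and the margins come out close to the reported "+/-" values. With only 200 games the margin is nearly 40 Elo, so the gap between 31214 and 31542 is well inside the noise.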
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: 2900 Elo points progress, 10 million games, 330 nets

Post by jp »

So do you think 11261 is the best net & will remain the best for some time?
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: 2900 Elo points progress, 10 million games, 330 nets

Post by Laskos »

jp wrote: Sun Nov 25, 2018 1:15 pm So do you think 11261 is the best net & will remain the best for some time?
They will lower the LR for 30xxx run, expect real progress. But I didn't get why they allowed 10+ million games to be wasted.
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: 2900 Elo points progress, 10 million games, 330 nets

Post by chrisw »

Laskos wrote: Sun Nov 25, 2018 1:54 pm
jp wrote: Sun Nov 25, 2018 1:15 pm So do you think 11261 is the best net & will remain the best for some time?
They will lower the LR for 30xxx run, expect real progress. But I didn't get why they allowed 10+ million games to be wasted.
Arguably, because if the self-play Elo was still increasing, then the net weights were still being altered, and there might be, or might have been if they continued, some generalised learning still to come. Beyond a certain point, with real Elo not increasing, this policy becomes "waiting for a miracle to happen", and in the strange democratic voting mechanism they have for deciding whether or not to do XYZ, the weight of noise to stop the run or lower the LR or whatever becomes too large to bear and decision ABC gets taken.
CMCanavessi
Posts: 1142
Joined: Thu Dec 28, 2017 4:06 pm
Location: Argentina

Re: 2900 Elo points progress, 10 million games, 330 nets

Post by CMCanavessi »

Laskos wrote: Sun Nov 25, 2018 1:54 pm
jp wrote: Sun Nov 25, 2018 1:15 pm So do you think 11261 is the best net & will remain the best for some time?
They will lower the LR for 30xxx run, expect real progress. But I didn't get why they allowed 10+ million games to be wasted.
Because they are doing exactly what they created 30xxx for: testing crap, weird things, new ideas, etc... the goal isn't and never was to get the strongest net from this run, that may happen when all the tests are done and the net is restarted "for real".
Follow my tournament and some Leela gauntlets live at http://twitch.tv/ccls
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: 2900 Elo points progress, 10 million games, 330 nets

Post by Laskos »

CMCanavessi wrote: Sun Nov 25, 2018 2:49 pm
Laskos wrote: Sun Nov 25, 2018 1:54 pm
jp wrote: Sun Nov 25, 2018 1:15 pm So do you think 11261 is the best net & will remain the best for some time?
They will lower the LR for 30xxx run, expect real progress. But I didn't get why they allowed 10+ million games to be wasted.
Because they are doing exactly what they created 30xxx for: testing crap, weird things, new ideas, etc... the goal isn't and never was to get the strongest net from this run, that may happen when all the tests are done and the net is restarted "for real".
Well, that's not my way of doing things, and a scientist who does research this way often won't get priority or even get published. These 20xxx and 30xxx "testruns" have already been running for 3 months. More resources were used than in DeepMind A0 project, not being at all near A0 level strength with 20xxx and 30xxx nets.

The incipient Lc0 project was EXTREMELY successful with the initial 6x64 smallnets. With currently available hardware resources, a 6x64 smallnet could be optimized in 12 hours, being only 100-150 Elo points weaker than the current state of 30xxx. Then, building on that, an optimized 10x128 net could have been built in less than two days, and then an optimized 15x192 in some 4 days. A reasonable guess is that this 15x192 net would already have been the strongest net overall (better than the 10xxx bignet run). And as the last step, the optimized 20x256 bignet could have been built in some 7 days. It would probably get close to A0 level. All in all, some two weeks to get to A0 level (or close to it). Again, this Lc0 silliness has already lasted 3 months; their only luck is that the field is not competitive and nobody aside from them cares about reproducing the A0 paper. I guess not a single competitive scientist is among the devs.
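
As a rough tally of the staged plan above, the sketch below sums the per-stage estimates. It takes "less than two days" as a 2-day upper bound, which is an assumption, and the dictionary name `stage_days` is only illustrative.

```python
# Per-stage training-time estimates from the post above, in days.
# "less than two days" is taken as an upper bound of 2.0 (assumption).
stage_days = {
    "6x64": 0.5,     # 12 hours
    "10x128": 2.0,   # less than two days
    "15x192": 4.0,   # some 4 days
    "20x256": 7.0,   # some 7 days
}

total_days = sum(stage_days.values())
print(f"Total: {total_days} days")
```

The total comes to 13.5 days, consistent with the "some two weeks" figure.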
CMCanavessi
Posts: 1142
Joined: Thu Dec 28, 2017 4:06 pm
Location: Argentina

Re: 2900 Elo points progress, 10 million games, 330 nets

Post by CMCanavessi »

More resources were used than in DeepMind A0 project
How can you know that? We only know (barely) AlphaZero's final run, we don't know how long they were testing things before, we don't know how many runs they did, we know nothing at all about the development, testing, bug fixes, experimentation, etc.
Follow my tournament and some Leela gauntlets live at http://twitch.tv/ccls
whereagles
Posts: 565
Joined: Thu Nov 13, 2014 12:03 pm

Re: 2900 Elo points progress, 10 million games, 330 nets

Post by whereagles »

DeepMind could be lying. But I'll accept their data as true by default.
crem
Posts: 177
Joined: Wed May 23, 2018 9:29 pm

Re: 2900 Elo points progress, 10 million games, 330 nets

Post by crem »

1. So far in our tests we have failed to reach A0 strength given the same number of training games (44 million).
2. We don't know why that is. Maybe we have a bug (probably), maybe we use the wrong FPU, maybe we guessed Cpuct wrong, maybe we understood the paper incorrectly, maybe we don't shuffle training games well enough, maybe we release new networks too rarely, maybe something else.
3. I agree that the best (or rather the only) way to get consistent improvements is to run lots of small tests with different ideas.
4. Currently the way to do such tests is not developed (it's been discussed for 6 months already, but it's constantly being preempted by more urgent tasks [rushing a release for some CCCC/TCEC season, or changing Lc0 so that new features can be added in a more elegant way, or implementing some new Lc0 feature myself because it's more fun]).
5. Without an easy way of testing in place, setting up and running a fresh test is a cumbersome task, especially if it requires engine changes; then it currently takes weeks to roll out. The server-side part is not a one-click thing either: it requires some hours of wiring up training scripts, data transfer, typing some SQL, making sure that clients no longer send training data from the old test after a restart, etc.
6. Often things are not changed just because the changes needed for a new idea are not implemented yet. Or sometimes it's because all devs are too busy with their non-Lc0 life for a week or two, etc.
7. Yes, the current use of contributors' GPUs is not optimal. But to make it more optimal, things have to be implemented, and devs just cannot keep up.
8. The current idea (as I perceive it) is "We'll do testing properly (with many small-scale experiments that anyone can submit, and statistically sound conclusions) when we have a framework. Until that's ready, let's run a full-size test with intuitively guessed params/ideas and hope it will be stronger than everything we had before."

So, yes, we fail to reach A0 level; yes, we should run well-designed experiments; yes, we should have done lots of them; yes, they should be small and frequent instead of rare and large (and largely based on just an intuitive guess instead of some scientifically sound method). But there's really no infrastructure and very little dev time to implement this infrastructure. And even done manually, starting a new small test every week is too time-consuming.

I totally agree that if a team of 2-3 full-time developers appeared, they would leave the LCZero project behind within one month. I don't know what to do with that knowledge though.

PS. Regarding "More resources were used than in DeepMind A0 project, not being at all near A0 level strength with 20xxx and 30xxx nets.": I hope you mean one run of DM vs one run of Lc0. For the total amount of resources (for trial and testing), I'm sure DeepMind used hundreds if not thousands of times more resources than we have so far.
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: 2900 Elo points progress, 10 million games, 330 nets

Post by MikeB »

Laskos wrote: Sun Nov 25, 2018 10:46 am All wasted in the 30xxx run.


Score of lc0_v19_31214 vs SF_10: 26 - 97 - 77 [0.323] 200
Elo difference: -128.95 +/- 38.69
Finished match

Score of lc0_v19_31542 vs SF_10: 27 - 85 - 88 [0.355] 200
Elo difference: -103.73 +/- 36.38
Finished match

According to my rough order-of-magnitude estimate, and if I am not wrong, ~10 MWh were consumed, or about $3,000 in an average European country.

No other run of Leela had such waste.

And compared to the run 10xxx, not yet very close to it:

Score of lc0_v19_11261 vs SF_10: 99 - 82 - 219 [0.521] 400
Elo difference: 14.77 +/- 22.90
Finished match
+1
That's a real shame.