A fair fight

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

Who would win?

Poll ended at Sat Dec 09, 2017 10:33 pm

Houdini 6 by >60%: 1 vote (4%)
Both between 40-60%: 8 votes (32%)
Alpha Zero >60%: 16 votes (64%)

Total votes: 25

Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: A fair fight

Post by Michael Sherwin »

Milos wrote:
syzygy wrote:
Werewolf wrote:But what do you think?
I think that it is completely irrelevant whether some AlphaZero prototype is or is not stronger than a perfectly tuned Stockfish setup.

What is important is that, apparently, the general level of play of current top engines can be reached (and most likely be far exceeded) by an approach to computer chess that is completely different than how all leading engines have worked since Claude Shannon wrote the first paper on computer chess.

And this approach is not just completely different from a programming point of view... it does not even need any programming (apart from the initial programming of the AlphaZero software and a bit of cleverness to adapt it to the rules of chess). They just decide how much hardware they want to throw at a problem, they push a button, and some hours later the thing has programmed itself. That is really superhuman.
It's not as easy as that. The real question is how many (paid!) man-hours were spent actually developing MCTS, optimizing it, optimizing the training algorithm, finding an optimal feature set, developing the specialized super hardware that was used (TPUs), and so on.
I will probably never get to the next step with the learning in RomiChess due to my age and health issues, so I guess I'll share what that next step would have been. Romi's learning is done post-game; the next step would have been to make it real-time. My idea is basically to use multiple processors to play thousands of game segments, collect the learned data, and then use that data in a 'normal' alpha-beta search. I'll leave it to others to ponder the positives and negatives of that approach.
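Structurally, the idea can be sketched like this (a minimal toy, not RomiChess code; all names hypothetical, and the game segments are faked with random results since only the data flow matters here): workers play short segments, fold rewards into a shared table, and alpha-beta move ordering consults that table.

```python
import random
from collections import defaultdict

# Shared reinforcement table: (position_key, move) -> accumulated reward.
learned = defaultdict(float)

def play_segment(position_key, moves):
    """One worker plays a short game segment from position_key and
    reports (position, move, reward) triples. Here the segment is
    faked with a random result; a real engine would play it out."""
    first = random.choice(moves)
    result = random.choice((1.0, 0.0, -1.0))  # segment win/draw/loss
    return [(position_key, first, result)]

def collect(position_key, moves, n_segments=1000):
    """Run many segments (in a real engine: spread across processors)
    and fold the rewards into the shared table."""
    for _ in range(n_segments):
        for key, move, reward in play_segment(position_key, moves):
            learned[(key, move)] += reward

def ordering_bonus(position_key, move):
    """Bonus added to a move's score during alpha-beta move ordering,
    so accumulated experience steers the 'normal' search."""
    return int(10 * learned[(position_key, move)])
```

The real-time twist is only that `collect` runs concurrently with the main search instead of between games.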

Another observation: superior to plain MCTS would be collecting statistics in very fast alpha-beta searches and doing a guided MCTS (which would no longer really be Monte Carlo, though I lack the terminology to put it better). The statistics gathered in the quick alpha-beta searches would be the number of beta cutoffs for each root move and the number of checkmates for both sides, and those statistics would guide the search. Combine that with reinforcement learning to change the direction of the search and you end up with a much more intelligent investigation of the search space.
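As a sketch of that guiding step (the weighting function is hypothetical, invented here purely for illustration): cutoff and mate counts per root move become a prior, and the guided search samples root moves proportionally to it.

```python
import math
import random

class RootStats:
    """Per-root-move statistics gathered during fast alpha-beta probes."""
    def __init__(self):
        self.cutoffs = 0      # beta cutoffs seen below this root move
        self.our_mates = 0    # checkmates found for the side to move
        self.their_mates = 0  # checkmates found against the side to move

def prior(stats):
    """Turn raw counts into a selection weight (hypothetical formula:
    mates count ten times as much as ordinary cutoffs)."""
    score = stats.cutoffs + 10 * stats.our_mates - 10 * stats.their_mates
    return math.exp(score / 100.0)

def pick_root_move(all_stats):
    """Sample a root move proportionally to its prior, so the guided
    search spends more effort where the alpha-beta statistics
    suggest tactical pressure."""
    moves = list(all_stats)
    weights = [prior(all_stats[m]) for m in moves]
    return random.choices(moves, weights=weights, k=1)[0]
```

The reinforcement-learning part would then feed results back into the same statistics, shifting the sampling over time.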
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
Terry McCracken
Posts: 16465
Joined: Wed Aug 01, 2007 4:16 am
Location: Canada

Re: A fair fight

Post by Terry McCracken »

Werewolf wrote:I am staggered anyone thinks Alpha Zero would get over 60%


It will not lose. It learned super-GM chess in only 4 hrs. This is not a chess program; it is a learning, cognitive neural net.

It mastered Go to GM level in 40 hrs. and became superhuman in 70 hrs.

If it can defeat Stockfish after only 4 hrs. of self play, what can it do after 40 hrs., 400 hrs. etc.?

I think you're comparing apples to oranges.
Terry McCracken
Werewolf
Posts: 1796
Joined: Thu Sep 18, 2008 10:24 pm

Re: A fair fight

Post by Werewolf »

Terry McCracken wrote:
Werewolf wrote:I am staggered anyone thinks Alpha Zero would get over 60%


If it can defeat Stockfish after only 4 hrs. of self play, what can it do after 40 hrs., 400 hrs. etc.?
Well, that's not really what this poll was about, but it's up for debate.

Did they saturate after 4 hours? Some people think they did. If they did saturate, it shows chess is a different animal from Go etc.
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: A fair fight

Post by syzygy »

Werewolf wrote:Did they saturate after 4 hours? Some people think they did. If they did saturate it shows chess is a different animal to Go etc.
According to Gian-Carlo Pascutto on the Fishcooking forum, they could have chosen to use a bigger network, which would then have saturated at a higher level.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: A fair fight

Post by Milos »

syzygy wrote:According to Gian-Carlo Pascutto on the Fishcooking forum, they could have chosen to use a bigger network, which would then have saturated at a higher level.
That is highly doubtful, since the total size of the weights is now around 4.6GB. Using a bigger network means training would become really slow, and inference (and thus self-play) even slower, since they would exceed the TPU memory size (8GB).
Looking at the system now, it seems they achieved the maximum with the current hardware (and it is one monstrous piece of hardware).
Dariusz Orzechowski
Posts: 44
Joined: Thu May 02, 2013 5:23 pm

Re: A fair fight

Post by Dariusz Orzechowski »

Milos wrote:
syzygy wrote:According to Gian-Carlo Pascutto on the Fishcooking forum, they could have chosen to use a bigger network, which would then have saturated at a higher level.
That is highly doubtful, since the total size of the weights is now around 4.6GB. Using a bigger network means training would become really slow, and inference (and thus self-play) even slower, since they would exceed the TPU memory size (8GB).
Looking at the system now, it seems they achieved the maximum with the current hardware (and it is one monstrous piece of hardware).
I'm not so sure. It looks like they now used a 20-block network. In the previous paper (about AlphaGo Zero) they also showed results from 40 blocks, and the resulting network was stronger, so in principle they could do it with 40 blocks for chess as well. The problem is resources: they trained the 20-block network for 3 days, the 40-block one for 40 days, and needed around 6x more games. The expected progress in chess probably wouldn't be worth it; I think they could get maybe +100 Elo. When at some point 99% of self-play games are draws, it may be hard to make progress even for Google. Of course, it's also possible that a bigger network wouldn't be any better.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: A fair fight

Post by Milos »

Dariusz Orzechowski wrote:
Milos wrote:
syzygy wrote:According to Gian-Carlo Pascutto on the Fishcooking forum, they could have chosen to use a bigger network, which would then have saturated at a higher level.
That is highly doubtful, since the total size of the weights is now around 4.6GB. Using a bigger network means training would become really slow, and inference (and thus self-play) even slower, since they would exceed the TPU memory size (8GB).
Looking at the system now, it seems they achieved the maximum with the current hardware (and it is one monstrous piece of hardware).
I'm not so sure. It looks like they now used a 20-block network. In the previous paper (about AlphaGo Zero) they also showed results from 40 blocks, and the resulting network was stronger, so in principle they could do it with 40 blocks for chess as well. The problem is resources: they trained the 20-block network for 3 days, the 40-block one for 40 days, and needed around 6x more games. The expected progress in chess probably wouldn't be worth it; I think they could get maybe +100 Elo. When at some point 99% of self-play games are draws, it may be hard to make progress even for Google. Of course, it's also possible that a bigger network wouldn't be any better.
This is quite possible.
4.6GB is a straight calculation from the 4 TPUs' int8 FMA bandwidth of 4x92T and 80k evals per second.
Since 92T is peak performance, in reality it was probably closer to 4GB, so a 40-block network would indeed fit into the 8GB of TPU memory, assuming they now used 20 blocks.
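That back-of-envelope calculation can be reproduced directly (assuming one int8 multiply-accumulate per weight per evaluation, so ops per eval bounds the weight count in bytes):

```python
# Milos's estimate: total peak int8 throughput divided by evals/second
# gives the number of weight operations per eval; at one byte per
# int8 weight that bounds the network size.
tpus = 4
tops_per_tpu = 92e12      # peak int8 ops/s per TPU, as stated above
evals_per_sec = 80e3      # evaluations per second during search

ops_per_eval = tpus * tops_per_tpu / evals_per_sec
weights_gb = ops_per_eval / 1e9   # one byte per int8 weight
print(f"{weights_gb:.1f} GB")     # → 4.6 GB
```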
Dariusz Orzechowski
Posts: 44
Joined: Thu May 02, 2013 5:23 pm

Re: A fair fight

Post by Dariusz Orzechowski »

Milos wrote:This is quite possible.
4.6GB is a straight calculation from the 4 TPUs' int8 FMA bandwidth of 4x92T and 80k evals per second.
Since 92T is peak performance, in reality it was probably closer to 4GB, so a 40-block network would indeed fit into the 8GB of TPU memory, assuming they now used 20 blocks.
I'm almost sure it was 20 blocks. In the previous paper they say that for self-play games they used 1,600 simulations, and for 20 blocks that took 400 ms. Now they used 800 simulations and it took 200 ms (table S3). I think it's safe to assume they used the same TPUs in both projects. The current paper says "Unless otherwise specified, the same algorithm settings, network architecture, and hyper-parameters were used for all three games", so it looks like 20 blocks for chess as well.
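The "same TPUs" inference rests on the simulation rates matching, which is easy to check from the two papers' reported numbers:

```python
# Simulations per second implied by each paper's settings.
go_rate = 1600 / 0.400     # AlphaGo Zero: 1,600 sims in 400 ms
chess_rate = 800 / 0.200   # AlphaZero: 800 sims in 200 ms
print(go_rate, chess_rate)  # both 4000.0 sims/s
```

Identical throughput for the same 20-block network is consistent with identical hardware.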
shrapnel
Posts: 1339
Joined: Fri Nov 02, 2012 9:43 am
Location: New Delhi, India

Re: A fair fight

Post by shrapnel »

syzygy wrote:
Werewolf wrote:But what do you think?
I think that it is completely irrelevant whether some AlphaZero prototype is or is not stronger than a perfectly tuned Stockfish setup.

What is important is that, apparently, the general level of play of current top engines can be reached (and most likely be far exceeded) by an approach to computer chess that is completely different than how all leading engines have worked since Claude Shannon wrote the first paper on computer chess.

And this approach is not just completely different from a programming point of view... it does not even need any programming (apart from the initial programming of the AlphaZero software and a bit of cleverness to adapt it to the rules of chess). They just decide how much hardware they want to throw at a problem, they push a button, and some hours later the thing has programmed itself. That is really superhuman.
Finally, a Post by Ronald de Man I agree with.
Very fair-minded, giving AlphaZero the Credit it deserves, unlike most of the old Pros here.
i7 5960X @ 4.1 Ghz, 64 GB G.Skill RipJaws RAM, Twin Asus ROG Strix OC 11 GB Geforce 2080 Tis
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: A fair fight

Post by Michael Sherwin »

shrapnel wrote:
syzygy wrote:
Werewolf wrote:But what do you think?
I think that it is completely irrelevant whether some AlphaZero prototype is or is not stronger than a perfectly tuned Stockfish setup.

What is important is that, apparently, the general level of play of current top engines can be reached (and most likely be far exceeded) by an approach to computer chess that is completely different than how all leading engines have worked since Claude Shannon wrote the first paper on computer chess.

And this approach is not just completely different from a programming point of view... it does not even need any programming (apart from the initial programming of the AlphaZero software and a bit of cleverness to adapt it to the rules of chess). They just decide how much hardware they want to throw at a problem, they push a button, and some hours later the thing has programmed itself. That is really superhuman.
Finally, a Post by Ronald de Man I agree with.
Very fair-minded, giving AlphaZero the Credit it deserves, unlike most of the old Pros here.
AZ's main advantage is twofold: its computing power and its reinforcement learning. If that much computing power were allotted to SF, the results would have been much better for SF. If SF had reinforcement learning and was trained up, it could play 1,000 Elo stronger. You think that is BS?

Robin Smith, now deceased, ran a test of RomiChess vs. several top engines, with the top engines using a truly humongous opening book and Romi's learning turned on. Romi gained 50 Elo for every 5,000 games, and a million games would not saturate against that humongous opening book. So doing the math: 1,000,000 / 5,000 x 50 = a 10,000 Elo gain. That gain would be moderated somehow, I'm sure, but when would that moderation begin? Still think this is BS?

Romi played 20 matches against Glaurung, rated 2700+ at the time, using the 10 Nunn positions. In match 1 Romi scored 5%; by match 20 Romi scored 95%. That series of matches was 400 games using 10 positions, and the performance gain was from -512 to +512 = 1,024 Elo.

The bottom line is that if SF had reinforcement learning and was trained up, it would be at least 4400 Elo. Now I hope you guys can begin to understand that SF is far, far superior to AZ. It is just that SF does not have Romi's reinforcement learning, even though it was made available to them 11 years ago. So if they got trounced by something at least 11 years old, it is not the algorithm's fault; it is the shortsightedness of the programmers. Does anyone understand yet?
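The ±512 figures follow from the standard logistic Elo model, which maps an expected match score to a performance difference:

```python
import math

def elo_from_score(score):
    """Elo performance difference implied by an expected score
    under the standard logistic model:
    elo = -400 * log10(1/score - 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# 5% in match 1 vs. 95% in match 20:
gain = elo_from_score(0.95) - elo_from_score(0.05)
print(elo_from_score(0.05))  # ≈ -511.5
print(gain)                  # ≈ 1023 Elo swing
```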