AlphaZero - Tactactical Abilities

Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: AlphaZero - Tactactical Abilities

Post by Lyudmil Tsvetkov »

hgm wrote:
Lyudmil Tsvetkov wrote:So that, no knowledge in Alpha, it was all outsearching.

It is so funny when people still believe Alpha has achieved some breakthrough. No breakthrough, just tremendous computer power.
So Stockfish was outsearched by an opponent that searched a ~1000 times smaller tree (80 knps for AlphaZero vs 70 Mnps for Stockfish).

Shouldn't that count as a breakthrough? :?
Lyudmil Tsvetkov wrote:But then, the verdict would have been: "It barely beat SF".
That still doesn't sound very bad for something that 9 hours earlier had sub-zero Elo, only knew the rules and was never taught anything to improve it...
How can one have sub-zero Elo?
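
For reference, the node-rate ratio hgm quotes can be checked with simple arithmetic. This is only a sketch using the two figures from the post; the variable names are mine, and it assumes both engines had the same time per move, so the ratio of speeds equals the ratio of tree sizes.

```python
# Figures as quoted in the post above: nodes per second for each engine.
alphazero_nps = 80_000        # 80 knps
stockfish_nps = 70_000_000    # 70 Mnps

# Ratio of search speeds; with equal thinking time this is also the
# ratio of tree sizes.
ratio = stockfish_nps / alphazero_nps
print(f"Stockfish searched about {ratio:.0f}x more nodes per second")
```

The raw ratio is 875, so "~1000 times" is a rounded order of magnitude.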
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: AlphaZero - Tactactical Abilities

Post by Lyudmil Tsvetkov »

syzygy wrote:
Lyudmil Tsvetkov wrote:So they are using evaluation, after all.
Did they not claim their approach has nothing to do with alpha-beta?
Duh?

You have failed the Turing test.
More specifically?
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: AlphaZero - Tactactical Abilities

Post by Lyudmil Tsvetkov »

hgm wrote:NN = neural network.

The NN produces an evaluation, in terms of winning probability (or actually score expectation), and move recommendations for searching.

The NN was trained by showing it positions from the self-play games, and the result of that game, for predicting results from patterns in the position. Initially the network was initialized randomly, but since it recognizes many patterns there will always be some that correlate with winning, and these will then be enhanced during the training. What patterns exactly the fully trained network recognizes is completely unknown, and would be very hard to find out, because the network is humongously large.
So, it is using Monte Carlo only during training, but in actual game play plain alpha-beta with evaluation.
Why do they claim it is not alpha-beta then, but Monte Carlo?

But that is the real question: what those patterns are, and how many.

I guess it is obvious: no one can tune more than 1000 good chess knowledge terms, so either they are tuning fewer, or their patterns are completely meaningless.

But I don't care at all what the patterns of a 2800 engine are, I know that pretty well, looking at an engine like Fruit, for example. Wasn't Fruit at 2800 a decade ago? Those guys are a decade and a half behind in development...
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: AlphaZero - Tactactical Abilities

Post by Lyudmil Tsvetkov »

hgm wrote:This is very precisely described in the AG0 paper. The NN has many layers. The first layer breaks up the board into overlapping 3x3 areas, and in each such area 256 patterns are recognized. But then many layers follow (like 19 or 39), which can recognize 'patterns in the patterns'. In many cases these are no doubt used to build patterns covering a larger area, and eventually entire board rays.
Those are not patterns, this is just random guessing.
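
To illustrate hgm's point about 'patterns in the patterns' growing to cover larger areas, here is a rough sketch of how the window seen by one unit widens as 3x3 layers are stacked. It assumes each layer is a single stride-1 3x3 convolution, which is a simplification of mine and not a description of the actual network; `receptive_field` is a hypothetical helper.

```python
def receptive_field(num_layers: int, kernel: int = 3) -> int:
    """Side length of the square of board squares that one unit can depend
    on after `num_layers` stacked stride-1 convolutions of size `kernel`."""
    return 1 + num_layers * (kernel - 1)

BOARD = 8
for layers in (1, 2, 4, 7, 19, 39):
    side = receptive_field(layers)
    # A unit sitting on a corner square reaches the opposite corner only
    # when its window is at least 2*BOARD - 1 squares wide.
    covers = side >= 2 * BOARD - 1
    print(f"{layers:2d} layers: {side}x{side} window, "
          f"covers the board from any square: {covers}")
```

Under that assumption seven layers already suffice for a unit on any square to be influenced by every other square, so 19 or 39 layers leave plenty of depth for combining patterns along whole rays.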
Ras
Posts: 2488
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: AlphaZero - Tactactical Abilities

Post by Ras »

Lyudmil Tsvetkov wrote:Those are not patterns, this is just random guessing.
At first, it is. That's what gets dealt with in the learning phase. Those guesses that happen to yield good output are reinforced while those with bad output are weakened. Over time, the network will generate better output.

Pretty much like what you did in your childhood when you first touched a hot oven plate. The random network output "good idea to touch that" pretty quickly got weakened. ;-)
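
The "random at first, then reinforced or weakened" idea can be sketched with a toy model. This is not AlphaZero's network or training procedure, just a three-weight logistic model trained on made-up data where one hypothetical "pattern A" decides the result and "pattern B" is noise; all names and numbers are assumptions for illustration.

```python
import math
import random

random.seed(0)

def predict(weights, features):
    """Logistic output: predicted score expectation in [0, 1]."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def sample():
    """Toy position: pattern A decides the game, pattern B is pure noise."""
    a, b = random.choice([0, 1]), random.choice([0, 1])
    result = 1.0 if a == 1 else 0.0
    return [a, b, 1.0], result   # last input is a constant bias term

weights = [random.uniform(-0.1, 0.1) for _ in range(3)]  # random start
rate = 0.5
for _ in range(2000):
    features, result = sample()
    error = result - predict(weights, features)
    # Gradient step on the log loss: a weight is strengthened when its
    # pattern was present in an underestimated win, weakened otherwise.
    weights = [w + rate * error * x for w, x in zip(weights, features)]

print([round(w, 2) for w in weights])
```

After training, the weight on pattern A dominates while the weight on the noise pattern stays comparatively small, which is the "reinforce what works" effect in miniature.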
shrapnel
Posts: 1339
Joined: Fri Nov 02, 2012 9:43 am
Location: New Delhi, India

Re: AlphaZero - Tactactical Abilities

Post by shrapnel »

Ras wrote:Pretty much like what you did in your childhood when you first touched a hot oven plate. The random network output "good idea to touch that" pretty quickly got weakened. ;-)
Nice analogy, but wasted on Tsvetkov as he doesn't do neural networks, only alpha-beta search!
He probably enjoyed touching the Hot Plate so much that he ended up sitting on it to enjoy it better :lol:
hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: AlphaZero - Tactactical Abilities

Post by hgm »

Lyudmil Tsvetkov wrote:So, it is using Monte Carlo only during training, but in actual game play plain alpha-beta with evaluation.
Why do they claim it is not alpha-beta then, but Monte Carlo?
No, they also used (simulated) Monte-Carlo during the matches. Self-play and matches were largely the same, except that the time per move was much longer in the matches.
Lyudmil Tsvetkov wrote:But that is the real question: what those patterns are, and how many.
Indeed, I guess everyone would like to know that. But it will be hard to find out, if it can be done at all. Apparently the NN was able to find patterns that allowed it to outsearch Stockfish with only a fraction of the nodes.
Lyudmil Tsvetkov wrote:I guess it is obvious: no one can tune more than 1000 good chess knowledge terms, so either they are tuning fewer, or their patterns are completely meaningless.
Most of the patterns will indeed be completely meaningless, and the network will then be trained to either alter them into something useful or ignore them. You have to have such 'spare' capacity in an NN to make it sufficiently general. Perhaps these useless patterns would have made all the difference if you had been training it for Go or Draughts.
Lyudmil Tsvetkov wrote:But I don't care at all what the patterns of a 2800 engine are, I know that pretty well, looking at an engine like Fruit, for example. Wasn't Fruit at 2800 a decade ago? Those guys are a decade and a half behind in development...
But good old Fruit cannot outsearch Stockfish with only 1/10 of the nodes... You still seem to think this is about evaluation. It is not. The breakthrough is selective search that (according to you) causes 3500+ Elo play with only a 2800 Elo evaluation.
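
A rough sketch of what "selective search" guided by NN move recommendations can look like. The selection rule below (mean value plus a prior-weighted exploration bonus) follows my recollection of the published AlphaGo Zero description; the constant `c_puct`, the class `Child`, and all numbers are hypothetical and not taken from this thread.

```python
import math
from dataclasses import dataclass

@dataclass
class Child:
    move: str
    prior: float            # NN move recommendation, assumed given
    visits: int = 0
    value_sum: float = 0.0  # sum of evaluations backed up through this move

    def q(self) -> float:
        """Mean evaluation of this move so far (0 if never visited)."""
        return self.value_sum / self.visits if self.visits else 0.0

def select(children, c_puct=1.5):
    """Pick the child maximizing Q + U, where U favors moves with a high
    prior that have been visited rarely."""
    total = sum(c.visits for c in children)

    def score(c):
        u = c_puct * c.prior * math.sqrt(total) / (1 + c.visits)
        return c.q() + u

    return max(children, key=score)

children = [
    Child("e4", prior=0.6, visits=10, value_sum=5.5),
    Child("d4", prior=0.3, visits=2, value_sum=1.2),
    Child("a3", prior=0.1, visits=0),
]
print(select(children).move)  # d4
```

With these toy numbers the next visit goes to d4: it has a decent mean value and few visits, while the low-prior a3 keeps being postponed. That postponement of unpromising moves is what keeps the tree small.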
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: AlphaZero - Tactactical Abilities

Post by Lyudmil Tsvetkov »

Ras wrote:
Lyudmil Tsvetkov wrote:Those are not patterns, this is just random guessing.
At first, it is. That's what gets dealt with in the learning phase. Those guesses that happen to yield good output are reinforced while those with bad output are weakened. Over time, the network will generate better output.

Pretty much like what you did in your childhood when you first touched a hot oven plate. The random network output "good idea to touch that" pretty quickly got weakened. ;-)
Chess is much more complex than touching a hot oven plate.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: AlphaZero - Tactactical Abilities

Post by Lyudmil Tsvetkov »

hgm wrote:
Lyudmil Tsvetkov wrote:So, it is using Monte Carlo only during training, but in actual game play plain alpha-beta with evaluation.
Why do they claim it is not alpha-beta then, but Monte Carlo?
No, they also used (simulated) Monte-Carlo during the matches. Self-play and matches were largely the same, except that the time per move was much longer in the matches.
Lyudmil Tsvetkov wrote:But that is the real question: what those patterns are, and how many.
Indeed, I guess everyone would like to know that. But it will be hard to find out, if it can be done at all. Apparently the NN was able to find patterns that allowed it to outsearch Stockfish with only a fraction of the nodes.
Lyudmil Tsvetkov wrote:I guess it is obvious: no one can tune more than 1000 good chess knowledge terms, so either they are tuning fewer, or their patterns are completely meaningless.
Most of the patterns will indeed be completely meaningless, and the network will then be trained to either alter them into something useful or ignore them. You have to have such 'spare' capacity in an NN to make it sufficiently general. Perhaps these useless patterns would have made all the difference if you had been training it for Go or Draughts.
Lyudmil Tsvetkov wrote:But I don't care at all what the patterns of a 2800 engine are, I know that pretty well, looking at an engine like Fruit, for example. Wasn't Fruit at 2800 a decade ago? Those guys are a decade and a half behind in development...
But good old Fruit cannot outsearch Stockfish with only 1/10 of the nodes... You still seem to think this is about evaluation. It is not. The breakthrough is selective search that (according to you) causes 3500+ Elo play with only a 2800 Elo evaluation.
Isn't alpha-beta precisely the same: simulating play-outs, only that the play-outs end somewhere with a heuristic score instead of a known result?
What would be the cardinal change?

Outsearching was due to hardware, not to selective algorithms.
Btw., what kind of advanced algorithms could they apply in a MC search?

Their NN is obviously BS, but again, they stress their achievement in the NN and not the search.
Why so?
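
For comparison with the question above, here is a minimal alpha-beta sketch on a toy tree. The leaves are plain numbers standing in for a heuristic evaluation; the tree and its scores are made up. It shows the full-width character of alpha-beta: every move at every node is considered unless a cutoff proves it irrelevant, rather than visits being steered by move recommendations.

```python
def alphabeta(node, alpha, beta, maximizing, leaves_seen):
    """Minimax with alpha-beta pruning over a nested-list tree."""
    if not isinstance(node, list):       # leaf: heuristic score
        leaves_seen.append(node)
        return node
    best = float("-inf") if maximizing else float("inf")
    for child in node:
        score = alphabeta(child, alpha, beta, not maximizing, leaves_seen)
        if maximizing:
            best = max(best, score)
            alpha = max(alpha, best)
        else:
            best = min(best, score)
            beta = min(beta, best)
        if alpha >= beta:                # the other side would avoid this line
            break
    return best

tree = [[3, 5], [2, 9], [0, 1]]          # depth 2, hypothetical leaf scores
seen = []
value = alphabeta(tree, float("-inf"), float("inf"), True, seen)
print(value, seen)  # 3 [3, 5, 2, 0]
```

Here four of the six leaves are evaluated and the root value is 3. The pruning is exact: it never changes the minimax result, it only skips leaves that cannot affect it.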
Ras
Posts: 2488
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: AlphaZero - Tactactical Abilities

Post by Ras »

Lyudmil Tsvetkov wrote:Chess is much more complex than touching a hot oven plate.
Completely irrelevant to the argument, please re-read.