* vs +
Moderators: bob, hgm, Harvey Williamson
Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
 stegemma
 Posts: 859
 Joined: Mon Aug 10, 2009 8:05 pm
 Location: Italy
 Full name: Stefano Gemma
 Contact:
* vs +
Anybody has try the idea to multiply parameters instead of add? The idea is to use parameters as a probability to win, when some feature occurs. For sample, if I find that having a passed pawn lead to victory in 110% of the times, the value of the position can be multiply by 1.10. If I have a doubled pawn (and maybe some other event occurs) I can multiply positional value by 0.95 and so on. This maybe could give a bigger interrelation between parameters than just adding them.
I would like to try this on the next version of satana, using the random move generator to collect statistics... but I'm afraid that surely dott. Hyatt have already try in '60 before I was born
I would like to try this on the next version of satana, using the random move generator to collect statistics... but I'm afraid that surely dott. Hyatt have already try in '60 before I was born
 hgm
 Posts: 23718
 Joined: Fri Mar 10, 2006 9:06 am
 Location: Amsterdam
 Full name: H G Muller
 Contact:
Re: * vs +
Well, multiplying numbers is equivalent adding their logarithms. So it isn't really different. You just have to apply some logarithmic (i.e. saturating) correction to the evaluation.
Not that it matters very much, though, as any monotonous mapping of the scores will eventually give the same search. It only matters if you compare scores that have a different mapping. Or apply a fixed additive delayedloss bonus to them. This actually makes some silly behavior (like sacrificing a Rook on ply 1 to avoid the loss of a Queen on ply 20) go away.
Not that it matters very much, though, as any monotonous mapping of the scores will eventually give the same search. It only matters if you compare scores that have a different mapping. Or apply a fixed additive delayedloss bonus to them. This actually makes some silly behavior (like sacrificing a Rook on ply 1 to avoid the loss of a Queen on ply 20) go away.
 hgm
 Posts: 23718
 Joined: Fri Mar 10, 2006 9:06 am
 Location: Amsterdam
 Full name: H G Muller
 Contact:
Re: * vs +
Well, multiplying numbers is equivalent adding their logarithms. So it isn't really different. You just have to apply some logarithmic (i.e. saturating) correction to the evaluation.
Not that it matters very much, though, as any monotonous mapping of the scores will eventually give the same search. It only matters if you compare scores that have a different mapping. Or apply a fixed additive delayedloss bonus to them. This actually makes some silly behavior (like sacrificing a Rook on ply 1 to avoid the loss of a Queen on ply 20) go away.
Not that it matters very much, though, as any monotonous mapping of the scores will eventually give the same search. It only matters if you compare scores that have a different mapping. Or apply a fixed additive delayedloss bonus to them. This actually makes some silly behavior (like sacrificing a Rook on ply 1 to avoid the loss of a Queen on ply 20) go away.
Re: * vs +
Multiplication is the same as adding the logarithms, so I think it only changes the scale of representation and not the relative ordering of positions. Unless you want to mix the operators. In a way we are already comparing probabilities, except that we silently map them onto a piece value scale using the (inverse of the) logistic function. In the range where games get won or lost (say, 2p to +2p), this mapping is almost linear. Outside that range it really doesn't matter what you do.stegemma wrote:Anybody has try the idea to multiply parameters instead of add?
[Account deleted]
 stegemma
 Posts: 859
 Joined: Mon Aug 10, 2009 8:05 pm
 Location: Italy
 Full name: Stefano Gemma
 Contact:
Re: * vs +
Ah ok, that's right, I've not think from this point of view. Maybe it could be done if you have to start from the probability of a win, computed at runtime for any parameter in actual position. It is an experiment that I want to do but only hell knows what it leads to... and hell is the house of satana!hgm wrote:Well, multiplying numbers is equivalent adding their logarithms. So it isn't really different. You just have to apply some logarithmic (i.e. saturating) correction to the evaluation.
Not that it matters very much, though, as any monotonous mapping of the scores will eventually give the same search. It only matters if you compare scores that have a different mapping. Or apply a fixed additive delayedloss bonus to them. This actually makes some silly behavior (like sacrificing a Rook on ply 1 to avoid the loss of a Queen on ply 20) go away.
Re: * vs +
Old Yamaha FM chips used this trick so they added logarithms and finally trasnformed the signal back using exptable lookuphgm wrote:Well, multiplying numbers is equivalent adding their logarithms.
So actually they did the opposite to avoid multiplication.

 Posts: 793
 Joined: Sun Aug 03, 2014 2:48 am
 Location: London, UK
 Contact:
Re: * vs +
I am doing exactly that with my engine (Giraffe, based on machine learning).stegemma wrote:Ah ok, that's right, I've not think from this point of view. Maybe it could be done if you have to start from the probability of a win, computed at runtime for any parameter in actual position. It is an experiment that I want to do but only hell knows what it leads to... and hell is the house of satana!hgm wrote:Well, multiplying numbers is equivalent adding their logarithms. So it isn't really different. You just have to apply some logarithmic (i.e. saturating) correction to the evaluation.
Not that it matters very much, though, as any monotonous mapping of the scores will eventually give the same search. It only matters if you compare scores that have a different mapping. Or apply a fixed additive delayedloss bonus to them. This actually makes some silly behavior (like sacrificing a Rook on ply 1 to avoid the loss of a Queen on ply 20) go away.
Probabilities can only be multiplied if they are independent, though, and most won't be.
For example, from an equal position, a queen odd would be worth almost infinity (you are almost surely going to win regardless of what the current eval is). But from a KPPPPvK situation, it would be worth about 1 (doesn't change the probability of winning, since the probability of winning was close to 1 to begin with).
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
Re: * vs +
"+" or "*" equal for represent, but "+" much easy calculate for processor.stegemma wrote:Anybody has try the idea to multiply parameters instead of add? The idea is to use parameters as a probability to win, when some feature occurs. For sample, if I find that having a passed pawn lead to victory in 110% of the times, the value of the position can be multiply by 1.10. If I have a doubled pawn (and maybe some other event occurs) I can multiply positional value by 0.95 and so on. This maybe could give a bigger interrelation between parameters than just adding them.
I would like to try this on the next version of satana, using the random move generator to collect statistics... but I'm afraid that surely dott. Hyatt have already try in '60 before I was born
 stegemma
 Posts: 859
 Joined: Mon Aug 10, 2009 8:05 pm
 Location: Italy
 Full name: Stefano Gemma
 Contact:
Re: * vs +
Thanks, I've missed your interesting post and today I've read it and played a little match. Your approach is very interesting, I've try with neural networks in the past, with no valid results, only something similar to random playing.matthewlai wrote:I am doing exactly that with my engine (Giraffe, based on machine learning).stegemma wrote:Ah ok, that's right, I've not think from this point of view. Maybe it could be done if you have to start from the probability of a win, computed at runtime for any parameter in actual position. It is an experiment that I want to do but only hell knows what it leads to... and hell is the house of satana!hgm wrote:Well, multiplying numbers is equivalent adding their logarithms. So it isn't really different. You just have to apply some logarithmic (i.e. saturating) correction to the evaluation.
Not that it matters very much, though, as any monotonous mapping of the scores will eventually give the same search. It only matters if you compare scores that have a different mapping. Or apply a fixed additive delayedloss bonus to them. This actually makes some silly behavior (like sacrificing a Rook on ply 1 to avoid the loss of a Queen on ply 20) go away.
Probabilities can only be multiplied if they are independent, though, and most won't be.
For example, from an equal position, a queen odd would be worth almost infinity (you are almost surely going to win regardless of what the current eval is). But from a KPPPPvK situation, it would be worth about 1 (doesn't change the probability of winning, since the probability of winning was close to 1 to begin with).
I think that a single neural networks is not enough, for chess. Maybe many neural networks should be used, one for any aspect of the game. of course your work is still impressive and I hope that it would be only a starting point for you.

 Posts: 793
 Joined: Sun Aug 03, 2014 2:48 am
 Location: London, UK
 Contact:
Re: * vs +
Yeah it's really all about feature representation. A lot of times the intuitive representation is far from the best. People often choose the most intuitive representation without much thinking, and wonder why neural nets don't work well (and it's really not surprising that they don't work well).stegemma wrote:Thanks, I've missed your interesting post and today I've read it and played a little match. Your approach is very interesting, I've try with neural networks in the past, with no valid results, only something similar to random playing.matthewlai wrote:I am doing exactly that with my engine (Giraffe, based on machine learning).stegemma wrote:Ah ok, that's right, I've not think from this point of view. Maybe it could be done if you have to start from the probability of a win, computed at runtime for any parameter in actual position. It is an experiment that I want to do but only hell knows what it leads to... and hell is the house of satana!hgm wrote:Well, multiplying numbers is equivalent adding their logarithms. So it isn't really different. You just have to apply some logarithmic (i.e. saturating) correction to the evaluation.
Not that it matters very much, though, as any monotonous mapping of the scores will eventually give the same search. It only matters if you compare scores that have a different mapping. Or apply a fixed additive delayedloss bonus to them. This actually makes some silly behavior (like sacrificing a Rook on ply 1 to avoid the loss of a Queen on ply 20) go away.
Probabilities can only be multiplied if they are independent, though, and most won't be.
For example, from an equal position, a queen odd would be worth almost infinity (you are almost surely going to win regardless of what the current eval is). But from a KPPPPvK situation, it would be worth about 1 (doesn't change the probability of winning, since the probability of winning was close to 1 to begin with).
I think that a single neural networks is not enough, for chess. Maybe many neural networks should be used, one for any aspect of the game. of course your work is still impressive and I hope that it would be only a starting point for you.
For example, for chess, many people represent the board as either 64 features each describing a square. Or, slightly but not much better, as a bunch of bitboards.
That works very poorly because neural networks require the feature space to be more or less smooth to work well, and that representation is not smooth at all (smooth here means similar inputs should produce similar outputs, most of the time).
A much better way to represent the board is as a list of piece coordinates.
Taking a single knight as an example (trying to have the network learn the piecesquare table) 
If we represent it as a 64dimensional input with only 1 feature set (where the knight is), it will essentially have to learn a lookup table. This is bad because it won't be able to generalize at all (if it learns that E4, E5, D4 are all good places for a knight to be, it won't be able to generalize that knowledge to D5). There's nothing telling the neural net that the output for D4 is probably much closer to E4 than it is to A1.
On the other hand, a knight can also be represented by which file and rank it's on. That's a much better representation because the piecesq table can be reasonably approximated using a few piecewise linear functions in 2D. So this representation will generalize much better. This is because distance between 2 points on the 2D input space corresponds to distance in eval.
That is the mistake I think most people made when trying to use neural nets to play chess. Unfortunately, trivial feature representations aren't going to work in practice with neural nets, except for very trivial toy problems (and chess is certainly not trivial!). Also unfortunately, many machine learning learning sources don't make that fact abundantly clear.
Using multiple nets is one solution to the issue of requiring different eval for different phases. However, it's got a very high computational cost since a lot of work would have to be done multiple times, so I'm trying to avoid that if possible. If you are interested in multinet approaches, a very cool method is known as Mixture of Experts (http://www.cs.bham.ac.uk/~jxb/INC/l19.pdf), where n nets are trained to produce eval, and another net is trained to decide how to combine their outputs given the position.
The cool thing about that is the whole system can be trained together. You don't have to decide which net is responsible for what kind of positions. As training goes on, the nets will automatically specialize into niches.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.