Why minimax is fundamentally flawed
Moderators: hgm, Rebel, chrisw
-
- Posts: 411
- Joined: Thu Dec 30, 2010 4:48 am
Re: Why minimax is fundamentally flawed
You could reduce leaf node positional evals as you back up as well beyond a certain threshold. There's not really a problem with say 2.00 meaning 2.05 in 5 moves. It might help a little with seeking progress in a timely manner when combined with hash grafting provided the local optimum you're stuck in is not too steep.
-
- Posts: 5577
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Why minimax is fundamentally flawed
Feel free to come up with an alternative to the "fundamentally flawed" minimax approach that comes close in performance to current engines.hgm wrote:That is a daring claim, for something you have not tested and goes against logic at the same time...syzygy wrote:Theoretically there may be cases where it goes wrong. In practice this seems to occur so rarely that I am confident that any "solution" (replace the minimax principle by somthing else?) would negatively affect playing strength.
In the meantime, I am interested in examples of top engines falling into this trap.
-
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Why minimax is fundamentally flawed
If I recall correctly, you do need to change the alpha/beta bounds at the top of the search, right (I forgot which way it goes, but I can probably reconstruct that if I think about it for more than 10 seconds)?
Otherwise, this is just the same sort of adjustment you do for a mate score, right?
Otherwise, this is just the same sort of adjustment you do for a mate score, right?
-
- Posts: 27894
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Why minimax is fundamentally flawed
I already did. Years ago, in fact. It is called 'delayed-loss bonus'...syzygy wrote:Feel free to come up with an alternative to the "fundamentally flawed" minimax approach that comes close in performance to current engines.
-
- Posts: 27894
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Why minimax is fundamentally flawed
Indeed, whenever you adjust a score afterwards, you must make sure it cannot enter the search window (being mistaken for an exact score, or a bound of the opposite type as it really was). So you have to imply the inverse operation to the window bounds. If all scores < curEval will be incremented by one point, you will have to decrement all window bounds that were negative by one point. So that a score that will be brought in-window by the increment, will already have been searched within the window initially.Evert wrote:If I recall correctly, you do need to change the alpha/beta bounds at the top of the search, right (I forgot which way it goes, but I can probably reconstruct that if I think about it for more than 10 seconds)?
Otherwise, this is just the same sort of adjustment you do for a mate score, right?
So Fairy-Max has
Code: Select all
Search(alpha, beta)
{
int curEval = Eval();
if(alpha < curEval) alpha--;
if(beta <= curEval) beta--;
// usual Search code
...
if(bestScore < curEval) bestScore++;
return bestScore;
}
-
- Posts: 589
- Joined: Tue Jun 04, 2013 10:15 pm
Re: Why minimax is fundamentally flawed
Coincidentally, I had that in Rookie v2, 1997-1998 era. I don't have it in the v3 rewrite anymore because it costs several hard to predict branches per node (and it is difficult to understand). In fact, I dropped 'beta' from cut(), all() and qsearch(), and effectively just pass ~alpha to the children.hgm wrote:I already did. Years ago, in fact. It is called 'delayed-loss bonus'...syzygy wrote:Feel free to come up with an alternative to the "fundamentally flawed" minimax approach that comes close in performance to current engines.
In v3, in regular positions, I don't observe the horizon problems that it helped solve in v2, I believe because v3 has far better evaluation. The end games with few pieces I intend to solve with DTZ tables, and after that remove the mopup evaluation. For next week I will just dump all of KRK into the opening book, as I don't feel like touching the engine on such short notice... Just in case it has to play out KRK against Spartacus next week...
Regarding resigning not gaining elo: for me it is a matter of courtesy. Online, the opponent is more likely to play more games if you resign on time, and when you can offer and accept draws at reasonable times. More games means more learning. Learning caused Rookie to have a book exit of +2.8 against Spartacus yesterday. With Black it might have been a draw otherwise. At least one player (not me...) in the Leiden tournaments now has a "Joker mode" that speeds up the moves in a won position, because he got upset having to play out a long and obvious win all the way to checkmate
[Account deleted]
-
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: Why minimax is fundamentally flawed
We had a discussion a long while ago about this. That is what I do in Gaviota with the checkmates, and that is the only way I have even done it. Then, there is no need to manipulate whatever is in the hash table at all. I was always surprised that not everybody did it that way, but I guess it is a matter of perspective.hgm wrote:Indeed, whenever you adjust a score afterwards, you must make sure it cannot enter the search window (being mistaken for an exact score, or a bound of the opposite type as it really was). So you have to imply the inverse operation to the window bounds. If all scores < curEval will be incremented by one point, you will have to decrement all window bounds that were negative by one point. So that a score that will be brought in-window by the increment, will already have been searched within the window initially.Evert wrote:If I recall correctly, you do need to change the alpha/beta bounds at the top of the search, right (I forgot which way it goes, but I can probably reconstruct that if I think about it for more than 10 seconds)?
Otherwise, this is just the same sort of adjustment you do for a mate score, right?
So Fairy-Max has
This converts the mate score to DTM as a side effect (as mate scores are always larger than curEval and mated scores always smaller). It causes mate-distance pruning as another side effect, as when a mate is already found in another branch, setting alpha and beta, the pre-decrement of beta will push it below -INF in branches that reach beyond the already found mate, so that the initialization bestScore = -INF already causes a beta cutoff.Code: Select all
Search(alpha, beta) { int curEval = Eval(); if(alpha < curEval) alpha--; if(beta <= curEval) beta--; // usual Search code ... if(bestScore < curEval) bestScore++; return bestScore; }
Miguel
-
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: Why minimax is fundamentally flawed
Exactly. (blue color is mine).hgm wrote:Actually it is exactly the opposite. This is not really path-dependent info. That is, it does not depend on the path through which you reach the node from the root. What you hash is the path-independent eval score from the leaves, corrected according to the path between the leaves and the current node.
That is an intrinsic property of the position; each time you would encounter that same position, you would sprout the same sub-tree from it, which would apply the same correction to the score that was backed up from the leaf to the current node. There would never be any consistency. The inconsistencies occur when you start including info on the path between the root and the note (like distance to the root).
Micro-Max relies for the scoring of the mate distance entirely on the delayed-loss bonus. (On the path to a mate the score is of course always much higher than the current evaluation, so every move the opponent can delay the mate gives him the bonus, counting the length of the path.) There never is any need to correct what you store or retrieve from the hash table. It is automatically correct, no matter in what level of the tree you would hit on it.
Miguel
-
- Posts: 27894
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Why minimax is fundamentally flawed
Actually I don't think it costs any branches at all. The way Fairy-Max writes it is different from the code I gave above for educational purposes. What it really does is:mvk wrote:Coincidentally, I had that in Rookie v2, 1997-1998 era. I don't have it in the v3 rewrite anymore because it costs several hard to predict branches per node (and it is difficult to understand).
Code: Select all
alpha -= (alpha < curEval);
beta -= (beta <= curEval);
...
return bestScore + (bestScore < curEval);
Personally I have always found it easier to understand than other methods. Future gains are devaluated, just as in my bank account.
In Usurpator II the delayed-loss bonus was not just a point, but a hefty 12.5% of the score. As mate was worth about 40, that meant it would only postpone a checkmate 1 ply to grab a Rook. Of course I was already very happy if it could see a mate in 3, so the discount was not applied very often along the branch! The problem was of course that when you add an easy 1/8 bonus to the score, you have to pre-correct the search window by multiplying with a non-trivial 8/7. And the rounding involved in my approximations to calculate that fast would of course cause the most horendous oversights, because they did in fact allow upper bounds to be corrected to scores that would be considered exact. I really had to make absolutely sure that 8/7*alpha was always rounded down and 8/7*beta always rounded up. In the end I got so fed up with it that I just multiplied it with 5/4, I think.
Code: Select all
PAT jsr EVAL
jmp RET
CUTOFF ldx MAXVAL
ldy MAXVAL+1
RET stx DYNVAL
sty DYNVAL+1
bit DYNVAL+1
bpl STS2
tya
lsr a
ror DYNVAL
lsr a
ror DYNVAL
lsr a
ror DYNVAL
ora #$E0
tay
txa
sec
sbc DYNVAL
sta DYNVAL
lda DYNVAL+1
sty DYNVAL+1
sbc DYNVAL+1
sta DYNVAL+1
STS2 rts
-
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Why minimax is fundamentally flawed
Thinking about it, I wonder if this is really a problem with minimax. We use minimax to find the best move in a position, and we define the best move to be the one that leads to a position with the best score (for us, obviously).
So far so good, but we haven't defined what we mean by "best score". If we define it to be "highest score at a leaf node", that's typically what everyone uses, but already for mate scores it's clear that this is not the best way to do it. A better definition is "the best move is the move that leads to the best score as quickly as possible", and that is what the delay bonus represents. That's not so much correcting a flaw in minimax as in our definition of "best move" and "best score".
Probably doesn't make much difference in practice though.
So far so good, but we haven't defined what we mean by "best score". If we define it to be "highest score at a leaf node", that's typically what everyone uses, but already for mate scores it's clear that this is not the best way to do it. A better definition is "the best move is the move that leads to the best score as quickly as possible", and that is what the delay bonus represents. That's not so much correcting a flaw in minimax as in our definition of "best move" and "best score".
Probably doesn't make much difference in practice though.