bob wrote: The question is, where is the improvement? We get a fail high, or a fail low. How does knowing a more accurate bound help?
Take the root search function, for example. After searching the first move (YBW), you spawn a search of the second root move with a null window, i.e. alpha = best_score, beta = best_score + 1. This search returns a score of best_score + 10 (a fail-soft fail high). When you then do the full-window search to find the actual score of this move, you can spawn the new search with a window of [best_score + 10, +INFINITY], instead of getting back only best_score + 1 and re-searching with a window of [best_score + 1, +INFINITY].
The latter is what happens with fail-hard; the former is what you can do with fail-soft. Smaller window, faster search.
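The window arithmetic above can be sketched as follows. This is a minimal illustration, not engine code; the helper name and the scores are made up to match the example in the post:

```python
INF = 10_000  # stand-in for +INFINITY

def research_window(probe_score, best_score, fail_soft):
    """Window for the full re-search after a null-window fail high.

    Under fail-soft, the probe's return value is itself a valid lower
    bound, so the re-search window can start there. Under fail-hard,
    the probe only tells us "score >= best_score + 1".
    """
    if fail_soft:
        return (probe_score, INF)   # e.g. [best_score + 10, +INF]
    return (best_score + 1, INF)    # [best_score + 1, +INF]

# Null-window probe [100, 101] of the second root move returns 110:
assert research_window(110, 100, fail_soft=True) == (110, INF)
assert research_window(110, 100, fail_soft=False) == (101, INF)
```

The fail-soft window starts higher, so the re-search has less work to do.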
That's not to mention the more accurate entries in the TT which can cause some additional hash hits.
How does this not help?
The extra information you get from fail-soft is based on guesses when futility pruning and/or delta pruning is used. Those are not real bounds; they are estimates based on safety margins. That's where the danger lies: you may be trusting them to avoid re-searches. That is ok within one iteration, but if they get stuck in the hash table after being propagated, the search could be seriously misled.
As you said, with fail-soft you have extra information; the problem is that if you are not _really_ careful, that extra information is not reliable.
Miguel
Stockfish uses both futility pruning and fail-soft, according to a quick look I just took, and it seems to be doing fine for itself...
That has nothing to do with this thread, but incidentally, it proves my point. They had to be _very_ careful. In fact, Tord had been bitten by a bug in the past for not using fail-hard.
A note on the PVS re-search after a zero-window fail high:
If the fail-high score is x >= alpha + 1 and x < beta, a re-search with a widened window is needed; it should be [x - 1, beta] and not [x, beta], to catch the rare exception where the exact score is exactly x. With [x, beta], the re-search will fail low and there will be no PV line to back up.
Chan Rasjid wrote: A note on the PVS re-search after a zero-window fail high:
If the fail-high score is x >= alpha + 1 and x < beta, a re-search with a widened window is needed; it should be [x - 1, beta] and not [x, beta], to catch the rare exception where the exact score is exactly x. With [x, beta], the re-search will fail low and there will be no PV line to back up.
Rasjid.
That's not the only valid way to do it. It is more efficient to immediately record a best move with score x and then start the re-search with [x, beta].
If the score was actually x, the re-search will fail low, but you will already have the correct score recorded with the best move. You won't get a PV, true, so you may prefer the [x - 1, beta] approach if the PV is important to you.
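The two re-search strategies discussed here can be contrasted in a small sketch (a hypothetical helper, not from any engine; x is the fail-soft score of the null-window search, with alpha < x < beta):

```python
def research_bounds(x, beta, want_pv):
    """Pick the re-search window after a null-window fail high at score x.

    want_pv=True:  widen to [x - 1, beta]; if the true score is exactly
                   x, the re-search still returns it as an exact score
                   and a PV is backed up.
    want_pv=False: record the move with score x first, then re-search
                   [x, beta]; if the true score is exactly x, the
                   re-search fails low, but the recorded score is
                   already correct (no PV, though).
    """
    return (x - 1, beta) if want_pv else (x, beta)

assert research_bounds(110, 200, want_pv=True) == (109, 200)
assert research_bounds(110, 200, want_pv=False) == (110, 200)
```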
bob wrote:
All I will add here is that this is _far_ more complicated than you realize. For just one example, out of many (in addition to what MB just posted) is what do you do with lazy evaluation? If the score is way outside the window, and you decide to return before doing the full eval, what do you return?
...
One reason why chess programming is such "addictive" fun is that it offers the challenge of getting a bug-free program. Whether it is fail-soft or fail-hard will not make a chess engine any less bug-free.
Your example of lazy evaluation has no single proper treatment: in itself it is a great risk if the returned value depends on the current alpha-beta bounds, since it will be corrupt if reused on later visits to the same node (it cannot be put into the TT). So only the programmer knows what risks to take, and that has nothing to do with fail-hard/soft. The same goes for all the other issues that can cause problems.
But each tool determines different kinds of things you have to be careful with.
Miguel
The only time chess programming is clear-cut is when there is no pruning, no depth extension/reduction, etc., just plain alpha-beta, as you are well aware.
bob wrote:
All I will add here is that this is _far_ more complicated than you realize. For just one example, out of many (in addition to what MB just posted) is what do you do with lazy evaluation? If the score is way outside the window, and you decide to return before doing the full eval, what do you return? Surely not just alpha or beta if you want fail-soft to work at all? Do you return the estimated score which can be way off the final mark? When you do a futility prune, what do you return? Reduced depth searches? Null-move searches? fail-soft is not as simple as it seems, and most end up implementing it in a flawed way to work around some of those issues. I've not looked at stockfish, but will try to do so later today when I have time, just to see what they are doing...
As I said earlier in the thread: you return as high or as low a score as you know to be accurate. No more and no less.
Which would be exactly what, when you have tons of errors from pruning moves, reducing moves, partially evaluating positions, ignoring positional scores and using material-only ones, etc.? It is not quite so simple as just replacing all the "return (alpha)" or "return (beta)" statements with "return (value)". It's more complicated than that. If you look at Crafty, you will see vestiges of fail-soft, but Crafty is not fail-soft by any measure. I simply didn't like some of the side-effects and decided to take the more straightforward approach of fail-hard, which removed some artifacts that were giving me issues here and there.
If your lazy evaluation doesn't make mistakes, that means you know the safety margin, so use it to get the maximum/minimum possible score; it's just a subtraction. If it makes mistakes, then that's outside the scope of fail-soft. But if you're really afraid of what would happen here, you can still use a half-assed fail-soft and return beta or alpha when you take a lazy-evaluation exit. That wouldn't negate all the benefit of fail-soft, just some of it.
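The "just a subtraction" point can be made concrete. Assuming the margin really does bound what the non-material evaluation terms can add or subtract (a big assumption, as discussed above), a lazy exit can still return a fail-soft bound; this sketch uses made-up numbers:

```python
MARGIN = 200  # assumed sound bound on the non-material eval terms

def lazy_eval(material, alpha, beta, full_eval):
    """Lazy evaluation that stays fail-soft.

    If the material score is so far outside the window that the full
    evaluation cannot change the verdict, return the tightest bound we
    can actually guarantee, not a bare beta or alpha.
    """
    if material - MARGIN >= beta:
        return material - MARGIN    # guaranteed lower bound, >= beta
    if material + MARGIN <= alpha:
        return material + MARGIN    # guaranteed upper bound, <= alpha
    return full_eval()              # no shortcut: evaluate fully

# Material +900 against window [0, 100]: 900 - 200 = 700 >= beta,
# so we can return 700, a tighter bound than fail-hard's 100.
assert lazy_eval(900, 0, 100, lambda: 0) == 700
assert lazy_eval(-900, 0, 100, lambda: 0) == -700
assert lazy_eval(50, 0, 100, lambda: 42) == 42
```

If the margin is not sound, the returned bound is wrong, which is exactly the risk raised earlier in the thread.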
How can one have a lazy eval that doesn't make mistakes? How can one have forward-pruning that doesn't make mistakes? Or reductions? etc... But storing the mistakes can compound them quickly, rather than letting the mistake die as it is made.
Your lazy evaluation can be sound or unsound depending on the safety margin you use and your evaluation function, I'm sure this is nothing new for you...
As I said before, you don't need to always return the best possible bound. As long as you return scores higher than beta on fail-highs (many people already do this) and lower than alpha on fail-lows, you will already get some of the benefit of fail-soft. I don't think many people would use fail-hard on checkmate/draw scores, so that almost goes without saying...
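The difference between the two conventions can be seen in a toy negamax over game trees written as nested lists (leaves are scores from the side to move at the leaf; purely illustrative, not engine code):

```python
def negamax(node, alpha, beta, fail_soft=True):
    """Toy negamax. Fail-soft returns the best score found even when it
    lies outside [alpha, beta]; fail-hard clamps it to the window."""
    if isinstance(node, int):
        return node
    a0, best = alpha, -10_000
    for child in node:
        score = -negamax(child, -beta, -alpha, fail_soft)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break                       # beta cutoff
    return best if fail_soft else max(a0, min(best, beta))

# One node, two replies worth 10 and 3 to us; window [0, 5]:
assert negamax([-10, -3], 0, 5, fail_soft=True) == 10   # score above beta
assert negamax([-10, -3], 0, 5, fail_soft=False) == 5   # clamped to beta
```

The fail-soft return of 10 tells the caller how far above beta the node really was, which is the extra information this whole thread is arguing about.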
Maybe it would be a good thing to test on your cluster if you ever feel like it.
rbarreira wrote:Your lazy evaluation can be sound or unsound depending on the safety margin you use and your evaluation function, I'm sure this is nothing new for you...
Of course it is not new. And the more accurate you make it, the fewer lazy-eval exits you get, until you get it perfectly accurate and give up on lazy eval completely.
As I said before, you don't need to always return the best possible bound. As long as you can return scores higher than beta on fail-highs (many people already do this) and lower than alpha on fail-lows, you will already get some of the benefit of fail-soft.
Maybe it would be a good thing to test on your cluster if you ever feel like it.
I did, a couple of years back. Not to say there were no bugs; fail-soft is not that hard to implement, but it was quite tricky to get all the bugs out. I'll see what I can do to test this when nothing else is in the pipe...