Delayed-loss-bonus discussion goes here

hgm
Posts: 23718
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Delayed-loss-bonus discussion goes here

Post by hgm » Fri Sep 28, 2007 7:18 am

bob wrote:
hgm wrote:
bob wrote:I don't like the idea of mucking with the scores. You are essentially saying it is better to win something _now_ than to postpone it a bit. That is contrary to sound practice in chess, where the best idea is to take the material at the most opportune moment, which isn't dependent on where we are in the tree.
And the 'opportune moment' is defined as the moment where you cannot drive up the eventual score any further by postponing the gain...

Which is exactly when this algorithm will decide to cash in. As long as other moves are significantly better, the 1 cP delay bonus does not even compete with the noise.
Not if it works as you explain. You have a capture that you can make now, or in 4 moves. If you do the capture right now, your opponent will get a 3 centipawn advantage. If you push it off and make the capture in 4 moves, and defend against the opponent's positional threat, he gets no advantage.

Without mucking with the score, you will defer the capture and end up with a score of 0.01... if you do muck with the score, which makes you make the capture now, you end up in 3 moves with a score of -0.03... that isn't a lot, but if it isn't important, why do I have centipawn resolution in my scoring. Uri's idea eliminated this, because the "urgency" score was less than one centipawn, so that you _never_ let the urgency score push you into making a move that gives up any positional edge that can be preserved by delaying the capture...
The fact that other scores are non-exact, but might change on deeper search, seems all the more reason to apply the 'no-detours' principle to these other scores. Gains that are nearby have been verified to much larger depth than the same gain very deep in the tree.
His comment is based on the idea that this is unsound play. If you are in a dead drawn position, then trading down is a bad idea if you have more material or some slight winning chances, because it makes your opponent's task of drawing you easier.
What's that got to do with it??? Trading is not a gain. Going from a dead drawn position to an even more dead drawn position is not a gain. Even a 'winning' trade (like giving 2P for N in KNPPKNP) is not a gain if it is an easier draw after that trade. If your evaluation says it is, it should receive a major overhaul. But don't blame it on the search!
Give me a break. A draw is a draw. A drawn KRP vs KR is no more or less drawn than a KR vs KR. So the search is irrelevant. The evaluation is irrelevant. But if you trick your evaluation to thinking that KRP vs KR is a "better draw" (for the krp side) than KR vs KR, then you have my "swindle mode". It has _zero_ to do with search. It has everything to do with giving my opponent the best opportunity to make a mistake, and mistakes are far easier in KRP vs KR than in KR vs KR. I guess, therefore, that we are somehow talking apples to oranges again, since your comment doesn't fit in here.
But in normal positions, delaying captures or favoring captures is not what humans normally do. I can recall many games annotated by humans where the GM will say "white cashed in and took the pawn too quickly and dissipated his advantage" or something similar. I certainly don't want to rush a capture by 4 plies (+4 centipawns) and to do so give up 3 centipawns of positional score elsewhere, if I can delay the capture a couple of moves and not give up the 3 centipawns at all...
Well, +4 cP is actually 8 plies. Have you ever measured how many CP the score of a position typically changes ply by ply with increasing search depth? It seems to me that 3cP would hardly beat the noise.
You are completely missing the point. Sometimes, by deferring a capture, I can hold on to a positional edge because the piece I would use to capture has a second function that helps me. If I use it up first, then I might not be able to hold on to the positional edge.

I prefer to let the search run around in the tree space and maximize the evaluations returned. In my world, RxR Nc3 Nf6 is the same as Nc3 Nf6 RxR, and trying to artificially favor one over the other, when the positions are absolutely identical, is what I call "mucking with the score".
I am sure there are also many games where a GM "failed to cash in on his better position in time, after which the opponent escaped". Like other eval parameters, the delayed-loss bonus is a tunable device that allows you to tune your engine to the fine line between being too greedy and being too indecisive. It is perfectly possible to even make it dependent on search depth, and give a smaller bonus if the 'negative' score was obtained by a deeper search.
So now you must be saying _your_ search is no good. I hope mine doesn't find a way to win something and then lose the way as the game progresses. If it does, that is a search problem, not an evaluation problem. And I want to fix the problem where it is, not somewhere else.

Delaying a loss might well be OK. I have a "swindle mode" that delays simplified draws when I am material ahead. But using it to encourage quicker captures is certainly wrong. It would seem to me it will encourage you to trade at every opportunity since the longer you wait the worse the score. I don't want to liquidate the center just because I can, I want to wait and use the tension to help me further whatever plan I am following.
Well, in a symmetric search it works both ways. If delaying a loss is good for one side, the other side will automatically try to speed up the loss to take that advantage away.

From what you say here it seems you didn't get the point at all. Trading down is _not_ a gain. Capturing is only encouraged by this scheme if there is no (or not enough) recapture. Otherwise there is no effect.
Should I show you a position where you trade a pair of rooks, and win a pawn, and lose the game instantly???
As the discussion of whether it is desirable to use a non-negligible delayed-loss bonus on non-mating scores is really a completely independent issue from the correctness of the alpha-beta implementation for this technique, and how it affects transposition-table probing/storing, I decided to create a separate thread to continue it.

hgm
Posts: 23718
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Delayed-loss-bonus discussion goes here

Post by hgm » Fri Sep 28, 2007 8:51 am

bob wrote:Not if it works as you explain. You have a capture that you can make now, or in 4 moves. If you do the capture right now, your opponent will get a 3 centipawn advantage. If you push it off and make the capture in 4 moves, and defend against the opponent's positional threat, he gets no advantage.
The point is that I do not consider a gain of 3 cP a real gain. The average change the scores will suffer on deeper search will typically be larger than that. The probability that the 3cP advantage is truly better than the immediate capture might be only 51%. If you think score differences of ~1cP are significant, it just means that the quantization of scores is too coarse, and you should switch to a milliPawn scale. Note that Joker actually uses Pawn = 256 internally (because I like round figures. :lol: ).
Without mucking with the score, you will defer the capture and end up with a score of 0.01... if you do muck with the score, which makes you make the capture now, you end up in 3 moves with a score of -0.03... that isn't a lot, but if it isn't important, why do I have centipawn resolution in my scoring.
Well, as I said above: the quantization step would have to be much smaller than the smallest meaningful score difference, or it would start to induce an effect of its own. So I would say: "as score differences of 3cP don't mean anything, we quantize the score in steps of 1cP".
Uri's idea eliminated this, because the "urgency" score was less than one centipawn, so that you _never_ let the urgency score push you into making a move that gives up any positional edge that can be preserved by delaying the capture...
Yes, I understand that. (Anticipating a 20,000-ply deep search seemed like overdoing it a bit, though... :lol: )

But that is exactly the issue of this thread: Is it really wise to do that? I have given several arguments why I think "urgency" is worth something. So far, they have only been countered with arguments of the type "but it can happen...", "in many games..." etc. All true, but quite meaningless. Of course a lot of things can happen. For almost every pruning / extension / reduction / eval term you can design examples where they backfire. You can even design examples where an (N+1)-ply search comes up with the wrong move, and an N-ply search with the good one. But the only meaningful issue is "_how often_ does it work, and _how often_ does it backfire?". Examples of backfiring don't tell us that.

One way to approach this more quantitatively would be the following: Take a (huge) representative sample of game positions, and investigate how much the score changes on deepening the search 1 ply. This allows you to determine the average score change and the SD of it. If there is a clear systematic dependence on N of this "deepening error", one could take the statistics for each depth separately.

The next step is to analyze the data for subsets of positions that have a certain absolute score, and a certain difference between the score (before deepening) and CurEval. (I want to select by absolute score, as I expect totally won games, where a huge material advantage exists, to have a systematic rise in score: the deeper you search, the larger the imbalance will grow. It might even be a good idea to discriminate between piece and Pawn advantages, as a piece majority is converted into extra material gain (e.g. by gobbling up Pawns) much more rapidly than a Pawn majority.)

It is my conjecture that one will see that in positions with CurEval << Score, the score is more likely to drop upon deepening than in positions with CurEval >~ Score. If this conjecture proves true, scores for positions with CurEval << Score should be _discounted_ for the anticipated score loss on deepening.
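The measurement proposed above could be organized roughly as follows. This is only a sketch: the `Sample` record, its field names and the `deepening_stats` function are assumptions made for illustration, not code from any engine mentioned in this thread. It collects the mean and variance of the score change on deepening by one ply, restricted to positions whose gap between search score and CurEval falls in a chosen range (the SD is the square root of the variance).

```c
#include <assert.h>
#include <stddef.h>

/* One sampled position. All names here are hypothetical. */
typedef struct {
    int cur_eval;   /* static evaluation of the root position (CurEval), in cP */
    int score_n;    /* search score at depth N, in cP */
    int score_n1;   /* search score at depth N+1, in cP */
} Sample;

typedef struct {
    size_t count;     /* number of samples that fell in the selected range */
    double mean;      /* average of (score_n1 - score_n) */
    double variance;  /* population variance of that change */
} DeepeningStats;

/* Statistics of the deepening change over all samples whose gap
   (score_n - cur_eval) lies in the closed range [gap_lo, gap_hi]. */
DeepeningStats deepening_stats(const Sample *s, size_t n, int gap_lo, int gap_hi)
{
    DeepeningStats r = {0, 0.0, 0.0};
    double sum = 0.0, sumsq = 0.0;

    assert(gap_lo <= gap_hi);
    for (size_t i = 0; i < n; i++) {
        int gap = s[i].score_n - s[i].cur_eval;
        if (gap < gap_lo || gap > gap_hi)
            continue;
        double d = (double)(s[i].score_n1 - s[i].score_n);
        sum += d;
        sumsq += d * d;
        r.count++;
    }
    if (r.count > 0) {
        r.mean = sum / (double)r.count;
        r.variance = sumsq / (double)r.count - r.mean * r.mean;
    }
    return r;
}
```

Calling it once with a range that selects large gaps and once with a range around zero gives the two groups whose mean deepening change the conjecture compares. Binning by absolute score or by depth N would work the same way with an extra filter.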
Give me a break. A draw is a draw. A drawn KRP vs KR is no more or less drawn than a KR vs KR. So the search is irrelevant. The evaluation is irrelevant. But if you trick your evaluation to thinking that KRP vs KR is a "better draw" (for the krp side) than KR vs KR, then you have my "swindle mode". It has _zero_ to do with search.
This seems exactly to re-iterate what I said. So you can have your break, you seem to need it! :lol:
It has everything to do with giving my opponent the best opportunity to make a mistake, and mistakes are far easier in KRP vs KR than in KR vs KR. I guess, therefore, that we are somehow talking apples to oranges again, since your comment doesn't fit in here.
I consider preferring a short path to the same node over a long path a search matter. So I did not understand why you are bringing up the problem of evaluating games as drawn and "more drawn". Games do not get "more drawn" by cashing a gain, whether you postpone it or grab it. That would be a very strange concept of "gain". That is what my comment was meant to point out.
You are completely missing the point. Sometimes, by deferring a capture, I can hold on to a positional edge because the piece I would use to capture has a second function that helps me. If I use it up first, then I might not be able to hold on to the positional edge.
Indeed. And sometimes _his_ piece that would be captured can help him to hold on to a positional edge. And his piece is higher valued than our piece (including mobility, position on the board, and all the interactions that would disappear on capture). So which of the two would you think more likely?
I prefer to let the search run around in the tree space and maximize the evaluations returned. In my world, RxR Nc3 Nf6 is the same as Nc3 Nf6 RxR, and trying to artificially favor one over the other, when the positions are absolutely identical, is what I call "mucking with the score".
This seems completely wrong. The two paths, even though they lead to the same score-determining leaf node, are only equivalent if the scores in the leaf nodes were 100% certain. Under conditions where scores can change (due to deepening) there are robust paths and fragile paths. If Nc3 would, on deepening, suddenly turn out to convey a lethal threat that would require immediate reaction (e.g. the Knight can deliver a QR fork on the next move that was beyond the horizon before), for RxR Nc3 Nf6 (i.e. if you would have started with RxR) you would have to abandon the idea of playing Nf6. Going for Nf6 Nc3 RxR (I took the liberty of changing your example a little by forcing everyone to move in his own turn. :lol: ), would force you to abandon the idea of RxR after starting with Nf6, while pondering on Nc3. That hurts _a lot_ more. It is an intrinsically more risky path.
So now you must be saying _your_ search is no good. I hope mine doesn't find a way to win something and then lose the way as the game progresses. If it does, that is a search problem, not an evaluation problem. And I want to fix the problem where it is, not somewhere else.
Indeed, my search is no good. PV changes do happen, sometimes through dramatic fail lows. You must be a very happy man that Crafty already sees everything at 1 ply, and then never has to change its mind. No deepening anymore, hardly any search needed at all. Just 1-ply plus a good evaluation. That 32-men EGTB works wonders.

Snap! <Dream over! Return to reality....>

The whole concept of eval is a kludge, and as long as we cannot search every branch to checkmate, no search will ever be "any good". But it is not to blame as the eval is feeding it wrong information in virtually every position (apart from the occasional mate).

With a perfect evaluation a 1-ply "search" (without QS, of course) would lead to perfect play. So why do you bother with minimax, alpha-beta, ID, NMP, LMR and all that stuff. Isn't that all to correct a tiny bit of deficiency in your evaluation? Why don't you stick to a 1-ply search, and solve the problem where it really is, in the evaluation?
From what you say here it seems you didn't get the point at all. Trading down is _not_ a gain. Capturing is only encouraged by this scheme if there is no (or not enough) recapture. Otherwise there is no effect.
Should I show you a position where you trade a pair of rooks, and win a pawn, and lose the game instantly???
And that position would have the same evaluation (within 2 cP) as one where you did not trade the Rooks and took the Pawn 2 moves later?

YES!

By all means, show me that position! Then I will tell you which horrible mistake in your evaluation caused such a disaster as to mistake an instant loss for a draw. What was this again about solving the problem where it lies? :roll:

bob
Posts: 20555
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Delayed-loss-bonus discussion goes here

Post by bob » Fri Sep 28, 2007 3:41 pm

OK, let's get old stuff out of the way. If you look up the paper "The Cray Blitz Draw Heuristic" you will find a similar idea to how we evaluate mates. When I classified a position as drawn, the score was set to "current ply". I opened a window in the scores so that any score < 0 was left alone. Any score above 0 had 100 millipawns added. So there were no valid scores between 0 and 99. This protected the "draw scores". The idea was that if you have a draw, take the one that pushes the final draw way off into the future to give your opponent a chance to make a mistake, sort of a predecessor to the current "swindle heuristic" in Crafty with respect to tablebase draws. This has a significant problem in the hash table, however: I would frequently see scores of +.090, which in theory was a "draw in 90" plies. But CB could not search beyond 64 plies. It turns out to be a hash table issue. The idea worked OK, and even won at least one ACM tournament game most programs would have drawn, but the hash table caused problems since this score depends on the path, but the hash table does not consider that in the signature.
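As a rough illustration of the score layout described in this post, the reserved draw band could be sketched like this. The function names and the `DRAW_BAND` constant are assumptions made for the example, the treatment of a non-draw score of exactly 0 is a guess, and the sketch takes the root side's point of view; it is not taken from the Cray Blitz source or the paper.

```c
#include <assert.h>

/* Width of the reserved band in millipawns; the name is hypothetical. */
enum { DRAW_BAND = 100 };

/* Score of a position classified as drawn at the given ply.
   A later draw scores higher, so postponing the draw is preferred. */
int draw_score(int ply)
{
    assert(ply >= 0 && ply < DRAW_BAND);
    return ply;
}

/* Move ordinary scores out of the band 0..99: negative scores are left
   alone, the others are shifted up by 100 millipawns.
   (Shifting a score of exactly 0 is an assumption of this sketch.) */
int protect_score(int eval_mp)
{
    return eval_mp >= 0 ? eval_mp + DRAW_BAND : eval_mp;
}

/* True if the score lies in the band reserved for draws. */
int is_draw_score(int score)
{
    return score >= 0 && score < DRAW_BAND;
}
```

Because `draw_score` depends on the ply at which the draw is recognized, the same position reached along two different paths receives two different scores. A hash entry keyed on the position alone can then hand back a value computed for another path, which is consistent with the out-of-range draw scores mentioned in the post.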

I am still convinced that I want my program to choose the best move, that leads to the best placement of pieces and pawns, using whatever positional and tactical ideas I choose, but I don't want to encourage making a capture as soon as possible, or pushing it off as far as possible. If the "distance" was a tie-breaker, I still would not know which way to go, because often pushing it off is better than rushing into a capture. If you take Uri's idea where this "urgency" bonus is smaller than any possible positional score, then it would only be a tie-breaker and I would not distrust the idea as much. But if I can trade a few centipawns of positional edge just to make a recapture as soon as possible rather than delaying it for a few moves, then I don't like the idea at all. In a dead even game, a few centipawns can be the difference between winning, losing or drawing, particularly if you give 'em away multiple times.

You say it won't matter, but it will. Because for every position where you can take a pawn now or later, and now loses 3 centipawns and later loses nothing, there is a "positional tax" you are paying, just to encourage the capture to be played quicker.

I see no valid reasoning for doing this, having played chess for 50+ years myself. There is just nothing in the chess literature that encourages this. You could use the same reasoning for castling and trying to do it quickly. And I would quote a GM I don't remember who wrote "castle if you want to, or if you must, but do not castle just because you can..." That seems to apply equally well here. Make the capture when it is most beneficial, not as soon as possible...

The idea just sounds wrong, since there is no supporting advice from strong players.

hgm
Posts: 23718
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Delayed-loss-bonus discussion goes here

Post by hgm » Fri Sep 28, 2007 4:08 pm

I do not consider that decisive. Humans think in a way completely different from engines. They do search, but they don't perceive depth in terms of plies. If they have the choice between winning the exchange in 1 move, or in 5, they will be completely aware that that is the choice they are making, and they will evaluate the positions at ply=1 and at ply=9 that they are choosing between to the same accuracy. Engines don't do that.

Risk management is innate in Humans. Do you know Human Chess players that play Ng1-h3, when their plan is to move the Knight to f4, and Ne2 is equally possible? If you want to walk to the other side of town, you walk through the good neighborhood, and avoid the bad neighborhood. A sensible precaution. But one that is ultimately alien to minimax...

But you can talk all you like, if it plays better it plays better. Computer Chess is often counter-intuitive.

Uri Blass
Posts: 8586
Joined: Wed Mar 08, 2006 11:37 pm
Location: Tel-Aviv Israel

Re: Delayed-loss-bonus discussion goes here

Post by Uri Blass » Fri Sep 28, 2007 4:59 pm

I agree that there is no valid reasoning for doing this if the advantage of one side is small.

When the advantage is big there may be valid reasoning for doing it, not only because the opponent may miss something but also because you can expect +3.00 pawns at ply=20 to win faster than +3.01 pawns at ply=50.

It may be logical to have delayed bonus as function of the score x and the ply but you need to have always for x>y>0

x+bonus(x,ply)>y+bonus(y,ply)>0 and also
bonus(x,ply)=-bonus(-x,ply)

The simplest function that I can think about is something like
bonus(x,ply)=(0.999^ply)*x-x when x is the advantage in centipawns.

if x is small enough the bonus is rounded to 0 centipawns.

Maybe 0.999 is too high, but I think that when testing an idea it is better to start with something that has a small influence and see if it is productive (I did not test the idea, so I do not know if it is productive in practice).

It is also possible that it is better not to use the constant 0.999 but to replace it by an expression f(|x|), where f(|x|) is smaller when |x| is higher.

You need to be careful in defining f, otherwise the following condition that I already mentioned is not going to be true.

x+bonus(x,ply)>y+bonus(y,ply) for x>y
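A minimal sketch of the function proposed in this post, with the constant 0.999 and rounding to whole centipawns. The helper names `decay_pow`, `round_sym` and `adjusted` are assumptions for the example, and the choice of rounding halves away from zero is made here only so that bonus(x,ply) = -bonus(-x,ply) holds exactly after rounding.

```c
#include <assert.h>

/* decay^ply by repeated multiplication, so no math library is needed. */
static double decay_pow(double decay, int ply)
{
    double f = 1.0;
    for (int i = 0; i < ply; i++)
        f *= decay;
    return f;
}

/* Round to the nearest integer, halves away from zero.
   This keeps the rounding symmetric around 0. */
static int round_sym(double v)
{
    return v >= 0.0 ? (int)(v + 0.5) : -(int)(-v + 0.5);
}

/* bonus(x, ply) = 0.999^ply * x - x, with x in centipawns. */
int bonus(int x_cp, int ply)
{
    assert(ply >= 0);
    return round_sym(decay_pow(0.999, ply) * x_cp - x_cp);
}

/* The score the search would actually compare: x + bonus(x, ply). */
int adjusted(int x_cp, int ply)
{
    return x_cp + bonus(x_cp, ply);
}
```

With these definitions a 3 cP edge at ply 8 gets a bonus that rounds to 0, while +3.00 at ply=20 ends up ahead of +3.01 at ply=50. One caveat of the integer rounding: at a fixed ply the ordering of two scores is never reversed, but neighbouring scores can round to the same value, so the strict inequality in the condition above becomes a non-strict one.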

Uri

bob
Posts: 20555
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Delayed-loss-bonus discussion goes here

Post by bob » Fri Sep 28, 2007 6:18 pm

hgm wrote:I do not consider that decisive. Humans think in a way completely different from engines. They do search, but they don't perceive depth in terms of plies. If they have the choice between winning the exchange in 1 move, or in 5, they will be completely aware that that is the choice they are making, and they will evaluate the positions at ply=1 and at ply=9 that they are choosing between to the same accuracy. Engines don't do that.

Risk management is innate in Humans. Do you know Human Chess players that play Ng1-h3, when their plan is to move the Knight to f4, and Ne2 is equally possible? If you want to walk to the other side of town, you walk through the good neighborhood, and avoid the bad neighborhood. A sensible precaution. But one that is ultimately alien to minimax...

But you can talk all you like, if it plays better it plays better. Computer Chess is often counter-intuitive.
So you have some significant test results that show that this works better??? That's the way I accept/reject ideas in Crafty.

hgm
Posts: 23718
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Delayed-loss-bonus discussion goes here

Post by hgm » Fri Sep 28, 2007 7:50 pm

No, if I had I would have told you already what the optimum value of the bonus actually was. In Joker the bonus is currently 0.4cP, in uMax 1cP (but the uMax eval is very coarse-grained). My suspicion is that this is far below optimum. But Joker is so early in its development that I have other priorities than figuring out if a 10-Elo improvement can be achieved. Unlike you, my testing resources are not unlimited, so I have to set priorities. Optimizing minuscule eval terms when most of the eval is still total crap does not seem a good way to get meaningful results anyway. I'd rather have piece-square tables first.

But I would not take it out until testing unambiguously showed that it was stronger without. As Joker adds a 0-6.4 cP random term to the evaluation of each node (to increase the non-determinism), it seems unlikely that the 0.4 cP bonus would result in a measurable effect other than selecting the fastest mate. (Note that mate scores are not generated by the evaluation.)

jwes
Posts: 778
Joined: Sat Jul 01, 2006 5:11 am

Re: Delayed-loss-bonus discussion goes here

Post by jwes » Sat Sep 29, 2007 2:40 am

I am not clear how this would work in a search. First, it would only apply to scores between alpha and beta, or you might find yourself changing a score from fail high to fail low, especially with fail hard. Also, I don't see where CurEval comes from. It is not normally available in an A-B search. The idea may have value if you think about scores as probability distributions, rather than exact values, e.g. if one line wins a rook three plies before the leaf, and the other wins a rook at the leaf, the first line may well be better because it is less likely there is a disaster just over the horizon.

hgm
Posts: 23718
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Delayed-loss-bonus discussion goes here

Post by hgm » Sat Sep 29, 2007 4:08 am

Sure, you have to prevent an out-of-window score from ever getting back into the window by adjusting it afterwards. Perhaps it is good to repeat the exact implementation, as the one I originally gave was in another thread:

Code:

int Search(int Alpha, int Beta)
{
    int BestScore = -INFINITY;

    HASH_PROBE;

    /* Pre-adjust the window bounds before searching. */
    if(Alpha <= CurEval) Alpha = Alpha - 1;
    if(Beta < CurEval) Beta = Beta - 1;

    GENERATE_AND_SEARCH_MOVES;

    /* Delayed-loss bonus: a score below the current evaluation gains one point. */
    if(BestScore < CurEval) BestScore = BestScore + 1;

    HASH_STORE(BestScore);
    return(BestScore);
}
The two lines just after the hash probe pre-adjust the window bounds, such that application of the bonus to a fail-low score can never bring it back in window. Adjustment of Beta is not strictly necessary (uMax leaves it out), but is just there for efficiency reasons.
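The property claimed for the window pre-adjustment can be checked by brute force on a small range of values. The sketch below isolates the two steps of the routine above into stand-alone functions and uses the same comparisons as the posted code; the function names and the exhaustive checker are assumptions added for illustration and are not part of Joker or uMax.

```c
#include <assert.h>

/* Bonus applied after the move loop: a score below CurEval gains one point. */
int apply_delay_bonus(int best, int cur_eval)
{
    return best < cur_eval ? best + 1 : best;
}

/* Window pre-adjustment, using the same comparisons as the posted code. */
void adjust_window(int *alpha, int *beta, int cur_eval)
{
    if (*alpha <= cur_eval) *alpha -= 1;
    if (*beta < cur_eval) *beta -= 1;
}

/* For every CurEval, window and raw score in a small range, check that a
   score outside the adjusted window is still outside the original window
   after the bonus has been applied. */
void check_window_invariant(int lo, int hi)
{
    for (int e = lo; e <= hi; e++)
        for (int a = lo; a <= hi; a++)
            for (int b = a + 1; b <= hi; b++) {
                int a2 = a, b2 = b;
                adjust_window(&a2, &b2, e);
                for (int s = lo - 2; s <= hi + 2; s++) {
                    int out = apply_delay_bonus(s, e);
                    if (s <= a2) assert(out <= a);  /* fail low stays low */
                    if (s >= b2) assert(out >= b);  /* fail high stays high */
                }
            }
}
```

Running `check_window_invariant(-8, 8)` exercises all combinations in that range. The check says nothing about playing strength; it only confirms that the bonus cannot turn a fail low or fail high against the adjusted bounds into an in-window score for the caller.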

Probability distributions is exactly what eval (or actually QS) scores are. They usually change on deeper search, sometimes dramatically. The PV path that the engine plans, will be searched more deeply when it traverses it at game level. In every node of it you will run the risk that the score changes.

Now the more options someone has, the more likely it becomes that on deeper search he will stumble on something to swing the score in his favor. This is why mobility eval works so well. So it can be expected that in positions where the opponent has superior material (e.g. a Rook instead of a Knight), the probability that he will find something better just behind the horizon is larger than the probability that you will be able to find something better. His Rook is already marked for destruction, and can be used as a Kamikaze against any other light piece. Your Knight, on the contrary, is already bespoken, as it has at some point to capture that Rook. Only if better targets present themselves can you refrain from that. But the number of targets better than a Rook is far smaller than the number of targets equal to a light piece. And the Rook is intrinsically more powerful. So the odds for swinging the score are overwhelmingly in his favor in positions where you postpone the RxN capture. You must be sure it is worthwhile to take that risk. Otherwise you had better pay a few cP insurance...
