Computer Chess Club
https://talkchess.com/

Re: Idea for recognizing fortress/encouraging progress
Posted: Sun May 05, 2013 9:22 pm

hgm wrote:
    I don't know where you get this idea of a 'simple-minded kludge'.

I was referring to MY simple-minded kludge, the one I described a few months back, so don't take it personally. I was not referring to anything you said.
Evert wrote:
    Sven Schüle wrote:
        Without the queen the example position would be an exact draw. With the queen it remains a draw unless the queen is sacrificed. Evaluating it as 0 seems to fit this situation best, in my opinion.

    Using that logic, KQK is a draw unless you actually give mate.

No, you fully misunderstood me: I wrote that I want to evaluate 100% useless pieces as 0, like the black bishop or the two rooks. In KQK the queen is clearly not useless. And here we are also talking about positions with 8+8 locked pawns only.
Evert wrote:
    The position is not a draw because of the presence of the queen, so evaluating it as such isn't correct.

Perhaps you are right: 0 may be too harsh, since the queen is not "fully useless" given its ability to be sacrificed. But wouldn't the following problem remain: the program sees all non-sacrifice positions as "won" but never actually plays a winning sequence, instead moving its pieces around while maintaining the "win"? I think this is a bit like the question of whether we should evaluate the first repetition within the search tree (i.e., below the root node) as a draw or not. Today we know that opting for the draw score is better, since we avoid useless search. How is our case here much different?
Evert wrote:
    What is correct is that the position after the queen sacrifice should be evaluated as better than the position before the sacrifice. That is really the most important point.

Yes, I agree, but the position after the sacrifice must get a much better value, since it mostly restores the value of the two rooks (and also of the white bishop). How do you reconcile that with a smooth scaling approach which also scales down the two rooks significantly, even if the scaling factor is smaller with 7 instead of 8 enemy pawns? Maybe I misunderstood your scaling formula.
Evert wrote:
    Depends on how much you scale by - and you can scale it to 0 if you want. What you need is eval(position after sacrifice) > eval(position before sacrifice). The difference in material value between the two is a queen (going the other way), so the scaling has to be large enough to compensate for that.

I understand what you say, but then your scaling must be very clever to handle the many possible cases. Scaling down based on the degree of "lockedness" of the position does not appear attractive to me here; it may be very imprecise. In fact it might also apply, to a somewhat smaller degree, to the position after the queen sac, leading to the wrong decision to prefer the position without the sac because of the queen's material value.
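For what it's worth, Evert's inequality can be checked with toy numbers. A minimal Python sketch; the centipawn values and scale factors below are invented for illustration and are not taken from any engine:

```python
# Toy numbers only: illustrates eval(after sac) > eval(before sac).
QUEEN = 900  # assumed queen value in centipawns

def scaled_eval(material_cp, scale):
    """Discount a raw material score by a 'drawishness' factor in [0, 1]."""
    return material_cp * scale

# Before the sacrifice: a queen more on the board, but the position is
# locked, so the score is heavily discounted.
before = scaled_eval(1900, 0.1)

# After the sacrifice: a queen less, but the position can now be opened.
after = scaled_eval(1900 - QUEEN, 0.8)

# The discount must be deep enough to outweigh the queen that was given up.
assert after > before
```

The point of the sketch is only the inequality: the locked-position discount has to exceed the queen's material value, otherwise the search prefers to keep the queen and shuffle.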
Evert wrote:
    Ok, if you can prove that the position is a draw, there is of course no problem with returning a draw score. However, there is a cost associated with that, and it is surely faster to simply scale when the position is "drawish or draw" than to distinguish between "drawish" and "draw", and it's not obvious (to me) that the program would play differently.

I only want to "prove" for single pieces that they are 100% useless with the current pawn structure. That can really be done piece by piece. If it applies to all pieces then we get a draw. But please don't mix things up: I want to address the scores of pieces, not of the whole position. For instance, after the queen sacrifice the rook evaluation would immediately change. My proposal only works in cases where you can be 100% sure that a piece has zero effect with the current pawn structure. If that does not apply then scaling is of course better.
Evert wrote:
    Sven Schüle wrote:
        But then I would apply scaling to each single piece (QRBN), not to the whole position value.

    I don't understand. The position evaluation is the sum of piece-specific terms. How is scaling each of those independently different from just scaling the whole thing?

The reason is that I would only scale if I am not sure. To stay with our example, I would be 100% sure for both rooks and both bishops, but maybe I could scale the queen value down. In another case there could be two or more pieces to be scaled (and not necessarily by the same factor). Also, scaling would not be applied to kings and pawns. Therefore you can clearly get a different result this way than by scaling the whole position uniformly.
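The distinction Sven draws can be made concrete with a toy example. All piece values and factors below are made up for illustration: per-piece factors can zero out pieces proven useless while merely discounting a suspect queen, which no single whole-position factor reproduces in general.

```python
# Invented centipawn values, purely for illustration.
PIECE_VALUE = {'Q': 900, 'R': 500, 'B': 300}

def per_piece_eval(pieces, factors):
    """Sum piece values, scaling each piece by its own factor (default 1.0)."""
    return sum(PIECE_VALUE[p] * factors.get(p, 1.0) for p in pieces)

pieces = ['Q', 'R', 'R', 'B']

# Per-piece scaling: rooks and bishop proven 100% useless (factor 0),
# the queen merely suspect (scaled down, not zeroed).
per_piece = per_piece_eval(pieces, {'Q': 0.5, 'R': 0.0, 'B': 0.0})  # 450.0

# Uniform scaling of the whole material sum gives a different result.
uniform = per_piece_eval(pieces, {}) * 0.5                          # 1100.0
```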
Sven Schüle wrote:
    Any idea how to reach that position?

I don't think that's the biggest problem. I mean, it requires a very deliberate and implausible sequence of moves, but it doesn't look impossible to me (it would be a different story if there were a white dark-squared bishop around somewhere).
hgm wrote:
    Usually it is bad to evaluate drawn positions as exactly zero, as there are nearly won draw positions as well as nearly lost draw positions. And it pays to be able to distinguish the two.

I believe Komodo does a good job at that. Yes, it could probably do better, but where other programs score exactly zero, Komodo tries to distinguish an easy draw from a difficult draw.
I don't understand your argument. It is scores that are exactly zero that lead to endless shuffling, as the pieces will then diffuse randomly over the board. And the whole point is to have the leading engine prevent the position from being closed in the first place, not to handle what happens after it is closed. This is why using the 50-move counter to detect it is pointless.
I expect Evert's suggestion to work very well. The drawishness correction is probably only needed with >= 6 blocked pawns. This need not be very expensive. The pawn hash can contain a 2-bit field to signal < 6, 6, 7 or 8 blocked pawns. In most positions it would be set to < 6, and the overhead remains limited to testing the field. Only when it specifies >= 6 would you call an extra evaluation routine to determine the scaling, based on the piece types present.
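A sketch of that filter in Python. The bitboard layout (bit 0 = a1, one rank per eight bits) is my assumption, not from the post; the 2-bit encoding follows the paragraph above:

```python
def count_rams(white_pawns: int, black_pawns: int) -> int:
    """Count white pawns blocked head-to-head by a black pawn.
    Bitboards with bit 0 = a1, bit 8 = a2, ... (assumed layout)."""
    blocked = white_pawns & (black_pawns >> 8)  # black pawn one rank above
    return bin(blocked).count('1')

def pawn_hash_flag(rams: int) -> int:
    """2-bit pawn-hash field: 0 = fewer than 6 rams, 1/2/3 = 6/7/8 rams."""
    return 0 if rams < 6 else rams - 5

# Eight white pawns on the 4th rank facing eight black pawns on the 5th:
white = 0xFF << 24
black = 0xFF << 32
rams = count_rams(white, black)   # 8 rams: fully locked
flag = pawn_hash_flag(rams)       # 3: invoke the extra scaling routine
```

In most positions the flag would be 0 and the evaluation skips the extra routine entirely, so the overhead is one field test per eval.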
And if you start in a locked position, and are far enough ahead to 'sacrifice yourself out of the cage', the search will find that, and the evaluation will jump up after the sacrifice because the discount is no longer applied, or is strongly reduced. Sacrificing the Q in the position of the OP will raise the score from +5+5+9 = +19 discounted by a factor of 16 (say) for 8 blocked pawns (so about +1.2) to +5+5+1 = +11 discounted by only a factor of 1.5 (because of only 6 blocked pawns, an open file, and a strong rook majority), i.e. about +7. A huge incentive for the Q sac.
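Re-running those numbers (the piece values in pawns and the two discount divisors are the assumed figures from the paragraph above):

```python
# Values in pawns; divisors are the assumed discounts for 8 vs. 6 blocked pawns.
before_sac = (5 + 5 + 9) / 16    # R+R+Q, fully locked: about +1.2
after_sac  = (5 + 5 + 1) / 1.5   # after the queen sac: about +7.3

# The discounted score jumps by roughly six pawns, rewarding the sacrifice.
assert after_sac > before_sac
```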
hgm wrote:
    I don't think it is bad if this requires a complex algorithm, as long as it is not invoked needlessly. In my proposal the simplistic head-to-head counting would be used only as a filter, to cheaply exclude positions where no subtlety is called for. As this goes through the pawn structure, the cost of deciding in a more precise way which structures can be opened easily (by pawn moves alone) will be negligible anyway, because it is hashed in the pawn hash with a large hit rate. Only positions with pawn structures flagged there as 'dangerous' would invoke the complex algorithm. But for such positions that is really worthwhile, because otherwise they would be completely mis-evaluated.

I would also like to say that a completely locked pawn structure does not define what a closed position is. I would agree that it is a minor subset.
Don wrote:
    I would also like to say that a completely locked pawn structure does not define what a closed position is. I would agree that it is a minor subset.

P.S. It is clear that this definition is rather simplistic, because pawns do not have to be locked together in the so-called "ram" formation to have most of the characteristics of a closed position.
A very simple algorithm is to count 8 rams as 100% closed, 7 rams as 90% closed, and so on. Then you multiply the final score by some coefficient to reflect drawishness. For example, in the 8-ram case you could multiply the score by 0.5 to reflect a much larger likelihood of a draw. It seems to make a lot of sense, but it did not seem to be effective for us. It also doesn't address the issue of technique, except by avoiding such positions when you have the advantage; but in closed positions there is much more to it. You also want to play them well when you are in them.
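A sketch of that counting scheme. The linear 10%-per-ram closedness is from the text; interpolating the coefficient down to 0.5 at 8 rams is my assumption, since only the 8-ram endpoint is given:

```python
def closedness(rams: int) -> float:
    """8 rams = 100% closed, 7 rams = 90%, and so on (10% per ram)."""
    return 1.0 - (8 - rams) * 0.1

def draw_coefficient(rams: int) -> float:
    """Scale factor applied to the final score; assumed linear, reaching
    the 0.5 of the 8-ram example when the position is fully closed."""
    return 1.0 - 0.5 * closedness(rams)

score = 300                           # hypothetical raw score in centipawns
scaled = score * draw_coefficient(8)  # fully locked: the score is halved
```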