lucasart wrote: Here's the position reached in a game between my engine DoubleCheck 2.5 and Bikjump 2.1
[d]8/1p4p1/2k2R1p/8/8/8/1r5r/K7 b - - 0 1
This is typically where computers are stupid and humans understand immediately that it's a draw due to the persistent stalemate threat. The white rook can check the black king wherever it goes, and can never be taken due to the resulting stalemate...
Of course DoubleCheck playing black was happy with a score of +10.5 or so. In fact, tournament testers will generally report an incorrect win for black here, because they use a resign feature like: if score is above 700 for 5 moves in a row according to both parties, adjudicate as a win.
I just can't figure out how to make the search understand that kind of problem
any ideas ?
First of all, never use the resign feature if you want completely accurate results. If you "fix" this particular issue there will just be another one, such as a persistent mate threat, a major piece completely out of play, or something else you haven't thought of. These rules are all broken in some sense, since the user interface can never know whether the chess programs themselves can handle the position. We have all seen programs fail to mate with BNvsK, and I have even seen programs screw up RvsK due to some bug, so a resign rule can give a free half point or full point to a program that has not earned it.
An improvement to the resign rule which would handle this specific case, and is good common sense anyway, is to never adjudicate in the presence of checking moves. If one color is getting "checked to death", does it make sense to give the other side a win? Any user interface or test harness that supports resign rules should have that in some form, even if it is as simple as never adjudicating when one side is in check. It's very common for the losing side to find a sequence of checks that could morph into a repetition draw or force the winning side to compromise in order to avoid the draw.
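To make the idea concrete, here is a rough sketch of such a rule in Python. It is not taken from any real GUI or test harness: the names `PlyRecord` and `adjudicate`, the convention of storing scores from White's point of view, and the choice to veto on any check inside the window are all my own assumptions. The 700 centipawn threshold and the 5 move window are the numbers from the quoted post.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlyRecord:
    """One half-move as logged by a hypothetical test harness."""
    score_cp: int     # score reported by the engine that moved, in centipawns, White's point of view
    gave_check: bool  # True if the move played gave check


def adjudicate(history: List[PlyRecord],
               threshold_cp: int = 700,
               window_moves: int = 5,
               veto_on_check: bool = True) -> Optional[str]:
    """Return "white" or "black" if the game should be adjudicated, else None."""
    plies = 2 * window_moves  # each engine reports once per full move
    if len(history) < plies:
        return None
    recent = history[-plies:]

    # Assumed form of the "no adjudication in the presence of checks" rule:
    # any checking move inside the window blocks the adjudication.
    if veto_on_check and any(r.gave_check for r in recent):
        return None

    # Both engines must agree on the winner for the whole window.
    if all(r.score_cp >= threshold_cp for r in recent):
        return "white"
    if all(r.score_cp <= -threshold_cp for r in recent):
        return "black"
    return None


# The position from the quoted post: White checks on every move while
# both engines report roughly +10.5 for Black.
perpetual = []
for _ in range(5):
    perpetual.append(PlyRecord(score_cp=-1050, gave_check=True))   # White's rook check
    perpetual.append(PlyRecord(score_cp=-1050, gave_check=False))  # Black's king move

print(adjudicate(perpetual, veto_on_check=False))  # plain resign rule
print(adjudicate(perpetual))                       # with the check veto
```

With the veto switched off, the plain rule hands Black the win in this position; with it switched on, the game simply continues until the repetition or 50 move rule ends it. The simpler variant, refusing to adjudicate only when the side to move is currently in check, would need just one flag for the latest position instead of scanning the whole window.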
As for trying to handle this in a chess program, the problem is that there are other special-case situations that a program could have trouble dealing with; this is just one of many. You COULD make a program understand this position with some work, but you would surely weaken it more than you would help it. I guarantee that this particular theme is so rare that it will affect the outcome of only 1 game out of thousands.
There may come a time in the future when programs are hundreds of Elo stronger than they are today. The programs will play so well that issues such as this will stand out as real glitches, one of the few imperfections they have. But that is not today. There are bigger issues than this, such as trapped pieces which can NEVER be freed. This goes to the bigger issue of an evaluation function that doesn't reason, it just evaluates. However, the search DOES reason, and with ever-increasing depth issues such as this are resolved, even if it takes a ridiculously high depth for that to happen.