Another Impossible Endgame For Engines

Stephen Ham · Post by **Stephen Ham** » Sun Apr 04, 2010 4:19 am

Dear gents,

Here is another position that my 64-bit mp engines with 5-man tablebases claim is a White victory, yet I think it's surely a draw.

FEN: 8/6k1/6p1/8/n2K3B/7P/6P1/8 w

I showed it to a buddy, also a strong chess player, who initially claimed that White must be better because of:
1) the extra pawn
2) bishop versus knight ending
3) remoteness of Black's knight
4) passivity of Black's king.

I said I think Black draws. White only wins if he can capture Black's g-pawn. But his bishop is the wrong color. This seems a simple endgame yet the best engines claim White wins.
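
The "wrong color" remark can be checked mechanically if it is read as meaning that the bishop travels on squares of the opposite color to the g6 pawn, so it can never attack that pawn. That reading is an assumption, and the helper name `square_color` below is hypothetical. It is a minimal sketch, not anything taken from an engine:

```python
def square_color(square):
    """Return 'dark' or 'light' for an algebraic square name such as 'g6'."""
    file_index = ord(square[0]) - ord("a")  # a=0 ... h=7
    rank_index = int(square[1]) - 1         # rank 1 -> 0 ... rank 8 -> 7
    # a1 (0 + 0) is a dark square, so an even sum means dark.
    return "dark" if (file_index + rank_index) % 2 == 0 else "light"


bishop_square = "h4"  # White bishop in the posted FEN
pawn_square = "g6"    # Black pawn in the posted FEN

# A bishop can only ever attack squares of its own color.
bishop_can_attack_pawn = square_color(bishop_square) == square_color(pawn_square)
print(square_color(bishop_square), square_color(pawn_square), bishop_can_attack_pawn)
```

If the colors differ, only White's king can win the g-pawn.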

I understand that programmers don't want to build too much endgame evaluation knowledge into their chess engines because it adversely affects the search. But it would be nice if programmers would give us: 1) program X, and 2) endgame program X. The latter is to be used exclusively for endgames.

All the best,
Steve

Graham Banks · Post by **Graham Banks** » Sun Apr 04, 2010 4:47 am

[D] 8/6k1/6p1/8/n2K3B/7P/6P1/8 w

Houdini · Post by **Houdini** » Sun Apr 04, 2010 11:53 am

Stephen Ham wrote:Here is another position that my 64-bit mp engines with 5-man tablebases claim is a White victory, yet I think it's surely a draw.

Who is claiming a White victory? If the engine at very large search depth doesn't manage to improve its initial evaluation (probably between +1 and +1.5), it is in fact claiming a draw.
The engine is telling you exactly what's going on: White has the advantage but apparently it's not enough to convert to a win.

Robert Flesher · Post by **Robert Flesher** » Sun Apr 04, 2010 2:40 pm

New game
8/6k1/6p1/8/n2K3B/7P/6P1/8 w - - 0 1

Analysis by Junior 2010 UCI:

1.g4 Nb2 2.Ke5
± (0.94) Depth: 9 00:00:00
1.g4 Nb2 2.Ke5
± (0.94) Depth: 9 00:00:00
1.g4 Nb2 2.Ke5
± (0.94) Depth: 9 00:00:00
1.g4 Nb2 2.Ke5
± (0.94) Depth: 9 00:00:00
1.g4 Nb6 2.Ke5 Kf7 3.Bg5
± (0.89) Depth: 9 00:00:00 3kN
1.Bg5 Kf7 2.h4 Nb6 3.Kc5 Nd7+ 4.Kd6
± (0.92) Depth: 9 00:00:00 13kN
1.Bg5 Nb2 2.h4 Nd1 3.Ke5 Nf2 4.Kd6
± (0.98) Depth: 12 00:00:00 51kN
1.Bg5 Nb2 2.h4 Nd1 3.g3 Nf2 4.Ke3 Nd1+ 5.Ke4 Nf2+ 6.Kd5 Kf7
± (0.90) Depth: 15 00:00:00 98kN
1.g4 Nb2 2.Be1 g5 3.Ba5 Kf6 4.Bc3 Nd1 5.Bd2
± (1.11) Depth: 15 00:00:00 228kN
1.g4 Nb6 2.Bd8 Nd7 3.h4 Nf8 4.Kd5 Nh7 5.g5 Nf8
± (1.06) Depth: 18 00:00:00 1251kN
1.g4 Nb2 2.Bf2 g5 3.Bg1 Na4 4.Ke5 Kh6 5.Be3 Kg6 6.Bc1
± (1.04) Depth: 21 00:00:01 3012kN
1.g4 Kf7 2.Bd8 Ke6 3.h4 Kd6 4.Bf6 Kc6 5.Bh8 Nb6 6.Be5 Nd7
± (0.94) Depth: 24 00:00:04 11809kN
1.g4 Kf7 2.Bd8 Ke6 3.Ba5 Kd6 4.h4 Ke6 5.Bc7 Kd7 6.Bh2
± (0.94) Depth: 26 00:00:10 25100kN
1.g4 Kf7 2.Bd8 Ke6 3.Ba5 Kd6 4.h4 Ke6 5.Bc7 Kd7 6.Bh2 Nb6 7.Be5 Kc6 8.h5
± (0.94) Depth: 27 00:00:33 75193kN
1.Be7 Kh6 2.Ke5 Nb2 3.Ke4 Na4 4.g4
± (0.95) Depth: 27 00:00:46 103207kN
1.Be7 Kh6 2.Ke5 Nb2 3.Ke4 Na4 4.g4 Kg7 5.h4 Kf7 6.Ba3 Ke6 7.Kf4 Kf6 8.Bc1 Nc5 9.Be3 Nd3+ 10.Ke4
± (0.95) Depth: 28 00:00:53 117192kN
1.Be7 Kh6
± (0.95) Depth: 29 00:00:58 128558kN

(, Microsoft 04.04.2010)

Junior seems to assess the position fairly accurately.
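
Houdini's point about an evaluation that does not improve with depth can be tried against the Junior output above. The sketch below takes the last score reported at each depth from that output and tests whether the most recent scores stay inside a narrow band. The function name `is_plateau`, the window of 5 and the tolerance of 0.25 pawns are assumptions chosen for illustration, not values used by any engine or GUI.

```python
def is_plateau(scores, window=5, tolerance=0.25):
    """Return True if the last `window` scores differ by at most `tolerance` pawns."""
    recent = scores[-window:]
    if len(recent) < window:
        return False  # not enough iterations to judge
    return max(recent) - min(recent) <= tolerance


# (depth, score) pairs: the last score Junior reported at each depth above.
junior = [(9, 0.92), (12, 0.98), (15, 1.11), (18, 1.06), (21, 1.04),
          (24, 0.94), (26, 0.94), (27, 0.95), (28, 0.95), (29, 0.95)]

scores = [score for _, score in junior]
print(is_plateau(scores))
```

A flat score is only a hint that White is not making progress. It proves nothing by itself.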

Cubeman · Post by **Cubeman** » Sun Apr 04, 2010 2:48 pm

Houdini wrote:
Stephen Ham wrote:Here is another position that my 64-bit mp engines with 5-man tablebases claim is a White victory, yet I think it's surely a draw.
Who is claiming a White victory? If the engine at very large search depth doesn't manage to improve its initial evaluation (probably between +1 and +1.5), it is in fact claiming a draw.
The engine is telling you exactly what's going on: White has the advantage but apparently it's not enough to convert to a win.

Exactly, the only time an engine claims a win is when there is a mate announcement. Every other evaluation is just a likelihood of winning or drawing. A static evaluation is a very useful clue that the position might be drawn, and is quite a valuable tool for studying positions.
Of course, not every static eval means that a draw is proven, but at least you get the idea.

Stephen Ham · Post by **Stephen Ham** » Sun Apr 04, 2010 10:48 pm

Cubeman wrote:
Houdini wrote:
Stephen Ham wrote:Here is another position that my 64-bit mp engines with 5-man tablebases claim is a White victory, yet I think it's surely a draw.
Who is claiming a White victory? If the engine at very large search depth doesn't manage to improve its initial evaluation (probably between +1 and +1.5), it is in fact claiming a draw.
The engine is telling you exactly what's going on: White has the advantage but apparently it's not enough to convert to a win.
Exactly, the only time an engine claims a win is when there is a mate announcement. Every other evaluation is just a likelihood of winning or drawing. A static evaluation is a very useful clue that the position might be drawn, and is quite a valuable tool for studying positions.
Of course, not every static eval means that a draw is proven, but at least you get the idea.

Hello Alex and Robert,

I understand what you wrote, but respectfully disagree.

When a chess engine's evaluation shows the symbol for a winning advantage along with a confirming numeric display, whether it remains static or not after subsequent iterations, the engine thus claims a win. After all, all endgames are either decisive or drawn. Thus, a static output of a win should never be read as the engine claiming a draw. Again, the position may indeed be a draw, but the engine is not claiming that.

I agree with you that a static evaluation can sometimes clue humans that the position is a draw in reality, but more often than not, the engine is correct in evaluating the position as victorious.

Similarly, I've seen engines evaluate endgames as equal when in reality one side can win.

That's why I've posted two simple endgame positions in the past two days where all the best engines' evaluations are seriously wrong. I hope that doing so will encourage programmers to improve endgame evaluation and performance in their next engine generations.

I'd also like to see programmers offer an endgame version of their engines - an engine with adequate endgame knowledge to be used exclusively to analyze endgames.

All the best,
Steve

Jim Walker · Post by **Jim Walker** » Mon Apr 05, 2010 1:10 am

"The surest way to remain ignorant is to be satisfied with what you think you know."

Here's another one:

"You don't know what you don't know."

lmader · Post by **lmader** » Mon Apr 05, 2010 2:42 am

Stephen Ham wrote:After all, all endgames are either decisive or drawn.

Well... in fact, any and all chess games are either decisive or drawn... eventually. Indeed, chess may be a forced draw, or a forced win for white, right from the first move.

What I think you are trying to say is that there comes a point in a game where the outcome is unequivocally clear, or where, with correct play, the outcome is known. You are calling this point "the endgame", and there is a lot of chess theory that allows humans to identify the outcome for many endgames. But certainly not all endgames.

Stephen Ham wrote:When a chess engine's evaluation shows the symbol for a winning advantage along with a confirming numeric display, whether it remains static or not after subsequent iterations, the engine thus claims a win.

So a computer program might see the outcome clearly at a certain point in the endgame and announce mate or an eval of 0 (draw). But at earlier stages in the game (even when still in the "endgame"), all engines (and humans) have farther to see, such that there will be a point that the eval will be indecisive. At this point an engine produces an eval that may be positive or negative, but this is not the engine announcing a win/loss. At that point it's akin to an eval at any other point in the game.

In other words, you wouldn't interpret a +1.5 score at move 15 as an announcement of a win for White. Similarly, an announcement of +1.5 in an "endgame" is the computer giving an indecisive evaluation because it hasn't (yet) determined the outcome conclusively.
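
The distinction between a mate announcement and an ordinary evaluation is visible in the UCI protocol itself: an engine's `info` line reports either `score mate N` or `score cp N` (centipawns). A minimal sketch of reading that field follows. The function name `interpret_uci_score` and both sample lines are made up for illustration and are not output from any engine in this thread.

```python
def interpret_uci_score(info_line):
    """Classify the score in a UCI 'info' line as a forced mate or a heuristic eval."""
    tokens = info_line.split()
    if "score" not in tokens:
        return "no score reported"
    i = tokens.index("score")
    if i + 2 >= len(tokens):
        return "no score reported"
    kind, value = tokens[i + 1], int(tokens[i + 2])
    if kind == "mate":
        # A positive value means the engine's side mates; negative means it gets mated.
        side = "the engine's side" if value > 0 else "the opponent"
        return f"forced mate in {abs(value)} by {side}"
    if kind == "cp":
        # Centipawns: a heuristic estimate, not a proven result.
        return f"heuristic evaluation of {value / 100:+.2f} pawns, outcome not proven"
    return "unknown score type"


# Illustrative lines only.
print(interpret_uci_score("info depth 29 score cp 95 pv h4e7 g7h6"))
print(interpret_uci_score("info depth 12 score mate 5 pv g2g4"))
```

Only the `mate` case is a claim about the result. A `cp` score of any size is still an estimate.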

Eelco de Groot · Post by **Eelco de Groot** » Mon Apr 05, 2010 2:47 am

Jim Walker wrote:"The surest way to remain ignorant is to be satisfied with what you think you know."

Here's another one:

"You don't know what you don't know."

Yes Jim, but:

Donald Rumsfeld: "There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we now know we don’t know. But there are also unknown unknowns. These are things we do not know we don’t know."

The "there are known knowns..." statement was made at a press briefing given by former US Defense Secretary Donald Rumsfeld on February 12, 2002. Mr Rumsfeld's statement relating to the increasingly unstable situation in post-invasion Afghanistan was widely viewed as elusive and indicative of arrogance, whilst at the same time reflecting a profound, almost philosophical truth. The statement won the 2003 Foot in Mouth award from the Plain English Campaign.

Or, a quote maybe even more confusing for us chessplayers: "Losing can be better than winning" - Raymond B. Furlong, from an article obviously studied by Donald Rumsfeld as well and maybe where the "known unknowns" was coming from:

The term was in use within the United States military establishment long before Rumsfeld's quote to the press in 2002. An early use of the term comes from a paper entitled Clausewitz and Modern War Gaming: losing can be better than winning by Raymond B. Furlong, Lieutenant General, USAF (Ret.) in the Air University Review, July-August 1984:

“To those things Clausewitz wrote about uncertainty and chance, I would add a few comments on unknown unknowns--those things that a commander doesn't even know he doesn't know. Participants in a war game would describe an unknown unknown as unfair, beyond the ground rules of the game. But real war does not follow ground rules, and I would urge that games be "unfair" by introducing unknown unknowns.”

All this found on http://en.wikipedia.org/wiki/Unknown_unknown

I would maybe add that if some GUIs give this particular endgame a symbol indicating a win, this is purely a GUI function. In the case of ChessBase, and I believe Shredder too, the values for these symbols can and should be adjusted for the program used. The chess program, the engine itself, knows nothing about these symbols. In the case of Stockfish I think I would put "decisive advantage" at no less than +3 and preferably somewhere around +5, and even that is known to be sometimes wrong, and not only for illegal positions thought up by Dr. Muller. But the (ChessBase?) GUI in question is probably used to the much lower values used by Rybka. In the Shredder UCI GUI you can also set some of these values. I'm not really sure where the Shredder GUI uses them, but I read, for instance when right-clicking in the Analysis window, that by default "White is winning" is only given for a value of +10.
Only from ChessBase do I know that the Informator-like symbols +-, +/= etc. are used for the regular engine output. Adjusting these values would already solve some problems of interpretation.
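
To make the GUI point concrete, here is a minimal sketch of how a front end might turn a numeric score into an Informator-style symbol. The function name `symbol_for` and the `clear` and `slight` thresholds are hypothetical, and no real GUI's code is being quoted. The only numbers taken from this post are the +3 suggested above for a "decisive advantage" and, in the comparison, a deliberately low cutoff standing in for a GUI tuned to smaller scores.

```python
def symbol_for(eval_pawns, decisive=3.0, clear=0.7, slight=0.25):
    """Map a score in pawns (White's point of view) to an Informator-style symbol.

    All thresholds are adjustable and purely illustrative.
    """
    magnitude = abs(eval_pawns)
    if magnitude >= decisive:
        white, black = "+-", "-+"    # decisive advantage
    elif magnitude >= clear:
        white, black = "+/-", "-/+"  # clear advantage
    elif magnitude >= slight:
        white, black = "+/=", "=/+"  # slight advantage
    else:
        return "="
    return white if eval_pawns > 0 else black


# The same +1.5 score reads very differently depending on the cutoff.
print(symbol_for(1.5))                # decisive cutoff at +3
print(symbol_for(1.5, decisive=1.4))  # hypothetical low cutoff
```

The engine reports the same number in both cases. Only the label attached by the GUI changes.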

Seriously though, in the case of fortress positions, I'm sure Rumsfeld would agree that here we are really in the realm of known unknowns. We know these fortresses exist, but it is just not easy to recognize them, not without human pattern recognition built into the chess program.

Eelco

michiguel · Post by **michiguel** » Mon Apr 05, 2010 3:06 am

lmader wrote:
Stephen Ham wrote:After all, all endgames are either decisive or drawn.
Well... in fact, any and all chess games are either decisive or drawn... eventually. Indeed, chess may be a forced draw, or a forced win for white, right from the first move.

What I think you are trying to say is that there reaches a point in a game where the outcome is unequivocally clear, or that with correct play, the outcome is known. You are calling this point "the endgame", and there is a lot of chess theory that allows humans to identify the outcome for many endgames. But certainly not all endgames.

Stephen Ham wrote:When a chess engine's evaluation shows the symbol for a winning advantage along with a confirming numeric display, whether it remains static or not after subsequent iterations, the engine thus claims a win.
So a computer program might see the outcome clearly at a certain point in the endgame and announce mate or an eval of 0 (draw). But at earlier stages in the game (even when still in the "endgame"), all engines (and humans) have farther to see, such that there will be a point that the eval will be indecisive. At this point an engine produces an eval that may be positive or negative, but this is not the engine announcing a win/loss. At that point it's akin to an eval at any other point in the game.

In other words, you wouldn't interpret a +1.5 score at move 15 as an announcement of a win for White. Similarly, an announcement of +1.5 in an "endgame" is the computer giving an indecisive evaluation because it hasn't (yet) determined the outcome conclusively.

What Stephen is trying to say is that engines are clueless in this position and completely useless for a chess player who tries to analyze it. You do not need a computer to count pawns, realize that you are up in material, and see that the knight is not in a good position. That is basically all the information you get with the +1.5. The problem is that many endgames require a completely different approach. They require planning, which is more related to retrograde analysis than to a tree search. Since most developers are more interested in gaining a few Elo points against other clueless engines than in developing new AI strategies, it is unlikely that the situation will change. Sorry Stephen, engines will continue to suck at this. This is exactly what Kasparov was talking about in a recent interview.

Miguel
