lithander wrote: ↑Sun Sep 04, 2022 3:39 pm
One class of blunders has to do with the endgame. For example, Leorik doesn't realize that a lone bishop or lone knight doesn't allow it to win. To Leorik it looks like one side is the value of a minor piece up (200-300cp). It will trade pawns away to gain this "advantage". I suppose I should encode knowledge for piece combinations that Leorik should treat as drawn instead of running its PST-based eval on them.

algerbrex wrote: ↑Sun Sep 04, 2022 6:32 pm
Yea, I noticed that when I was playing some games for fun between Leorik and Blunder. Many times Leorik would be reporting a +300 cp score in a completely drawn endgame because it doesn't have the knowledge yet that king and minor vs king is a draw. Should be a pretty straightforward fix. Especially since those examples are not just theoretical draws, but actual draws. For theoretical draws where it's still possible for the opponent to blunder, I just divide the evaluation by a constant like 16 to indicate to Blunder that that part of the tree leads to a drawn endgame, so it'll prefer to leave pawns on the board when possible.

I've written some code that generates the endgame classification of a given position and then researched which kinds of endgames are drawn despite a significant material advantage for the stronger side.
Code:
public static HashSet<string> Drawn = new()
{
    "KNvK", "KvKN",
    "KBvK", "KvKB",
    "KNNvK", "KvKNN",
    "KNNvKN", "KNvKNN",
    "KNNvKB", "KBvKNN",
    "KRvKN", "KNvKR",
    "KRvKB", "KBvKR"
};
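The class strings above follow a "white pieces, v, black pieces" pattern. A minimal sketch of how such a string could be built from per-side piece counts is shown here; the `EndgameClassSketch` type, the `Describe` method and the count-array layout are hypothetical and are not Leorik's actual `Notation.GetEndgameClass`, which isn't shown in this post.

```csharp
using System;
using System.Text;

public static class EndgameClassSketch
{
    // Piece letters in descending order of value; the kings are implied.
    private const string Order = "QRBNP";

    // white[i] and black[i] hold the number of pieces of type Order[i].
    public static string Describe(int[] white, int[] black)
    {
        var sb = new StringBuilder("K");
        AppendPieces(sb, white);
        sb.Append("vK");
        AppendPieces(sb, black);
        return sb.ToString();
    }

    private static void AppendPieces(StringBuilder sb, int[] counts)
    {
        // Append each piece letter as many times as that piece is present.
        for (int i = 0; i < Order.Length; i++)
            sb.Append(Order[i], counts[i]);
    }
}
```

With two white knights and one black bishop, `Describe(new[] { 0, 0, 0, 2, 0 }, new[] { 0, 0, 1, 0, 0 })` yields "KNNvKB", which matches one of the entries in the set.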
Wherever I previously used the static evaluation score directly, I now pass it through a little function:
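Code:
public int ScaleEndgame(int score)
{
    // None of the drawn classes has more than 5 pieces (kings included).
    int cnt = Bitboard.PopCount(Black | White);
    if (cnt > 5)
        return score;
    // Known drawn class: scale the score down to 1/8.
    if (Drawn.Contains(Notation.GetEndgameClass(this)))
        return score >> 3;
    else
        return score;
}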
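One detail of the scaling step that may be worth keeping in mind: in C#, `>>` on a signed `int` is an arithmetic shift and rounds toward negative infinity, while `/` truncates toward zero. The sketch below only illustrates that difference; `ScaleSketch` and its two methods are hypothetical names, and whether the asymmetry matters in practice is untested here.

```csharp
using System;

public static class ScaleSketch
{
    // Arithmetic right shift: rounds toward negative infinity,
    // so small negative scores never reach 0.
    public static int ScaleByShift(int score) => score >> 3;

    // Integer division: truncates toward zero,
    // so +x and -x are scaled symmetrically.
    public static int ScaleByDivision(int score) => score / 8;
}
```

For example, `ScaleByShift(300)` gives 37 but `ScaleByShift(-300)` gives -38, whereas `ScaleByDivision` gives 37 and -37.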
So this code should do what you suggested, right? And of course the implementation is horrendously inefficient. So I ran a self-test against the previous stable version where each engine gets unlimited time but is limited to a maximum of 2M nodes, which should factor engine speed out of the equation.
Code:
Score of Leorik 2.2.8e vs Leorik 2.2.6: 692 - 521 - 901 [0.540] 2114
... Leorik 2.2.8e playing White: 440 - 245 - 372 [0.592] 1057
... Leorik 2.2.8e playing Black: 252 - 276 - 529 [0.489] 1057
... White vs Black: 716 - 497 - 901 [0.552] 2114
Elo difference: 28.2 +/- 11.2, LOS: 100.0 %, DrawRatio: 42.6 %
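For reference, the reported Elo difference can be reproduced from the win/loss/draw counts with the usual logistic Elo model. This is a small sketch under that assumption; `EloSketch` and its methods are hypothetical names, and the +/- error margin from the output above is not computed here.

```csharp
using System;

public static class EloSketch
{
    // Fraction of points scored: a win counts 1, a draw counts 0.5.
    public static double Score(int wins, int losses, int draws)
        => (wins + 0.5 * draws) / (wins + losses + draws);

    // Logistic Elo model: score = 1 / (1 + 10^(-elo / 400)), solved for elo.
    public static double EloDifference(double score)
        => -400.0 * Math.Log10(1.0 / score - 1.0);
}
```

With the totals from the first line, `EloSketch.Score(692, 521, 901)` is about 0.540 and `EloSketch.EloDifference` of that value is about 28.2, in line with the reported numbers.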