
However, recently while testing king safety in self-play, I noticed something: the majority of decisive games are not being won or lost in the middlegame, but rather in the endgame - specifically rook endgames or endgames with a couple of pieces each. Usually Willow is shuffling around in a drawish piece endgame, then the eval suddenly spikes and within a couple of moves it drops a pawn or trades into a won/lost pawn endgame (it has no problems with pawn endgames themselves, solving Fine 70 in one second, for example).
It also gets stuck thinking it's winning in drawn endgames much more often than it should. I added a patch to stop it going into pawnless endgames like RvB, RvRN, etc., but it still has a lot of games that wind up in positions like this, where Willow thinks Black is completely winning and continues to think so until the 50-move rule finally steps in.
[fen]8/8/8/8/4rBbk/4Pp2/5K2/4N3 w - - 0 1[/fen]
What can I do to improve its endgame play? I have thought about using two-fold repetition checks at every node except the root, to stop it from delaying moves that reduce its advantage by repeating once. I've also thought about adding a scaled penalty (like "if pawns < 4 then eval = eval / (4 - pawns)") to prevent trading off too many pawns, but I worry that doing that may make Willow unable to see continuations that end in a winning Lucena, for instance.
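To make those two ideas concrete, here is a rough sketch in C. It is not Willow's code: the function names, the hash-history layout, and the choice of which pawns get counted are all assumptions made for illustration, and the penalty formula is read as eval / (4 - pawns).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: shrink a centipawn score as pawns come off.
 * Which pawns `pawns` counts (both sides, or only the stronger side)
 * is left open here; that is an assumption, not a decision. */
int scale_low_pawn_eval(int eval, int pawns)
{
    assert(pawns >= 0);
    if (pawns >= 4)
        return eval;            /* enough pawns left: score untouched */
    return eval / (4 - pawns);  /* 3 pawns: /1, 2: /2, 1: /3, 0: /4 */
}

/* Hypothetical helper: has the current position occurred once before?
 * hashes[0..ply] are assumed to hold the position keys from the start
 * of the game, with hashes[ply] being the current position.
 * halfmove_clock is the number of plies since the last capture or
 * pawn move, so nothing older than that can repeat.
 * The caller would skip this at the root (search ply 0). */
bool is_repetition(const uint64_t *hashes, int ply, int halfmove_clock)
{
    int oldest = ply - halfmove_clock;
    if (oldest < 0)
        oldest = 0;
    /* Same side to move only, and the nearest candidate is 4 plies back. */
    for (int i = ply - 4; i >= oldest; i -= 2) {
        if (hashes[i] == hashes[ply])
            return true;
    }
    return false;
}
```

With this scaling, a +400 eval stays +400 with three pawns, becomes +200 with two and +133 with one, which is exactly where the Lucena worry comes from: a genuinely winning one-pawn rook ending would be scored at a third of its value.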
It might also be that the search just needs to be improved somehow, though I already do most of the basic stuff like LMR, PVS, NMP if both sides have a piece, etc. In an average rook endgame Willow gets to about depth 15 in one second; there are no extensions, but it doesn't reduce checks. Maybe I should add passed pawn extensions? Or perhaps the depth it reaches just isn't enough to play endgames properly in fast testing?
As for evaluation, I've got PSTs, piece values, and mobility values for the endgame. I also have a passed pawn eval, but that's just a general 10*rank centipawn bonus. (Even that simple passed pawn eval was enough to give a >50 Elo gain in self-play, so perhaps improving it will help a lot?)
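For reference, the passed pawn term amounts to something like the sketch below. The function name and the rank convention (2 to 7, counted from the pawn owner's side) are assumptions for illustration; the actual indexing in Willow may differ.

```c
#include <assert.h>

/* Hypothetical helper: flat passed-pawn bonus in centipawns.
 * `rank` is assumed to be 2..7 from the pawn owner's point of view. */
int passed_pawn_bonus(int rank)
{
    assert(rank >= 2 && rank <= 7);
    return 10 * rank;   /* linear: every rank is worth the same 10 cp step */
}
```

Because the bonus is linear, the step from the 6th to the 7th rank is worth the same 10 cp as the step from the 2nd to the 3rd, and it takes no account of whether the pawn is blocked or where the kings are.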
I'd appreciate any advice. It's getting quite annoying to try and test king safety and space evals for the early middlegame, only for it to feel like any game that isn't very clearly won going into the endgame is a coinflip!