Really serious endgame issues

OliverBr
Posts: 813
Joined: Tue Dec 18, 2007 9:38 pm
Location: Munich, Germany
Full name: Dr. Oliver Brausch

Really serious endgame issues

Post by OliverBr »

Hi,

It has long been known that OliThink is weak in endgames (not using Syzygy tablebases or a neural net is no excuse for this bad play).
Now I have witnessed a horrible endgame in which White looked much better. At some point (around move 50) OliThink starts putting out a lot of "draw" evaluations, while the opponent (Qapla! lol) improves its score from move to move.

OliThink does not even have a special implementation for the endgame, but it botches it almost every time.
Who can help, using this game as an example?

Chess Engine OliThink: http://brausch.org/home/chess
OliThink GitHub:https://github.com/olithink

Re: Really serious endgame issues

Post by OliverBr »

Another issue: Why can't xboard (4.9.1) load the game and display the moves? What's wrong?

Re: Really serious endgame issues

Post by OliverBr »

Something else that has been very strange for a long time: all other engines in Amateur Division 11 are utterly outclassed when OliThink plays them on a Linux server or locally.

With the same opening as above and playing White, OliThink won 10 of 10 games handily against Qapla.
A bulk test against it also shows a difference of about 118 Elo between OliThink and Qapla:

   # PLAYER             :  RATING  ERROR  POINTS  PLAYED   (%)     W    D     L  D(%)  CFS(%)
   1 OliThink 5.11.4    :       0   ----  2295.0    3466  66.2  1892  806   768  23.3     100
   2 Qapla 0.3.2        :    -118     10  1171.0    3466  33.8   768  806  1892  23.3     ---

White advantage = 24.94 +/- 5.57
Draw rate (equal opponents) = 25.14 % +/- 0.75

The same holds for OliThink against Dumb, with an even larger gap of 137 Elo:

   # PLAYER             :  RATING  ERROR  POINTS  PLAYED   (%)     W    D     L  D(%)  CFS(%)
   1 OliThink 5.11.4    :       0   ----  2087.0    3047  68.5  1779  616   652  20.2     100
   2 Dumb 2.3           :    -137     12   960.0    3047  31.5   652  616  1779  20.2     ---

White advantage = 22.34 +/- 6.04
Draw rate (equal opponents) = 22.30 % +/- 0.80
In those tournaments, however, Dumb and Qapla perform better than OliThink. I have seen it live; the game above is an example.
What exactly is dragging OliThink's performance down so much in those tournaments? Can it be just the CPU? A smaller L1 cache? But 130 Elo??

Re: Really serious endgame issues

Post by OliverBr »

OliverBr wrote: Thu Jun 12, 2025 3:28 pm Another issue: Why can't xboard (4.9.1) load the game and display the moves? What's wrong?
It was actually a missing line break. Once I added about three line breaks, the PGN loaded. A strange bug for such great software.

Re: Really serious endgame issues

Post by OliverBr »

Now back to the endgame issue. Once I could load the PGN into xboard and have Stockfish 15.1 analyze it, my assumption was confirmed:

According to Stockfish, OliThink stood at +5.79 after move 56.
I am not sure how many here play chess. I am only a mediocre player (let's say 1800 Elo), but I was sure I would win this easily.
Here is the position after 56...Kf6:

[Board diagram: position after 56...Kf6, given by the FEN below]

8/3R4/p4k2/2pPp2p/1p2K1p1/1P1P2P1/r1P3P1/8 w - - 3 57
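
For readers without a GUI at hand, the piece-placement field of the FEN above can be expanded into a text board. The following is a minimal sketch; `fen_to_board` and the board layout are choices made here for illustration and are not taken from OliThink.

```c
#include <string.h>

/* Expand the piece-placement field of a FEN string into board[row][file].
   board[0] is rank 8, board[7] is rank 1; empty squares are '.'.
   Each row is NUL-terminated so it can be printed directly.
   Returns 0 on success, -1 if the field is malformed. */
static int fen_to_board(const char *fen, char board[8][9]) {
    int rank = 0, file = 0;
    for (int r = 0; r < 8; r++) {
        memset(board[r], '.', 8);
        board[r][8] = '\0';
    }
    /* Stop at the first space: side to move, castling etc. are ignored. */
    for (const char *c = fen; *c && *c != ' '; c++) {
        if (*c == '/') {
            if (file != 8 || ++rank > 7) return -1;
            file = 0;
        } else if (*c >= '1' && *c <= '8') {
            file += *c - '0';            /* run of empty squares */
            if (file > 8) return -1;
        } else if (strchr("pnbrqkPNBRQK", *c)) {
            if (file > 7) return -1;
            board[rank][file++] = *c;    /* uppercase = White, lowercase = Black */
        } else {
            return -1;
        }
    }
    return (rank == 7 && file == 8) ? 0 : -1;
}
```

Printing `board[0]` to `board[7]` line by line then gives a text diagram of the position with rank 8 at the top.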

And here is Stockfish's analysis:

37	+5.90 	423.5M	6:09.49	Rd6+ Kg7 Re6 Kf7 Rh6 Kg7 d6 Kxh6 d7 Ra1 d8=Q Rf1 Kd5 Rf5 Qd6+ Kg7 Qe7+ Kg6 Qe6+ Kg5 Kxc5 a5 Qg8+ Kh6 Kb6 Rg5 Qh8+ Kg6 Kxa5 Rf5 Qe8+ Kg7 Qe7+ Kh8 Qe6 Rf1 Qh6+ Kg8 Qg6+ Kf8 Qxh5 Kg7 Qg5+ Kh7 Qe7+ Kg6 
36	+5.79 	326.3M	4:43.62	Rd6+ Kg7 Re6 Kf7 Rh6 Kg7 d6 Kxh6 d7 Ra1 d8=Q Rf1 Kd5 Rf5 Qd6+ Kg7 Qe7+ Kg6 Qe6+ Kg5 Kxc5 a5 Qg8+ Kh6 Kb6 Rg5 Qh8+ Kg6 Kxa5 Rf5 Ka4 Rf7 Qe8 Kf6 Ka5 Rb7 Qxh5 Rb8 Ka4 Ra8+ Kb5 Rb8+ Ka5 
Now, for whatever reason, OliThink assumes a score of 0.00 (???) for the next moves and therefore plays bad moves, because "it's a draw anyway".
Here, instead of e.g. Rd6+, it plays Rh7 (???) and loses the whole advantage at once.

Now Stockfish says the complete advantage is gone:

53	  0.00 	231.2M	4:29.00	Rxc2 d6 Re2+ Kd5 Rd2 Rh6+ Kf7 Ke4 Rd1 d7 Ke7 Rd6 Kd8 Kxe5 Rd2 Ke6 Re2+ Kd5 Rxg2 Kc6 Rf2 Rh6 Rf8 Rxh5 Rf6+ Kxc5 Rf3 Rd5 Rxg3 Kc6 Rf3 Rd6 Rf8 Rg6 a5 d4 a4 bxa4 b3 Rxg4 b2 Rg1 Rf6+ Kd5 Kxd7 Rb1 Rd6+ Kc4 Rc6+ Kb5 Rd6 Rxb2 Rxd4 a5
 
It is a draw until move 63, where Stockfish suggests d7:

59	  0.00 	27.0M  	0:26.05	d7 Rd3 Rh6 Ke7 Rh7+ Kf6 Rh6+ Ke7 
I remember this move well because I was sure 63. d7 would be played, but no... it's 63. Rh7 (???). This time the draw score would be correct for 63. d7, but not for 63. Rh7.

Game lost, next move and score for Black:

36	+5.26 	168.5M	2:50.76	Kg6 Ra7 Rxg3 Rxa6 Rd3 d7 Kf7 Rb6 Ke7 Kxc5 Rd4 Rb7 b3 Rxb3 Kxd7 Rb6 Ke7 Rg6 Kf7 Rh6 Rd2 Kc4 Rxg2 Kd3 Rf2 Ke4 Rf1 Kxe5 g3 Rh7+ Kg6 Rh3 Rf5+ Ke4 g2 Rg3+ Rg5 Rxg2 Rxg2 Kf4 
Now, who can help with this issue? Does it perhaps happen with other engines too? Have you encountered such problems? I would like to hear about your experiences.

Re: Really serious endgame issues

Post by OliverBr »

Now the first blunder is 100% connected with a repetition score:

55. Rxd6+ ... Kf7
56. Rd7+  ...  Kf6
Now if:

57. Rd6+ ... Kf7
That would be the second occurrence of the position, and therefore 57. Rd6+ is not played (in order to avoid a threefold repetition). I have changed the repetition scoring before, but it didn't help. Maybe I have to look closer.

What exactly is the correct way to store a repetition score in the TT hash table? It doesn't know how many times this position has occurred before, or does it?
PS: I see now that this really is an issue. Funny that all those new engines probably handle it correctly, while a decades-old engine does not.
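
On the question above: a TT entry is keyed by the position alone, so it cannot know how often the position occurred on the current path. One common convention, sketched here under assumptions and not taken from OliThink, is to detect repetitions from a separate key history before probing the TT and to return the draw score directly, without storing it. The `History` struct and `is_repetition_draw` are hypothetical names; the rule "one repetition inside the search tree counts as a draw, two are needed if they lie in the game history" is also just one widely used choice.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_HISTORY 1024

/* Hypothetical history of Zobrist keys: game positions followed by the
   positions on the current search path; the current position is last. */
typedef struct {
    uint64_t keys[MAX_HISTORY];
    int count;     /* number of keys stored */
    int root;      /* index of the search root within keys[] */
    int halfmove;  /* plies since the last capture or pawn move */
} History;

/* Returns 1 if the current position should be scored as a repetition draw. */
static int is_repetition_draw(const History *h) {
    assert(h->count > 0 && h->count <= MAX_HISTORY);
    int cur = h->count - 1;
    int first = cur - h->halfmove;   /* nothing older can repeat */
    if (first < 0) first = 0;
    int seen = 0;
    /* Same side to move only, and a repetition needs at least 4 plies. */
    for (int i = cur - 4; i >= first; i -= 2) {
        if (h->keys[i] != h->keys[cur]) continue;
        if (i > h->root) return 1;   /* repeated inside the search tree */
        if (++seen >= 2) return 1;   /* third occurrence overall */
    }
    return 0;
}
```

Even with this ordering, scores stored for parent nodes can still inherit some path dependence (the graph-history interaction problem), so this reduces wrong draw scores in the table rather than ruling them out.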
Bjoern
Posts: 16
Joined: Mon Sep 26, 2016 8:59 pm

Re: Really serious endgame issues

Post by Bjoern »

Hi,

I am not looking at other sources, so there is probably a better solution than mine.

In the TT I also store the distance from the root position. If the distance from the root is higher than
the one in the TT, I do not overwrite the score. This avoids false scores in the TT in my case.
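
One possible reading of this scheme, as a minimal sketch: all names (`TTEntry`, `tt_store`, `tt_probe`, `TT_SIZE`) and the table layout are assumptions made for illustration, not Björn's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical TT entry that also remembers the distance from the root. */
typedef struct {
    uint64_t key;
    int16_t  score;
    uint8_t  depth;  /* remaining search depth when stored */
    uint8_t  ply;    /* distance from the root when stored */
    uint8_t  used;
} TTEntry;

#define TT_SIZE (1u << 16)   /* power of two so the key can be masked */
static TTEntry tt[TT_SIZE];

static void tt_store(uint64_t key, int score, int depth, int ply) {
    TTEntry *e = &tt[key & (TT_SIZE - 1)];
    /* Same position already stored closer to the root: keep that score. */
    if (e->used && e->key == key && ply > e->ply)
        return;
    e->key   = key;
    e->score = (int16_t)score;
    e->depth = (uint8_t)depth;
    e->ply   = (uint8_t)ply;
    e->used  = 1;
}

static const TTEntry *tt_probe(uint64_t key) {
    const TTEntry *e = &tt[key & (TT_SIZE - 1)];
    return (e->used && e->key == key) ? e : NULL;
}
```

In this sketch the stored ply is only meaningful within one search; a real engine would have to age or clear entries when the root changes.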

Looking forward to omitting this TT data in the future....

Best regards
Björn

Re: Really serious endgame issues

Post by OliverBr »

Hi Björn,

Thank you for your reply. I will look into it.

But I have since learned that the repetition/TT issue is not actually the reason for 57. Rh7.
The engine really thinks it is the best move. And it is not alone: a lot of strong (non-NNUE) engines share this opinion until quite late.
Even Stockfish goes for it at first. But 57. Rh7 throws away the win. This is most disturbing.
So maybe 57. Rh7 was a difficult move in a difficult endgame.

But this doesn't justify 63. Rh7 instead of 63. d7, because 63. d7 is preferred by most engines and by the naked human eye.

Maybe it's solved like this:

if (move == 'Rh7') score -= 32; 

Re: Really serious endgame issues

Post by OliverBr »

This looks reasonable now: OliThink only arrives at 63. Rh7 with TT hash table pruning. Without it, it plays 63. d7 or 63. Rxe5, both resulting in a draw.

Re: Really serious endgame issues

Post by OliverBr »

OliverBr wrote: Thu Jun 12, 2025 11:03 pm This looks reasonable now: OliThink comes only to 63. Rh7 with TT-Hashtable pruning. Without it, it moves 63. d7 or 63 Rxe5, both resulting in a draw.
But the TT pruning wasn't the reason. It's just a coincidence that changing anything in the search/eval leads to moves other than 63. Rh7. It looks like it's another issue. In this case an endgame tablebase or a neural net would avoid the bad move.