Endgame fortress handling

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Wed Mar 25, 2015 5:21 pm

I am taking the liberty of opening a new thread and reposting 2 of my messages from the next thread, as the topic concerns solving endgame fortresses, a very old and nasty problem for engines of all strengths.

[d]8/8/4k3/4P3/2bP4/2P1K3/3B4/8 w - - 0 1

Well, this is a draw. Presumably no engine uses 7-men tbs, so, as a general eval rule does not help here, the only way to keep your engine from looking stupid is this: if White has a score advantage of more than 3 full pawns and the score does not rise by more than half a pawn over the next 10 moves, the position is simply a draw.

There are no pawn sacrifices here that the search could not see.

In this case, after 10 played moves, your engine will show a 0.0 score: no elo gain, but how nice for a human to watch!
Instead, the other option is to shuffle for another 50 moves with huge artificial winning scores.

[d]6k1/5p2/4r2p/8/3Q4/8/6P1/6K1 w - - 0 1

Typical endgame fortress.

Eval changes might solve similar fortresses, but you have to specify a lot.

Instead, simply declaring a draw when you see that the score is above 3 full pawns in the late endgame and has not risen by more than half a pawn in the last 10 moves should be the right solution.
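A minimal sketch of the rule just described, assuming the engine keeps one score per own move, in centipawns, from the leading side's point of view. The function names and the list-based history are illustrative assumptions, not taken from any engine.

```python
from typing import Sequence


def looks_like_fortress(scores_cp: Sequence[int],
                        lead_cp: int = 300,
                        margin_cp: int = 50,
                        window_moves: int = 10) -> bool:
    """scores_cp holds one score per played move, oldest first (assumed layout)."""
    # We need the score from `window_moves` moves ago plus the current one.
    if len(scores_cp) < window_moves + 1:
        return False
    then, now = scores_cp[-(window_moves + 1)], scores_cp[-1]
    # Rule from the post: big lead, but no real rise over the last 10 moves.
    return now > lead_cp and now - then <= margin_cp


def reported_score(scores_cp: Sequence[int]) -> int:
    """Show 0 once the position is classified as a fortress, else the raw score."""
    return 0 if looks_like_fortress(scores_cp) else scores_cp[-1]
```

With a history such as 320, 325, ..., 355 over eleven moves, `reported_score` would return 0 instead of 355.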

How nice to see an engine showing 0.0 here after 10 played moves!

Otherwise, without 7-men tbs, we will witness 50 moves of shuffling back and forth, with scores trying to convince us White is winning.

Actually, I do not see an exception to this rule in late endgame, does anyone?

OK, this does not gain you elo, as it does not cut the tree, does not pick optimal moves, and might even lose half an elo or so, as an additional rule is specified, but play will certainly look much more appealing and human-like.

Actually, I am not certain specifying such a rule will lose more than 0.25 elo.

[d]8/1b6/8/8/7p/3k2pP/6P1/5K2 w - - 0 1

Seemingly, the rule also applies in Vincent's position above: Black leads by more than 3 full pawns in terms of score, over the next 10 moves the score does not improve by more than half a pawn, therefore the game is a draw.

This is possible of course, as in the late endgame, where most fortress positions occur, there are simply no subtle pawn breaks available that the search could not easily see, in sharp distinction to rich mg positions, where such breaks might easily be missed by engines.

So I really think the above rule is good to apply in engines, even if it loses 0.5 elo in the process.

Actually, does that not more or less solve most endgame fortress positions, a crux of computer chess for a very long time?

If you are close to such an endgame position in the tree, and history data after a search suggests that from ply 10 until ply 30 the score for this position does not increase by more than half a pawn, while your eval tells you that you are leading by more than 3 full pawns, you might use this information to take pruning decisions: you know this is just a draw, and maybe there are other lines with scores below 3 full pawns that should be preferred instead.

If this is the case, a similar patch with such a rule could even gain you some elo.

What do you think, Daniel?

Even if Daniel disagrees, I am sure Carl will support me on this.

So, in short, the suggestion to try solving most endgame fortress positions runs as follows:

you declare the game as a draw, whenever the following conditions are met:

- non-pawn material is less than 1/4 total non-pawn material
- you are leading by more than 300cps in score
- in the last 10 moves/20 plies the score has not improved by more than half a pawn
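The three conditions above can be sketched as follows. This is only an illustration: the 9/5/3/3 piece values, the reading of "total non-pawn material" as both sides' starting material (62 points), and the function names are all assumptions on my side.

```python
# Conventional piece values (an assumption; the post does not give any).
PIECE_VALUE = {"q": 9, "r": 5, "b": 3, "n": 3}
# Non-pawn material of both sides in the starting position: 2 * 31 = 62.
START_NON_PAWN = 2 * (9 + 2 * 5 + 2 * 3 + 2 * 3)


def non_pawn_material(fen: str) -> int:
    """Sum the values of all queens, rooks, bishops and knights on the board."""
    board = fen.split()[0]  # only the piece-placement field of the FEN
    return sum(PIECE_VALUE.get(ch.lower(), 0) for ch in board)


def is_late_endgame(fen: str) -> bool:
    """Condition 1: non-pawn material below 1/4 of the starting total."""
    return non_pawn_material(fen) * 4 < START_NON_PAWN


def declare_draw(fen: str, score_now_cp: int, score_10_moves_ago_cp: int,
                 lead_cp: int = 300, margin_cp: int = 50) -> bool:
    """All three conditions from the list must hold at the same time."""
    return (is_late_endgame(fen)
            and score_now_cp > lead_cp
            and score_now_cp - score_10_moves_ago_cp <= margin_cp)
```

For the first diagram of this thread, `non_pawn_material` gives 6 (two bishops), so the position counts as a late endgame under these assumptions.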

If you are close to such a fortress position in the tree, leading by more than 300cps, and from ply 5 until ply 25 the score has not increased by more than 50cps, you might even use this rule to take pruning decisions, abandoning this sterile drawish line and instead preferring a line with a smaller eval score that is more likely to win.
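One way to picture this in-tree use is a toy line selector. It assumes, hypothetically, that the search can hand back a score for each ply along a candidate line; real engines do not store scores in this form, and the move names in the usage note are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Line:
    move: str                 # first move of the candidate line
    scores_by_ply: List[int]  # score in cp at each ply along the line (assumed data)


def is_sterile(line: Line, lead_cp: int = 300, margin_cp: int = 50,
               first_ply: int = 5, last_ply: int = 25) -> bool:
    """Big lead at ply 25, but no rise of more than 50 cp since ply 5."""
    s = line.scores_by_ply
    if len(s) <= last_ply:
        return False  # line too short to judge
    return s[last_ply] > lead_cp and s[last_ply] - s[first_ply] <= margin_cp


def effective_score(line: Line) -> int:
    """Treat a sterile line as a draw (0); otherwise use its final score."""
    return 0 if is_sterile(line) else line.scores_by_ply[-1]


def pick_line(lines: List[Line]) -> Line:
    """Prefer the line with the best effective score."""
    return max(lines, key=effective_score)
```

A line stuck at +310 for 26 plies would then lose out to a line that climbs steadily from +100 to +225, while still being preferred over a line scoring -50.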

So, after all, after appropriate tuning, this might even gain you elo.

What do you think of the above suggestions?

Vinvin · Post by **Vinvin** » Wed Mar 25, 2015 5:48 pm

I'm fairly sure I saw a patch in Stockfish testing ( http://tests.stockfishchess.org/tests ) to progressively reduce the score if there's no progress (a pawn push or a capture) after 30 moves (60 plies deep).
That would solve position number 1 and number 3.
Number 2 takes a bit longer because the g pawn can be pushed and then the black g pawn can be taken too ...
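For illustration, a progressive reduction of this kind could look like the sketch below. This is not the Stockfish patch referred to above; the linear taper and the parameter names are assumptions. `halfmove_clock` is the usual counter of plies since the last pawn move or capture, as in the fifth FEN field.

```python
def damped_score(score_cp: int, halfmove_clock: int,
                 start_ply: int = 60, draw_ply: int = 100) -> int:
    """Scale the score towards 0 once there has been no progress for 30 moves."""
    if halfmove_clock <= start_ply:
        return score_cp  # progress was recent enough, keep the raw score
    if halfmove_clock >= draw_ply:
        return 0         # 50-move rule reached: a draw can be claimed
    remaining = draw_ply - halfmove_clock
    # Linear taper (assumed shape); int() truncates towards zero for both signs.
    return int(score_cp * remaining / (draw_ply - start_ply))
```

A +400 score would be shown as +200 after 40 moves without progress and as 0 once the 50-move limit is reached.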

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Wed Mar 25, 2015 6:08 pm

Vinvin wrote:I'm fairly sure I saw a patch in Stockfish testing ( http://tests.stockfishchess.org/tests ) to progressively reduce the score if there's no progress (a pawn push or a capture) after 30 moves (60 plies deep).
That would solve position number 1 and number 3.
Number 2 takes a bit longer because the g pawn can be pushed and then the black g pawn can be taken too ...

That was pushed by Joerg and it failed, but I think it concerned all mg and eg positions alike, and, as we discussed with Daniel, mg positions should necessarily be excluded, as they are too complex and there are always possible pawn breaks there which the engine might not see for quite a while.

So I think a similar patch could work only for the endgame, where there are no reasonable pawn breaks that the engines will not see in their search.

Mg positions are simply too complex for a similar patch to apply there.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Wed Mar 25, 2015 6:15 pm

Other late eg positions:

[d]8/1p1k4/pPp1p1p1/P1PpPpPp/3P1P1P/8/3B4/6K1 w - - 0 1

does this rule see the draw above?

yes

[d]8/1p1k4/pPp1p1p1/P1PpPpPp/3P1P1P/8/1R1B4/6K1 w - - 0 1

or here?

yes

[d]8/3k4/p1p1p1p1/PpPpPpPp/1P1P1P1P/8/3Q4/6K1 w - - 0 1

here?

yes

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Wed Mar 25, 2015 6:44 pm

other eg fortress positions:

[d]8/2k1r3/1p6/1Pb5/2P5/3Q4/6K1/8 w - - 0 1

does this rule see the draw above?

yes

[d]8/1p2ppk1/1Pn4p/p1P1n1pP/P3P1P1/4Q3/8/6K1 w - - 0 1

here?

yes

[d]6k1/5p2/p5n1/P7/7p/5p1P/1Q2bP1P/6K1 w - - 0 1

here?

yes

So, I think more or less all endgame fortress positions would obey this rule.

Of course, this would be true only of positions with eval score larger than 300cps, but you could also experiment with lower score advantages, say 150cps, to include some R vs minor piece fortresses.

Of course, in the latter case, the 50cps increase of score over the last 10 moves could be lowered to just some 20-30cps, as there will be fewer available eval terms.

Please note that the 50cps margin is a necessary prerequisite in order to ensure that the game really goes nowhere as depth increases. If you instead just checked whether the score has not increased at all, this might be wrong, as eval scores change a bit even in exhausted fortress positions, with changing piece mobility, psqt, some attacks, etc.

So you really need that 50cps condition.
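A small worked example of why the margin matters, using made-up scores that wobble by a few centipawns the way mobility and psqt terms do in a dead position. The helper name and the sample numbers are illustrative assumptions.

```python
def no_real_progress(scores_cp, margin_cp):
    """True if the latest score exceeds the oldest one by at most margin_cp."""
    return scores_cp[-1] - scores_cp[0] <= margin_cp


# Invented scores over 10 moves of shuffling in a fortress (cp).
wobbling = [312, 318, 309, 321, 315, 324, 311, 319, 326, 317, 322]

# A strict "has not increased at all" test is fooled by the +10 cp of noise ...
strict = no_real_progress(wobbling, margin_cp=0)
# ... while the 50 cp margin still classifies the position as going nowhere.
with_margin = no_real_progress(wobbling, margin_cp=50)
```

For the smaller 150cps lead mentioned above, the same helper could be called with a margin of 20-30cps.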

I also think that the search approach to solving endgame fortresses is the right one, as there are simply too many different types of endgame fortress positions and, in order to capture them all in eval, you need to add volumes of eval.

Ferdy · Post by **Ferdy** » Wed Mar 25, 2015 7:00 pm

Back then I tried to find which engine has more reliable scores in some fortress positions. The closer to zero the better. I will add new positions to the collection and run these engines again, especially now that there are also new engine versions.

Link:
http://talkchess.com/forum/viewtopic.ph ... 24&t=54697

A. Platform: 
System   : Windows 
Release  : 7 
Version  : 6.1.7601 
Machine  : AMD64 
processor: Intel64 Family 6 Model 42 Stepping 7, GenuineIntel 

B. Engine parameters: 
Threads  : 1 
Hash     : 64mb 
Time/pos : 1000ms 

C. Test settings: 
Total engine count  : 45 
Total positions     : 8 (input file: test.fen) 
Total max points    : 800 
Estimated total time: 8 pos x 1000ms/pos = 8000 ms 

D. Summary high points is better: 
 1 id name Fire 4 x64                         (time 6101 ms, Points 537, ratio  67.1%) 
 2 id name Gull 3 x64                         (time 8000 ms, Points 475, ratio  59.4%) 
 3 id name Houdini 4 x64                      (time 8000 ms, Points 472, ratio  59.0%) 
 4 id name Critter 1.6a 64-bit                (time 6110 ms, Points 466, ratio  58.2%) 
 5 id name Strelka 6 w32                      (time 8000 ms, Points 466, ratio  58.2%) 
 6 id name Komodo 6 64-bit                    (time 7203 ms, Points 358, ratio  44.8%) 
 7 id name Texel 1.04 64-bit                  (time 6930 ms, Points 341, ratio  42.6%) 
 8 id name Stockfish 131214 64 POPCNT         (time 7139 ms, Points 296, ratio  37.0%) 
 9 id name HIARCS 14 WCSC                     (time 6161 ms, Points 269, ratio  33.6%) 
10 id name Bouquet 1.8 x64                    (time 8112 ms, Points 266, ratio  33.2%) 
11 id name Hannibal 1.4x64                    (time 5572 ms, Points 260, ratio  32.5%) 
12 id name Booot 5.2.0(64)                    (time  140 ms, Points 212, ratio  26.5%) 
13 id name Equinox 3.30 x64mp                 (time 6660 ms, Points 204, ratio  25.5%) 
14 id name Deuterium v14.4.35.17 64bit POPCNT (time 6902 ms, Points 200, ratio  25.0%) 
15 id name Fruit reloaded 2.1                 (time 6099 ms, Points 199, ratio  24.9%) 
16 id name Octochess revision 5190            (time 4945 ms, Points 186, ratio  23.2%) 
17 id name Amyan 1.72                         (time   80 ms, Points 173, ratio  21.6%) 
18 id name Naum 4.6                           (time 6726 ms, Points 163, ratio  20.4%) 
19 id name Protector 1.7.0                    (time 7488 ms, Points 159, ratio  19.9%) 
20 id name Spike 1.4                          (time 7488 ms, Points 130, ratio  16.2%) 
21 id name DiscoCheck 5.2.1                   (time 6705 ms, Points 127, ratio  15.9%) 
22 id name Ruffian 1.0.5                      (time 5920 ms, Points 115, ratio  14.4%) 
23 id name Yace 0.99.87                       (time 7936 ms, Points 115, ratio  14.4%) 
24 id name Gaviota v1.0                       (time 6661 ms, Points 110, ratio  13.8%) 
25 id name Andscacs 0.71                      (time 5981 ms, Points  96, ratio  12.0%) 
26 id name Maverick 0.51 x64                  (time 6193 ms, Points  94, ratio  11.8%) 
27 id name Nebula 2.0                         (time 6490 ms, Points  93, ratio  11.6%) 
28 id name cheng4 0.36c                       (time 6666 ms, Points  90, ratio  11.2%) 
29 id name Deuterium v14.3.34.130             (time 7316 ms, Points  90, ratio  11.2%) 
30 id name Nemo SP64o 1.0.1 Beta              (time 8000 ms, Points  88, ratio  11.0%) 
31 id name AnMon 5.75                         (time 7083 ms, Points  85, ratio  10.6%) 
32 id name Rybka 2.3.2a mp                    (time 5625 ms, Points  80, ratio  10.0%) 
33 id name Rodent 1.6 (build 6)               (time 5835 ms, Points  73, ratio   9.1%) 
34 id name Senpai 1.0                         (time 7032 ms, Points  70, ratio   8.8%) 
35 id name Arasan 17.4                        (time 8149 ms, Points  67, ratio   8.4%) 
36 id name Rhetoric 1.4.1 x64                 (time 5505 ms, Points  66, ratio   8.2%) 
37 id name Bobcat 3.25                        (time 6503 ms, Points  49, ratio   6.1%) 
38 id name GreKo 12.1                         (time 6255 ms, Points  49, ratio   6.1%) 
39 id name Vajolet2 1.45                      (time 6844 ms, Points  41, ratio   5.1%) 
40 id name spark-1.0                          (time 8112 ms, Points  37, ratio   4.6%) 
41 id name DisasterArea-1.54                  (time 6724 ms, Points  32, ratio   4.0%) 
42 id name Daydreamer 1.75 JA                 (time 6030 ms, Points  19, ratio   2.4%) 
43 id name GNU Chess 5.60-64                  (time 6357 ms, Points  14, ratio   1.8%) 
44 id name Quazar 0.4 x64                     (time 7207 ms, Points   8, ratio   1.0%) 
45 id name iCE 2.0 v2240 x64/popcnt           (time 2028 ms, Points   6, ratio   0.8%) 

E. Positions: 
 1 6k1/8/6PP/3B1K2/8/2b5/8/8 b - - 0 1 
 2 8/8/r5kP/6P1/1R3K2/8/8/8 w - - 0 1 
 3 7k/R7/7P/6K1/8/8/2b5/8 w - - 0 1 
 4 8/8/5k2/8/8/4qBB1/6K1/8 w - - 0 1 
 5 8/8/8/3K4/8/4Q3/2p5/1k6 w - - 0 1 
 6 8/8/4nn2/4k3/8/Q4K2/8/8 w - - 0 1 
 7 8/k7/p7/Pr6/K1Q5/8/8/8 w - - 0 1 
 8 k7/p4R2/P7/1K6/8/6b1/8/8 w - - 0 1 

F. Point System: 
score <= abs(50)  : 100 points 
score <= abs(100) : 61 - 70, points 
score <= abs(150) : 51 - 60, points 
score <= abs(200) : 41 - 50, points 
score <= abs(250) : 31 - 40, points 
score <= abs(300) : 21 - 30, points 
score <= abs(350) : 11 - 20, points 
score <= abs(400) :  1 - 10, points 
Other scores      :  0 points 

G. Engine that does not report time: 
 1 id name Gull 3 x64                  
 2 id name Nemo SP64o 1.0.1 Beta
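The point system in section F could be approximated as below. How points are spread inside each band (for example 61 - 70) is not stated in the output, so the linear spread used here is an assumption, and this is not the code of the actual test tool.

```python
def points_for_score(score_cp: int) -> int:
    """Map an engine score in centipawns to points; closer to zero is better."""
    a = abs(score_cp)
    if a <= 50:
        return 100
    if a > 400:
        return 0
    band = (a - 51) // 50      # 0 for 51..100, 1 for 101..150, ..., 6 for 351..400
    top = 70 - 10 * band       # best points in that band: 70, 60, ..., 10
    offset = (a - 51) % 50     # position inside the band, 0..49
    # Assumed linear spread: top points at the band's start, top - 9 at its end.
    return top - offset * 9 // 49
```

Under this assumption a score of 100 cp earns 61 points and a score of 400 cp earns 1 point, matching the band limits in the table.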

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Wed Mar 25, 2015 8:46 pm

Ferdy wrote:Back then I tried to find which engine has more reliable scores in some fortress positions. The closer to zero the better. I will add new positions to the collection and run these engines again, especially now that there are also new engine versions.

Thanks Ferdinand.

As I composed the above positions rather quickly, it might be the case that at least one of them is faulty.
Actually, if there is such a position, it should be the one with Q vs 2 knights. I did not quite pay attention to the fact that the white queen can penetrate through f5, and if e6 to prevent that, go back and penetrate through d6. So, in order to be certain the position is correct, I should check tactical lines, but I do not quite have the time to do that now. Instead, if you want to keep that position, you might simply add one black pawn on f4, one white pawn on f3, and move the white queen to e2 - this is certainly a draw, with no penetration possible.

I hope other positions are correct.

lech · Post by **lech** » Wed Mar 25, 2015 9:25 pm

Engines can recognize some fortress positions very well. They show it through a stable score that stays unchanged over many depths. Only machines or weak chess players don't understand it.

The only problem arises when reaching the fortress requires a piece sacrifice. Sometimes the search is not able to return the correct move. I tried to address this in the latest Sting SF 4.8.4, and it works in many positions.

cdani · Post by **cdani** » Wed Mar 25, 2015 9:32 pm

Lyudmil Tsvetkov wrote:What do you think, Daniel?

That I will try

hgm · Post by **hgm** » Wed Mar 25, 2015 11:13 pm

I don't think having to search 10 moves (20 ply!) to recognize these positions as a draw is the right approach. Because they are all very easily recognizable as fortresses, they could already be recognized as draws at 0 ply. It is just a matter of having some knowledge in the eval, and wrapping it such that it won't take measurable time to apply it.

The positions you show have cheap and easy triggers:
*) unlike Bishops
*) connected passers vs Bishop
*) almost all value in a single piece
*) King trapped
*) interlocked Pawn chains of >6 Pawns

These are things that almost any evaluation looks at anyway (e.g. when evaluating King safety, Pawn structure or material balance). They could be used to trigger a slightly more elaborate analysis of the positions. Like if Queen and non-passer are your only possession, can you ever hope to gobble up an unprotected piece? (As even a 2-for-1 trade means the end of your ambitions.) Well, if every opponent piece is already protected (also something that can be easily checked), the chances for that seem slim. And your King is cut off from his. (Slightly more difficult to figure out, but the chances that you would pass the lone-Queen + all-protected test are already so slim that you would virtually never have to do it.)
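Two of the triggers listed above (unlike Bishops, interlocked Pawn chains) can be sketched directly on a FEN string. This is only an illustration of how cheap such tests are; the function names are assumptions, and "interlocked" is simplified here to pawns blocked head-on by an enemy pawn.

```python
def parse_board(fen: str) -> dict:
    """Return {(file, rank): piece} with file and rank in 0..7 (a1 = (0, 0))."""
    squares = {}
    for row, rank_str in enumerate(fen.split()[0].split("/")):  # row 0 is rank 8
        file = 0
        for ch in rank_str:
            if ch.isdigit():
                file += int(ch)          # run of empty squares
            else:
                squares[(file, 7 - row)] = ch
                file += 1
    return squares


def opposite_bishops(squares: dict) -> bool:
    """Each side has exactly one bishop, and they stand on different colours."""
    white = [sq for sq, p in squares.items() if p == "B"]
    black = [sq for sq, p in squares.items() if p == "b"]
    if len(white) != 1 or len(black) != 1:
        return False
    return (white[0][0] + white[0][1]) % 2 != (black[0][0] + black[0][1]) % 2


def blocked_pawns(squares: dict) -> int:
    """Count pawns that face an enemy pawn directly in front of them."""
    count = 0
    for (file, rank), piece in squares.items():
        if piece == "P" and squares.get((file, rank + 1)) == "p":
            count += 2  # the white pawn and the black pawn blocking it
    return count
```

On the first diagram of the thread `opposite_bishops` fires, and on the fully locked position with the lone bishop `blocked_pawns` returns 16, well above the ">6 Pawns" trigger.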

If engines do not see this, it just means they are stupid. As mentioned before, a high Elo is no cure against stupidity. There just aren't any cheap search tricks that can substitute for true knowledge. (Not even expensive ones!)
