Even crazier

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Semi, but real

Post by bob »

Lyudmil Tsvetkov wrote:
bob wrote:In your first diagram, how is that a pawn NOT weak? If it stays where it is, it can't be defended by a pawn. If it advances, it becomes an instant isolani on an open file...

It is blocked just as surely as it would be by a ram in front of it here. You are trying to get into dynamic chess terms and evaluate them statically. I don't think the a-pawn's value (either positive or negative) can be determined without a search to see whether it can be attacked and won, or if there is some advantage in advancing it.
Not all pawns that are not defended by another pawn are weak; they could be defended by pieces, etc. One and the same pawn could be at the same time backward, weak, unopposed, isolated, etc. Unopposed is just one additional quality of the pawn that provides added value. Why consider just some features of a pawn and not others? The unopposed quality of the pawn will receive a bonus for its ability to advance more easily; with a rook behind it, for example, you could use it to open a file where otherwise that would be impossible or more difficult.

If there are no terms whose real value could be determined without an extensive search, why have evaluation at all?

Why have passer evaluation, when you could just stick to piece material values?
Then why have potential passers, is not this redundant? I think you might know better than me, but I am sure potential passers had come into wider use much later than passers, am I right? And many still do not do them. If you do potential passers, why not do unopposed pawns too? They are just a more sophisticated expression of the same idea. Without trying, many would have thought there is no added value to potential passers, but then they see it is not like that.

If you do potential passers, why not do semi-backward pawns? The relation of passers to potential passers is almost the same as the one of backward pawns to semi-backward pawns. It is absolutely the same. Semis are just extension of the idea. And both are real. If you have added value with potential passers, quite probably you will have added value also with semi-backward pawns.

Well, another example.

[D]8/3n2k1/5n2/1P6/2Pb4/1Q6/6K1/8 w - - 0 1

I already posted this, suggesting that to decide the position is a draw, one has to resort to a special eval term that I would call intensity of interaction, i.e., giving some bonus for squares mutually controlled by pieces. In endgames like that, the mutual control of squares by pieces of lower power, especially in front of passers, is very important. I do not know if some book author has already mentioned this term; the name of the term is not of significance, what matters is that the term is real and represents existing chess knowledge.

How are you going to make your engine know this ending is drawn, if you do not have some special refined term to guide it? Search is very difficult to help here, so what are you going to do? Those are real weaknesses of engines, so why close our eyes to them?
You can't make the evaluation code arbitrarily complex, or you will get tactically crushed. Evaluating this as a draw is very difficult, because change the pawns just a bit and it is not so drawish any longer. You can't do the quick and dirty 3 pieces vs Q+PP is drawn, because there are lots of positions where it is not a draw. Screw up many of those and you lose games you should draw, because you choose to reach a position your eval says draw but it is wrong. A few of those and your Elo drops. This is dynamic stuff, it doesn't belong in the static evaluation.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Semi, but real

Post by bob »

Advance both pawns 2 ranks, make it black to move. What now?
syzygy
Posts: 5569
Joined: Tue Feb 28, 2012 11:56 pm

Re: Semi, but real

Post by syzygy »

Lyudmil Tsvetkov wrote:
syzygy wrote:
Lyudmil Tsvetkov wrote:
syzygy wrote:
Lyudmil Tsvetkov wrote:[D]6k1/1pp5/1p6/8/PPP5/8/8/6K1 w - - 0 1
Do you not see, an unopposed pawn could be worth something because of its ability to advance freely. Instead of assigning just a penalty for doubled pawns, you could assign a smaller penalty for doubled pawns and also a small bonus for unopposed ones; in this way the chess logic will be better reflected.
Do you not see that this is just adding redundant terms to the evaluation?
It is quite the opposite, it is adding refined terms.
One more doubled pawn for one side means one more unopposed pawn for the other side... It is redundant.
The idea would be to assess every single meaningful detail in the position, instead of just big bonus obvious terms. (...)
Hardly a new idea. This is how evaluation functions have worked since forever.
There is nothing new here, it is just that the suggestions look a bit strange, because they are untested and unfamiliar.
I don't see anything new in the "apex" and "unopposed" concepts. Evaluation functions already take such aspects into account.

But if you agree that nothing is new here, why argue that chess programming should start from scratch with a new paradigm (or at least agree with those that argue this)?
Quite probably, within the Stockfish environment, introducing a new term would be very difficult to test without changing significantly the structure and tuning many parameters simultaneously. Quite probably, superficial testing with such a term will fail, but that is a real and significant term. (...)
This is part of what the Stockfish developers are doing all the time.
You really amaze me.
You ask me why consider passers when we have already considered the pawn as material that is a passer at the same time?
What are you talking about?
If you have an odd number of pawns, one more unopposed pawn would not add another doubled one.
Obviously keep the number of white/black pawns the same...
If terms like apex and unopposed pawns are already taken into account, why do you consider them as new, strange and unnecessary, redundant?? (sorry for the 2 question marks, but I am really amazed at how the discussion proceeds)
If they are already taken into account, then adding a term for them leads to a redundant term. Is this difficult?
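
The doubled/unopposed bookkeeping argued over above can be checked on the posted pawn position with a small sketch. It is only an illustration under stated assumptions: a pawn is treated as "unopposed" when no enemy pawn stands in front of it on the same file, a "doubled" pawn is any pawn beyond the first on a file, and the helper names `pawn_files`, `doubled_count` and `unopposed_count` are hypothetical, not taken from any engine.

```python
def pawn_files(fen):
    """Return ({file: [ranks]} for white pawns, same for black) from a FEN string."""
    white, black = {}, {}
    for row_index, row in enumerate(fen.split()[0].split("/")):
        rank = 8 - row_index          # FEN lists rank 8 first
        file_index = 0                # 0 = a-file
        for ch in row:
            if ch.isdigit():
                file_index += int(ch)
                continue
            if ch == "P":
                white.setdefault(file_index, []).append(rank)
            elif ch == "p":
                black.setdefault(file_index, []).append(rank)
            file_index += 1
    return white, black

def doubled_count(own):
    # Assumed definition: every pawn beyond the first on a file counts as doubled.
    return sum(len(ranks) - 1 for ranks in own.values())

def unopposed_count(own, enemy, forward):
    # Assumed definition: no enemy pawn ahead on the same file.
    # forward is +1 for white (moving up the ranks) and -1 for black.
    count = 0
    for f, ranks in own.items():
        for r in ranks:
            if not any((er - r) * forward > 0 for er in enemy.get(f, [])):
                count += 1
    return count

white, black = pawn_files("6k1/1pp5/1p6/8/PPP5/8/8/6K1 w - - 0 1")
print("black doubled:", doubled_count(black))
print("white unopposed:", unopposed_count(white, black, +1))
print("black unopposed:", unopposed_count(black, white, -1))
```

With equal pawn counts, as in this diagram, the one doubled black pawn and the one unopposed white pawn show up together, which is the overlap the redundancy objection points at. Whether the two terms stay this tightly coupled in general is exactly what would need testing.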

As I wrote earlier:
syzygy wrote:And I should add that before trying this out, it must be determined whether the concept is not already implicit in other pawn evaluation terms. If it is, a new term would merely slow down parameter tuning by adding a redundant parameter.
Why not test a reasonable idea, and test many unreasonable ones? Where is the added value to that?
You really think developers are too busy testing unreasonable ideas to test reasonable ones?
Finally, I want also a very concrete answer. On the above diagram you see a position that is drawn. How many engines would see this? Does your engine see it? If not, is not this a failure of the engine? What would be wrong with introducing an eval term, like I have suggested, named intensity of interaction (or whatever you call it), giving bonus for the squares mutually controlled by pieces.
Picking out one position where an engine gets it wrong and devising an evaluation rule for that position may improve the engine, but usually does not. Of course for a primitive engine it is far more likely to work than for an engine like Stockfish.

Maybe someone who knows Stockfish's eval better than I do and who can decide whether your proposal would add something non-redundant to its current evaluation terms will try it out. This is just the normal course of business, nothing revolutionary, no new paradigm.

One thing should be clear: getting it right on any specific position is certainly not the criterion that the Stockfish team or any other developer of a top engine will apply, because it would be a very bad criterion.
In similar endgames, such a term, especially the squares mutually controlled by pieces of lower power in front of enemy passers, would be very useful and allow the engine to see the draw. What would be wrong with such a suggestion? Or maybe it is wrong, because I have suggested it, and not someone else, I do not know.
What I am objecting to is your "it is obvious, don't you see, it is very useful".
In any case, do not you see, at least here, the real added value?
Just more of the "don't you see"...

There can only be added value if it is not already there, explicitly or disguised. Why do you think it is not there? Did you study Stockfish's evaluation?
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Semi, but real

Post by Lyudmil Tsvetkov »

bob wrote:Advance both pawns 2 ranks, make it black to move. What now?
Obviously you are not reading what I am writing at all, because I specifically said mutual control of squares in front of enemy passers. When you advance both pawns 2 ranks, such control no longer exists for either pawn.
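
The claim about mutual control can be checked mechanically. The sketch below is a minimal attack map, not an engine term: it assumes white is the side with the queen, treats every white pawn as a passer (there are no black pawns in these positions), counts only black knights, bishops and rooks as "pieces of lower power", and reports squares in front of the pawns that at least two of them attack. All function names are hypothetical. The second FEN is my own transcription of "advance both pawns 2 ranks" (pawns on b7 and c6, black to move).

```python
KNIGHT_STEPS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
SLIDER_DIRS = {
    "b": [(1, 1), (1, -1), (-1, 1), (-1, -1)],
    "r": [(1, 0), (-1, 0), (0, 1), (0, -1)],
}

def parse_board(fen):
    """Map (file, rank) with 0-based indices to the piece letter found in the FEN."""
    board = {}
    for row_index, row in enumerate(fen.split()[0].split("/")):
        file_index = 0
        for ch in row:
            if ch.isdigit():
                file_index += int(ch)
            else:
                board[(file_index, 7 - row_index)] = ch
                file_index += 1
    return board

def attacks(board, square):
    """Squares attacked by the knight, bishop or rook standing on `square`."""
    piece = board[square].lower()
    f, r = square
    result = set()
    if piece == "n":
        for df, dr in KNIGHT_STEPS:
            if 0 <= f + df < 8 and 0 <= r + dr < 8:
                result.add((f + df, r + dr))
    for df, dr in SLIDER_DIRS.get(piece, []):
        nf, nr = f + df, r + dr
        while 0 <= nf < 8 and 0 <= nr < 8:
            result.add((nf, nr))
            if (nf, nr) in board:      # a slider's ray stops at the first occupied square
                break
            nf, nr = nf + df, nr + dr
    return result

def shared_squares_in_front(fen):
    """Squares ahead of white pawns attacked by two or more black N/B/R."""
    board = parse_board(fen)
    front = set()
    for (f, r), piece in board.items():
        if piece == "P":               # assumption: every white pawn is a passer
            front.update((f, rr) for rr in range(r + 1, 8))
    counts = {}
    for square, piece in board.items():
        if piece in "nbr":             # black pieces of lower power
            for target in attacks(board, square) & front:
                counts[target] = counts.get(target, 0) + 1
    return sorted("abcdefgh"[f] + str(r + 1) for (f, r), n in counts.items() if n >= 2)

print(shared_squares_in_front("8/3n2k1/5n2/1P6/2Pb4/1Q6/6K1/8 w - - 0 1"))
print(shared_squares_in_front("8/1P1n2k1/2P2n2/8/3b4/1Q6/6K1/8 b - - 0 1"))
```

Under these assumptions the original position yields b6 and c5, while the advanced-pawn version yields no doubly attacked square in front of either pawn. That supports the narrow factual point about control, and says nothing by itself about whether either position is actually drawn.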
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Semi, but real

Post by Lyudmil Tsvetkov »

bob wrote: You can't make the evaluation code arbitrarily complex, or you will get tactically crushed. Evaluating this as a draw is very difficult, because change the pawns just a bit and it is not so drawish any longer. You can't do the quick and dirty 3 pieces vs Q+PP is drawn, because there are lots of positions where it is not a draw. Screw up many of those and you lose games you should draw, because you choose to reach a position your eval says draw but it is wrong. A few of those and your Elo drops. This is dynamic stuff, it doesn't belong in the static evaluation.
Following this logic, everything is dynamic stuff.
So, we will never have an engine knowing this endgame is a draw.
And many engines will have lost half a point, because they rejected a winning line with a smaller score and instead chose this line, which looks very promising by score but proves to be drawn.
syzygy
Posts: 5569
Joined: Tue Feb 28, 2012 11:56 pm

Re: Semi, but real

Post by syzygy »

Lyudmil Tsvetkov wrote:So, we will never have an engine knowing this endgame is a draw.
Be so kind and formulate a concrete evaluation rule that detects that position as a draw and does not fail on other very similar positions that are not a draw.

Of course it is easy to add a test for that specific position... I hope we don't have to explain that such an approach would be going nowhere.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Semi, but real

Post by Lyudmil Tsvetkov »

syzygy wrote:
Lyudmil Tsvetkov wrote:So, we will never have an engine knowing this endgame is a draw.
Be so kind and formulate a concrete evaluation rule that detects that position as a draw and does not fail on other very similar positions that are not a draw.

Of course it is easy to add a test for that specific position... I hope we don't have to explain that such an approach would be going nowhere.
Please, look here: http://www.talkchess.com/forum/viewtopi ... 24&t=49357

Clamor in deserto.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Semi, but real

Post by Lyudmil Tsvetkov »

But, of course, if you would like a very specific rule, it can run something like that:
At least for endgames with an imbalance of queen versus rooks and minor pieces, with separate or straddled passers for the queen side, score intensity of interaction 10 times higher than usual. The usual scoring would be 1/100 the cumulative values of the pieces of lower power for the squares they mutually control. For such specific endings you will count necessarily the squares controlled in front of the enemy passers (on any rank in front of those passers). Thus, a rook and bishop controlling mutually a square in front of an enemy passer, the additional bonus for intensity of interaction would be 10x1/100x(4.5+3)pawns = 75cps. For 2 minors mutually controlling such a square the additional bonus would be some 60cps.

To be very specific, so as not to miss anything, please specify that intensity of interaction would be counted also for squares controlled by one of the pieces of lower power, where another piece of lower power is located. For the specific case of queen versus 2 rooks, x-ray intensity of interaction for squares controlled by one of the rooks on a rank where both rooks are located will also be counted, with the x-ray value being 1/2 that of the normal one, i.e. a rook controlling a square and another rook controlling the same square in front of an enemy passer on an x-ray would score 10x1/100x(4.5+2.25) = 67.5cps

In this way, instead of showing highly unrealistic scores, an engine would be able to evaluate properly at least the following positions and derivatives:

[D]6k1/3r1p2/3P2p1/5bP1/3Q1P2/8/8/6K1 w - - 0 1
Additional 75cps for the d7 square

[D]8/3n2k1/5n2/1P6/2Pb4/1Q6/6K1/8 w - - 0 1
Additional 120cps for the c5 and b6 squares, and maybe another 120 or so for the d7 and f6 squares

[D]8/6k1/2r1r3/8/P5P1/3Q4/6K1/8 w - - 0 1
Additional 135cps for the a6 and g6 squares

[D]5b2/3n1k2/6r1/8/2P5/4Q2P/2K2P2/8 w - - 0 1
Additional 60cps for c5 + 75cps for f6 + 75cps for h6 = 210cps

If you try to calculate the new scores, you will see they will be far more realistic (and users will be happy with that), in some cases pretty close to zero, instead of +2 or 3 full pawns advantage for the queen side, and in others a straight zero.
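
The per-square arithmetic of the proposed rule can be written out as a short sketch. It only restates the numbers given in the rule (rook 4.5 pawns, minor piece 3 pawns, a usual score of 1/100 of the summed values, a 10x multiplier for these endings, and half value for the x-raying rook). The function name and the value table are hypothetical, and nothing here has been tuned or tested in an engine.

```python
# Piece values in centipawns, taken from the arithmetic in the post
# (rook 4.5 pawns, minor piece 3 pawns). Names here are hypothetical.
PIECE_VALUE_CP = {"R": 450, "B": 300, "N": 300}

def interaction_bonus_cp(first, second, xray=False, multiplier=10):
    """Bonus for one square in front of an enemy passer controlled by two pieces."""
    # An x-raying second piece counts at half its value, per the rule as stated.
    second_value = PIECE_VALUE_CP[second] / 2 if xray else PIECE_VALUE_CP[second]
    # The "usual" score is 1/100 of the summed values; these endings use 10x that.
    return multiplier * (PIECE_VALUE_CP[first] + second_value) / 100

print(interaction_bonus_cp("R", "B"))             # rook + bishop on one square
print(interaction_bonus_cp("N", "B"))             # two minor pieces on one square
print(interaction_bonus_cp("R", "R", xray=True))  # rook + x-raying rook
```

Evaluating the expressions this way gives 75 cp for rook plus bishop, 60 cp for two minors, and 67.5 cp for the rook x-ray case 10 x 1/100 x (4.5 + 2.25), so two x-ray squares would add up to 135 cp.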

Are you happy with my suggestion now? Or something is again completely wrong?
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Semi, but real

Post by Lyudmil Tsvetkov »

And, of course, I forgot to mention that all of the posted positions are supposed to be drawn, and you would not expect an end user to be happy when a titanically strong engine shows here a convincing winning score for one of the sides.

Those are weaknesses in engines, and they are real.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: The phantom of closure

Post by Lyudmil Tsvetkov »

I just happened to play 2 winning games against Stockfish today, that somehow seem to illustrate the validity of some of the rules formulated by me, although many would look scornfully upon them. Usual motives are repeated in one of the games, while in the other one you can judge by yourself if the phantom of closure is real.

[pgn][PlyCount "61"]
[Event "Blitz 2m+2s"]
[Site "Sofia"]
[Date "2013.09.30"]
[White "Tsvetkov, Lyudmil"]
[Black "Stockfish 4 64 SSE4.2"]
[Result "1-0"]
[ECO "D00"]
[TimeControl "120+2"]
[Annotator "Tsvetkov,Lyudmil"]
[MLNrOfMoves "30"]
[MLFlags "100100"]

{1024MB, Dell XPS 4Cores} 1. e3 {[%emt 0:00:00]} 1... e6 {[%emt 0:00:08]} 2. f4
{[%emt 0:00:02]} 2... Nf6 {[%emt 0:00:06]} 3. Nf3 {[%emt 0:00:02]} 3... Be7
{[%emt 0:00:04]} 4. d4 {[%emt 0:00:02]} 4... d5 {[%emt 0:00:06]} 5. Bd3
{[%emt 0:00:02]} 5... c5 {[%emt 0:00:05]} 6. c3 {[%emt 0:00:03]} 6... O-O
{[%emt 0:00:04]} 7. O-O {[%emt 0:00:03]} 7... Bd7 {[%emt 0:00:10]} 8. Nbd2
{[%emt 0:00:06]} 8... Qc7 {[%emt 0:00:07]} 9. Ne5 {[%emt 0:00:02]} 9... a6
{[%emt 0:00:03]} 10. a4 {[%emt 0:00:37]} 10... Nc6 {[%emt 0:00:04]} 11. g4
{[%emt 0:00:04]} 11... g6 {[%emt 0:00:15]} 12. g5 {[%emt 0:00:06]} 12... Nxe5
{[%emt 0:00:04]} 13. fxe5 {[%emt 0:00:24]} 13... Nh5 {[%emt 0:00:04]} 14. Rf6
{[%emt 0:00:07]} 14... Nxf6 {[%emt 0:00:06]} 15. gxf6 {[%emt 0:00:01]} 15... Bd8
{[%emt 0:00:12]} 16. Nf3 {[%emt 0:00:14]} 16... Kh8 {[%emt 0:00:11]} 17. Qe1
{[%emt 0:00:09]} 17... c4 {[%emt 0:00:04]} 18. Bc2 {[%emt 0:00:04]} 18... Bxf6
{[%emt 0:00:02]} 19. exf6 {[%emt 0:00:05]} 19... Qd8 {[%emt 0:00:00]} 20. Qh4
{[%emt 0:00:09]} 20... Bc6 {[%emt 0:00:02]} 21. Ne5 {[%emt 0:00:52]} 21... Kg8
{[%emt 0:00:00]} 22. Bd2 {[%emt 0:00:13]} 22... Kh8 {[%emt 0:00:01]} 23. Rf1
{[%emt 0:00:06]} 23... Be8 {[%emt 0:00:01]} 24. Rf3 {[%emt 0:00:18]} 24... Kg8
{[%emt 0:00:00]} 25. Rh3 {[%emt 0:00:57]} 25... h5 {[%emt 0:00:04]} 26. Qg5
{[%emt 0:00:09]} 26... a5 {[%emt 0:00:00]} 27. Rxh5 {[%emt 0:00:12]} 27... Qxf6
{[%emt 0:00:01]} 28. Qxf6 {[%emt 0:00:02]} 28... gxh5 {[%emt 0:00:00]} 29. Qg5+
{[%emt 0:00:05]} 29... Kh8 {[%emt 0:00:00]} 30. Qh6+ {[%emt 0:00:03]} 30... Kg8
{[%emt 0:00:00]} 31. Qh7# {[%emt 0:00:02]} 1-0

[PlyCount "71"]
[Event "Blitz 2m+2s"]
[Site "Sofia"]
[Date "2013.09.30"]
[White "Tsvetkov, Lyudmil"]
[Black "Stockfish 4 64 SSE4.2"]
[Result "1-0"]
[ECO "C00"]
[TimeControl "120+2"]
[Annotator "Tsvetkov,Lyudmil"]
[MLNrOfMoves "35"]
[MLFlags "000100"]

{1024MB, Dell XPS 4Cores} 1. e4 {[%emt 0:00:00]} 1... e6 {[%emt 0:00:05]} 2. d3
{[%emt 0:00:02]} 2... d5 {[%emt 0:00:05]} 3. Nc3 {[%emt 0:00:03]} 3... d4
{[%emt 0:00:03]} 4. Nce2 {[%emt 0:00:03]} 4... c5 {[%emt 0:00:04]} 5. f4
{[%emt 0:00:02]} 5... Nf6 {[%emt 0:00:06]} 6. h3 {[%emt 0:00:02]} 6... Be7
{[%emt 0:00:08]} 7. g4 {[%emt 0:00:02]} 7... O-O {[%emt 0:00:05]} 8. Bg2
{[%emt 0:00:03]} 8... Nc6 {[%emt 0:00:05]} 9. Nf3 {[%emt 0:00:04]} 9... Nd7
{[%emt 0:00:10]} 10. O-O {[%emt 0:00:05]} 10... b5 {[%emt 0:00:06]} 11. a4
{[%emt 0:00:17]} 11... b4 {[%emt 0:00:00]} 12. b3 {[%emt 0:00:03]} 12... Bb7
{[%emt 0:00:05]} 13. Qe1 {[%emt 0:00:11]} 13... Qc7 {[%emt 0:00:11]} 14. h4
{[%emt 0:00:20]} 14... Rad8 {[%emt 0:00:08]} 15. Qg3 {[%emt 0:00:07]} 15... Bd6
{[%emt 0:00:11]} 16. Bd2 {[%emt 0:00:12]} 16... f6 {[%emt 0:00:00]} 17. Bh3
{[%emt 0:00:28]} 17... Rc8 {[%emt 0:00:09]} 18. Rf2 {[%emt 0:00:11]} 18... Rce8
{[%emt 0:00:06]} 19. Kh1 {[%emt 0:01:41]} 19... Qd8 {[%emt 0:00:06]} 20. Rg1
{[%emt 0:00:06]} 20... g6 {[%emt 0:00:14]} 21. Rh2 {[%emt 0:00:58]} 21... Qa8
{[%emt 0:00:11]} 22. Bg2 {[%emt 0:00:11]} 22... e5 {[%emt 0:00:04]} 23. f5
{[%emt 0:00:05]} 23... gxf5 {[%emt 0:00:05]} 24. gxf5+ {[%emt 0:00:05]} 24...
Kh8 {[%emt 0:00:04]} 25. Ng5 {[%emt 0:00:11]} 25... Re7 {[%emt 0:00:02]} 26. Qf3
{[%emt 0:00:09]} 26... fxg5 {[%emt 0:00:03]} 27. hxg5 {[%emt 0:00:01]} 27... c4
{[%emt 0:00:06]} 28. bxc4 {[%emt 0:00:20]} 28... b3 {[%emt 0:00:00]} 29. cxb3
{[%emt 0:00:04]} 29... Nb4 {[%emt 0:00:04]} 30. Rh6 {[%emt 0:00:40]} 30... Nxd3
{[%emt 0:00:10]} 31. Qxd3 {[%emt 0:00:11]} 31... Nc5 {[%emt 0:00:00]} 32. Qh3
{[%emt 0:00:25]} 32... Nxe4 {[%emt 0:00:00]} 33. Qh4 {[%emt 0:02:29]} 33...
Rxf5 {[%emt 0:00:00]} 34. g6 {[%emt 0:03:03]} 34... Nf2+ {[%emt 0:00:00]} 35.
Kh2 {[%emt 0:00:08]} 35... e4+ {[%emt 0:00:00]} 36. Bf4 {[%emt 0:00:30]} 1-0
[/pgn]

Some diagrams to illustrate the main points:

[D]r4rk1/1pqbbp1p/p3p1p1/2ppP1Pn/P2P4/2PBP3/1P1N3P/R1BQ1RK1 w - - 0 14
Do not you see the black f7 backward-fated pawn is real? It makes the king shelter extremely inflexible

[D]r4rk1/1pqbbp1p/p3pRp1/2ppP1Pn/P2P4/2PBP3/1P1N3P/R1BQ2K1 b - - 0 14
This time Stockfish has 2 capturing choices at f6, but that does not help in the least.

[D]r2b1rk1/1pqb1p1p/p3pPp1/2ppP3/P2P4/2PBP3/1P1N3P/R1BQ2K1 w - - 0 16
Do not you see the b2-f6 chain is tremendous and provides added value?

[D]r2qbrk1/1p3p2/p3pPp1/3pN1Qp/P1pP4/2P1P2R/1PBB3P/6K1 b - - 0 26
You would not believe Stockfish plays the black side of the game.

[D]r1bq1rk1/p2nbppp/2n1p3/2p5/Pp1pPPP1/1P1P1N1P/2P1N1B1/R1BQ1RK1 b - - 0 12
Stockfish wrongly closes the last remaining unclosed file on the queen side, with white having superiority on the other side (still not in terms of space advantage, but in terms of attacking opportunities, a couple of storming pawns already started moving, g4,f4). This is a major mistake, and one could say that, if black does not manage to somehow transfer its king to the queen side on time, black is already positionally lost. How many engines would think so? Do not you think the phantom of closure is real?

[D]q3rr1k/pb1n3p/2nb1p2/2p1pPN1/Pp1pP2P/1P1P2Q1/2PBN1BR/6RK b - - 0 25
Not only Stockfish is able to impress with glossy tricks.

[D]q6k/pb2r2p/3b2PR/5r2/P1PppB1Q/1P6/4NnBK/6R1 b - - 0 36
Not a very bad game to have been played by a human. Here I still do not see how the game will end, but Stockfish suddenly resigned.

What do you think about those games, do not you think the phantom of closure is real?

PS. I am sorry for posting again games with Stockfish, I am sure Marco and co will not be angry, but simply currently Stockfish is my main sparring partner.