Some handicap results and conclusions.

Nordlandia
Posts: 2821
Joined: Fri Sep 25, 2015 9:38 pm
Location: Sortland, Norway

Re: Some handicap results and conclusions.

Post by Nordlandia »

Knight for two pawns ->

https://lichess.org/g9HoAvC2

Next game is knight for three pawns.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Some handicap results and conclusions.

Post by Lyudmil Tsvetkov »

Lyudmil Tsvetkov wrote:
[d]r1bqkb1r/pp1p1ppp/2n2n2/4p3/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1

caveat: in the position above, as black has a large dynamic advantage, white should play very carefully, so there are just 3 reasonable first moves for white that not only hold, but also give white a substantial, probably winning, advantage:

- e3
- d3
- c4

testing with just these 3 moves, or leaving the engine with multiple cores to randomise itself, should score very favourably for white; using a wider book might very well give distorted results.

lkaufman wrote:
When I stopped my MC test, White was still under 49% after 1208 games at 16 ply. I tried a self-play game with Komodo at 4 min + 2 sec; White opened 1.c4, but after 1...d5 Black obviously has an improved version of the White side of the Morra Gambit, which is considered quite equal. Black kept a favorable score for a while, then gradually White pulled it to zero and drew. But one game means nothing, and I don't have time to play the hundred or more needed to determine anything. But I would choose Black if I had to play this position for a lot of money against another player of my level.

Lyudmil Tsvetkov wrote:
The Morra is actually quite bad for white.

no full equality there, maybe white could still hold, but no full equality.

as Jean rightly says, score only increases.
I don't know why you would continue to claim full equality, when top engines consistently show a white edge in almost all games.

maybe at some point, you will be eager to run a long, long test, when you understand that this is a real deficiency of top engines.

not having quite enough time, otherwise I would have posted some 100+ similar positions, with even more convincing evaluation failures.

lkaufman wrote:
Since there is plenty of data on the Morra gambit, let's talk about that (after 4.Nxc3) rather than your composed position. In the Hiarcs powerbook, mostly strong engine games, White's performance rating is one elo above the opponents' average rating. In my own database of GM games plus correspondence games since Rybka came out, White's performance was six elo below the opponents' average. Each sample is above a thousand games. So maybe it's fair to say that if forced to choose a side, you should choose Black, but a proper eval should be very close to zero, maybe something like -.03 or so, based on these results. I suspect that you would like it to be evaluated -.20 or so, but the data doesn't support this. Your composed position is obviously worse for White than the Black side of the Morra, so it seems clear to me that White would score below 50% in either GM or engine games.
I am interested in this not to argue, but because if you can actually convince me that Black is substantially better in the Morra I might try to modify Komodo accordingly. I don't mind being proven wrong if I can learn from it, but I need hard evidence, not just a couple of games.

I can do nothing to convince you, if you don't want to convince yourself.

I would have run a statistically significant number of games to come up with some meaningful results, but I simply don't have sufficient resources to spare on that.

you don't want to run a larger sample of TC games with Komodo, other users seemingly also don't do that, so I don't know what else could be done.

with this one, the Morra gambit:

[d]rnbqkbnr/pp1ppppp/8/8/4P3/2N5/PP3PPP/R1BQKBNR b KQkq - 0 4

I checked a bit with SF and, as expected, this greatly favours black: within some 10 moves or so the score rises to at least a 50-60cps black advantage, and in many cases to well over 100cps. I could also play some games, but that would not be statistically significant, so I simply don't know what more to do.

one way or another, such positions are extremely relevant to most of your handicap matches involving pawns, for example all the 2-pawn handicaps, but also N for pawns, where scores might actually be of quite a different size.
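
A rough way to put numbers on the statistics discussed above. This is only a sketch: it assumes the standard logistic Elo curve and a plain binomial error bound, neither of which is taken from the posts, and the function names are made up for illustration. It converts the quoted performance gaps (+1 and -6 Elo) into expected scores and estimates the uncertainty on a 49% result over 1208 games.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score for a side rated `elo_diff` above its opposition.

    Assumption: the conventional logistic Elo model with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def score_margin(score: float, games: int, z: float = 1.96) -> float:
    """Approximate 95% half-width of the interval around an observed score.

    Assumption: per-game variance is taken as score * (1 - score), which is
    the no-draws case. Draws only reduce the variance, so this is an upper
    bound on the true margin.
    """
    return z * math.sqrt(score * (1.0 - score) / games)


# Performance gaps quoted in the thread: +1 Elo (engine book), -6 Elo (GM/correspondence).
for diff in (1, -6):
    print(f"{diff:+d} Elo -> expected score {100 * expected_score(diff):.2f}%")

# The 16-ply test quoted in the thread: about 49% for White over 1208 games.
margin = score_margin(0.49, 1208)
print(f"49% over 1208 games: +/- {100 * margin:.1f} percentage points")
```

Under these assumptions the two database figures correspond to roughly 50.1% and 49.1% for White, and the margin on the 1208-game run is about 2.8 percentage points, so that run alone would not separate 49% from 50%.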
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Some handicap results and conclusions.

Post by lkaufman »

I spent a bit of time analyzing both the Morra and your composed position with both latest Komodo and latest Stockfish; here are some findings. First, it is pretty clear that Stockfish is in general more materialistic than Komodo; I'm not sure if this was the point you were trying to make, but it does seem to be so. Whether that is good or bad is harder to say.
In the Morra, Komodo gives an eval around -.10 or so with a reasonable search, and if I play the most popular moves for White against what Komodo likes the score seems to stay around there. With Stockfish it is more like -.25 or so in the Morra position but drifts a bit closer to zero if White plays the main theoretical moves. Considering all the results and this analysis, I must admit that the Morra is not fully equal, Black has a slim edge; I think Komodo's eval of around -.10 is about right. I would say that Black's edge in the Morra is smaller than White's edge in the opening position, which is around .15.
As for your composed position, the situation is similar; Stockfish likes White much more than Komodo does, and both engines tend to confirm their opinions with more analysis. So it's very hard to tell who is correct. Here too I suspect that you are right about the pawn-up side being the one with the better chances, but again I think the edge is much less than White's normal advantage in chess.
It is possible that the reason Stockfish tends to score well against Komodo is because it is more materialistic, but I don't yet have enough evidence to make this claim or to try to remedy it.
Komodo rules!
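
Both diagrams in this exchange set a one-pawn material edge against a lead in development. As a small illustrative sketch (the helper name and the 1/3/3/5/9 piece values are assumptions, not anything an engine in this thread actually uses), the material balance can be read straight from the FEN strings posted above:

```python
# Conventional textbook piece values; an assumption for illustration only.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}

COMPOSED = "r1bqkb1r/pp1p1ppp/2n2n2/4p3/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
MORRA = "rnbqkbnr/pp1ppppp/8/8/4P3/2N5/PP3PPP/R1BQKBNR b KQkq - 0 4"


def material(fen: str) -> dict:
    """Count pawns and total material per side from the board field of a FEN."""
    board = fen.split()[0]
    counts = {"white_pawns": 0, "black_pawns": 0, "white_total": 0, "black_total": 0}
    for ch in board:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # skips digits (empty squares), '/' separators and kings
        side = "white" if ch.isupper() else "black"
        counts[f"{side}_total"] += value
        if ch.lower() == "p":
            counts[f"{side}_pawns"] += 1
    return counts


for name, fen in (("composed", COMPOSED), ("Morra", MORRA)):
    print(name, material(fen))
```

This confirms that White is the pawn-up side in the composed position and Black is the pawn-up side in the Morra, with all other material equal.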
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Some handicap results and conclusions.

Post by Lyudmil Tsvetkov »

thanks Larry.

at least coming closer.

for my composed position, the advantage should be at least 50cps with perfect play, I don't know if winning.

for the Morra, black advantage should be at least 60cps, I don't know if winning.

at least, those are the scores I get from SF with multiple tries.

how this translates into a score in tests I don't know; the stronger side might be winning by some margin, but it might also be closer to 50%. I have no data.

what is clear is that the stronger side, the side with more material, has the advantage, and it is not good to defend all the time in order to draw.

it does not make any sense that the Morra is about equal; after all, white made 2 suboptimal/very much suboptimal moves out of 3: 2.d4, which is still acceptable, and especially 3.c3, the Morra itself, which already throws away any white advantage.

white has the first move, but making 2 bad moves in a row is hardly going to give you a nice position.

I would have liked to prove a forced black win in the Morra, but too much effort.
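
On the open question of how a centipawn score translates into a match score: nothing in the thread settles it, but a common rule of thumb is a logistic curve in which a four-pawn advantage corresponds to 10:1 odds. The sketch below uses that curve purely as an assumption; the scale constant and the function name are illustrative and are not taken from Komodo, Stockfish, or the posts above.

```python
def win_fraction(centipawns: float, pawns_per_decade: float = 4.0) -> float:
    """Rough expected score for the side that is `centipawns` ahead.

    Assumption: a logistic rule of thumb where `pawns_per_decade` pawns of
    advantage give 10:1 odds. Real engines and time controls will differ.
    """
    pawns = centipawns / 100.0
    return 1.0 / (1.0 + 10.0 ** (-pawns / pawns_per_decade))


# Evaluations mentioned in the thread: about 10 and 25 cp (Komodo and Stockfish
# on the Morra) and the 50-60 cp figures claimed for the stronger side.
for cp in (10, 25, 50, 60):
    print(f"{cp:>3} cp -> {100 * win_fraction(cp):.1f}% expected score")
```

Under this assumed curve, a 50-60cps edge would correspond to an expected score of roughly 57% to 59%, while a 10cps edge would be only about 51%. Only a large sample of games could tell the two claims apart.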
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Some handicap results and conclusions.

Post by lkaufman »

I think you have convinced me that Komodo has gone a bit too far in overvaluing factors like mobility vs. factors like material, and I think I've gained a couple elo by modifying parameters. It won't make a drastic difference in the positions we are talking about, but it should move the eval a couple centipawns in the direction you favor. This probably accounts for at least part of the negative results vs. latest Stockfish dev.
JJJ
Posts: 1346
Joined: Sat Apr 19, 2014 1:47 pm

Re: Some handicap results and conclusions.

Post by JJJ »

That is great news, Larry!
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Some handicap results and conclusions.

Post by Lyudmil Tsvetkov »

thanks for the info, Larry.

the main eval feature accounting for the discrepancy in score is the compact pawn structure of the side with more material/pawns.

but one can never introduce such a feature in one's engine, unless one systematically defines all the compact pawns: twice defended, twice aligned, defended aligned, long chain, etc., necessarily fully psqtised.

a single missing link might break everything down.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Some handicap results and conclusions.

Post by Lyudmil Tsvetkov »

lkaufman wrote:
This probably accounts for at least part of the negative results vs. latest Stockfish dev.
maybe, but certainly less than 5%.

SF has one major advantage over Komodo, at least as far as very short TC is concerned (1 min. bullet or so), and that is a much better king attack.

from the multiple games I have been seeing between both, Komodo loses 80% of the games or even more due to compromised king safety/excellent SF king attacks.

what the reasons behind it are is anyone's guess; my pick would be:

- better SF king attack evaluation
- not fully satisfactory (at least for its top level) Komodo king safety (I don't know what Komodo king safety includes, but it still likes to play with its g3 doubled pawn, part of the shelter, which SF exploits relentlessly)
- certainly, deeper SF search accounts for its better understanding of king safety
- much better SF understanding of connected pawn structures; these often help with getting the necessary positional advantage before starting the decisive attack

I guess with your latest versions you have made some progress in this respect, but I don't believe the overall relation has changed dramatically.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Some handicap results and conclusions.

Post by Lyudmil Tsvetkov »

I don't know why I am posting this, Larry might be angry, but the doubled g3 pawn is so typical of Komodo that I thought it was worth a post.

here is a game between Komodo 10.1 as white and SF:

[pgn][Event "OWNER-PC, Blitz 1m"]
[Site "Microsoft"]
[Date "2017.06.07"]
[Round "226"]
[White "Komodo 10.1 64-bit"]
[Black "Stockfish 8 64 POPCNT"]
[Result "0-1"]
[ECO "D35"]
[Annotator "0.27;0.29"]
[PlyCount "126"]
[TimeControl "60"]

{Intel(R) Core(TM) i7 CPU Q 740 @ 1.73GHz 1729 MHz W=15.1 plies; 1
191kN/s; CM8000.ctg B=17.3 plies; 1 234kN/s; CM8000.ctg} 1. d4 {B 0} Nf6 {B 0
} 2. c4 {B 0} e6 {B 0 Both last book move} 3. Nf3 {0.27/16 2} d5 {0.29/19 2} 4.
Nc3 {0.25/17 1} Be7 {0.40/18 2 (c5)} 5. cxd5 {0.39/17 4 (Bf4)} exd5 {0.12/17 1}
6. Bf4 {0.38/18 1 (e3)} O-O {0.26/18 2} 7. e3 {0.34/18 1 (h3)} Nh5 {0.12/18 1
(Bf5)} 8. Be5 {0.48/16 1} f6 {0.16/19 1} 9. Bg3 {0.37/17 1} c6 {0.24/17 0 (g6)}
10. Bd3 {0.42/17 1} g6 {0.14/17 1} 11. O-O {0.26/17 2 (e4)} Nxg3 {-0.18/17 1
(Bg4)} 12. hxg3 {0.38/18 1} f5 {-0.26/16 0 (Bg4)} 13. Rb1 {0.44/15 1} a6 {-0.
11/20 2 (Nd7)} 14. Qb3 {0.47/16 1} a5 {-0.20/18 1 (Nd7)} 15. Rbc1 {0.46/15 1
(Ne5)} Nd7 {-0.25/18 1} 16. Ne5 {0.31/16 2} Nf6 {-0.46/20 3 (Kg7)} 17. Na4 {0.
30/14 1} Bd6 {-0.43/17 0 (Kg7)} 18. Rfd1 {0.20/14 2} Re8 {-0.48/18 1 (Ne4)} 19.
a3 {0.09/16 3 (Nf3)} b5 {-0.72/18 1 (Rb8)} 20. Nc5 {0.28/16 1} Bxe5 {-0.55/19
0 (a4)} 21. dxe5 {0.28/18 1} Rxe5 {0.00/21 3 (a4)} 22. Qc3 {0.23/16 1} Re8 {-0.
27/22 1} 23. b4 {0.24/17 1 (Nb3)} a4 {-0.51/17 1 (Ne4)} 24. Be2 {0.19/14 1} Qe7
{-0.37/15 0 (Nd7)} 25. Rd4 {0.25/15 1 (Qd4)} h5 {-0.87/17 1 (Ra7)} 26. Bd1 {0.
21/16 1 (Rd2)} Rb8 {-1.01/18 1} 27. Qc2 {0.14/17 1 (Be2)} Rb6 {-0.98/19 1 (Nd7)
} 28. Bf3 {0.14/19 1 (Be2)} Ne4 {-1.04/19 1} 29. Qa2 {0.13/20 1 (Rdd1)} Qf7 {
-1.08/20 3 (Nxc5)} 30. Be2 {0.13/21 1 (Qc2)} Nd6 {-1.08/21 1 (Nxc5)} 31. Bf3 {
0.13/19 1 (Qb2)} Be6 {-1.18/20 0 (Ne4)} 32. Qe2 {0.13/15 1 (Be2)} Re7 {-1.32/
18 1 (Ne4)} 33. Qd1 {0.13/15 1 (Rc2)} Qf6 {-1.39/18 1 (Re8)} 34. Qd2 {0.13/18
1 (Be2)} Bf7 {-1.39/20 1 (Re8)} 35. Rd1 {0.12/15 1 (Be2)} g5 {-1.33/22 1 (Be6)}
36. Be2 {0.05/15 1} Bg6 {-1.32/22 3 (Ne4)} 37. Rc1 {0.00/15 1} Rb8 {-1.52/19 1
(Ne4)} 38. Na6 {0.00/17 1 (Qb2)} Rc8 {-1.58/21 0 (Rb6)} 39. Nc5 {-0.37/16 2
(Qc3)} Rg7 {-1.53/19 1 (Rce8)} 40. Bd3 {0.00/17 1 (Qc3)} Rh7 {-1.70/16 1 (Re7)}
41. Qd1 {-0.21/16 1 (Be2)} g4 {-1.92/16 0 (Rf8)} 42. Rf4 {-0.27/15 0} Qe7 {-1.
45/17 1 (Rd8)} 43. Qc2 {-0.65/14 0 (Bb1)} Rf8 {-1.91/16 1 (h4)} 44. Qc3 {-0.60/
14 1} h4 {-1.87/17 1} 45. gxh4 {-0.75/15 1 (Re1)} Qxh4 {-2.73/16 0 (Rxh4)} 46.
Kf1 {-0.50/15 0} d4 {-2.92/15 0 (Qe7)} 47. Qd2 {-2.15/14 1 (Qe1)} dxe3 {-3.65/
16 0} 48. fxe3 {-2.08/15 0} Re8 {-3.62/17 2} 49. Rd1 {-2.23/13 1 (Ke2)} Qg5 {
-4.81/15 0 (Qh1+)} 50. g3 {-2.55/12 0 (Bb1)} Rh3 {-4.23/16 1} 51. Bc2 {-3.10/
13 0 (Be2)} Nc4 {-5.89/16 0} 52. Rxc4 {-3.28/14 0} bxc4 {-5.77/14 0} 53. Bxa4 {
-3.41/14 0} Qh6 {-6.33/14 0 (Rh1+)} 54. Bxc6 {-3.24/12 0} Rxg3 {-6.38/12 0} 55.
Re1 {-3.89/12 1 (Bg2)} Qh3+ {-7.35/14 0 (Bf7)} 56. Bg2 {-4.23/11 0} Rf3+ {-6.
69/13 0} 57. Kg1 {-4.68/11 0} Qg3 {-8.29/16 1} 58. Rf1 {-4.83/12 0 (Ra1)} Rexe3
{-9.46/14 0 (Rxf1+)} 59. Kh1 {-5.79/11 0 (Qd1)} Rxf1+ {-10.59/14 0} 60. Bxf1 {
-5.79/5 0} Qf4 {-10.84/14 0 (Re1)} 61. Qd1 {-6.93/12 0} Bf7 {-13.08/15 0} 62.
b5 {-7.84/12 0} Kg7 {-20.71/15 0 (Rxa3)} 63. Bg2 {-10.83/11 0} Qf2 {-24.45/13 0
} 0-1

[/pgn]

[d]rnbq1rk1/pp2b2p/2p3p1/3p1p2/3P4/2NBPNP1/PP3PP1/R2Q1RK1 w - - 0 13

SF 20cps black advantage, Komodo 40cps white edge.

quite probably, white is already lost.

see how SF fixes the g3 doubled shelter weakness, and then uses h5-h4 to open lines for a decisive attack.
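
the weakness in the diagram above can also be spotted mechanically. below is a minimal Python sketch, not any engine's actual king safety code: it assumes the "shelter" is simply the king's file plus the two adjacent files, and the function names are hypothetical. it reports shelter files with doubled own pawns and shelter files with no own pawn at all.

```python
def parse_board(fen):
    """Return {(file, rank): piece} with file 0..7 (a..h) and rank 1..8."""
    board = {}
    for row_index, row in enumerate(fen.split()[0].split("/")):
        file = 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)          # run of empty squares
            else:
                board[(file, 8 - row_index)] = ch
                file += 1
    return board


def shelter_report(fen, white=True):
    """List doubled and empty files among the king's file and its neighbours."""
    board = parse_board(fen)
    king, pawn = ("K", "P") if white else ("k", "p")
    king_file = next(f for (f, _), piece in board.items() if piece == king)
    report = {"doubled": [], "empty": []}
    for f in range(max(0, king_file - 1), min(7, king_file + 1) + 1):
        count = sum(1 for (pf, _), piece in board.items()
                    if pf == f and piece == pawn)
        if count >= 2:
            report["doubled"].append("abcdefgh"[f])
        elif count == 0:
            report["empty"].append("abcdefgh"[f])
    return report


fen = "rnbq1rk1/pp2b2p/2p3p1/3p1p2/3P4/2NBPNP1/PP3PP1/R2Q1RK1 w - - 0 13"
print(shelter_report(fen))               # {'doubled': ['g'], 'empty': ['h']}
print(shelter_report(fen, white=False))  # {'doubled': [], 'empty': []}
```

for white this flags the doubled g-pawns and the missing h-pawn, which is exactly the file black later opens with h5-h4.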

I guess tuning engines only against their predecessors is a bit of a shaky concept, as Komodo playing against its predecessor might never effectively exploit the g3 weakness, no matter how statistically significant the number of games.

but then, the doubled g3 shelter pawn is maybe Komodo's pet flaw. :)
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Some handicap results and conclusions.

Post by Lyudmil Tsvetkov »

another one.

this probably has to do with depth and move ordering/general search.

[pgn][Event "OWNER-PC, Blitz 1m"]
[Site "Microsoft"]
[Date "2017.06.12"]
[Round "3"]
[White "Stockfish 8 64 POPCNT"]
[Black "Komodo 10.1 64-bit"]
[Result "1-0"]
[ECO "C01"]
[Annotator "0.34;0.09"]
[PlyCount "99"]
[TimeControl "60"]

{Intel(R) Core(TM) i7 CPU Q 740 @ 1.73GHz 1729 MHz W=17.7 plies; 1
189kN/s; Empty.ctg B=16.9 plies; 1 296kN/s; Empty.ctg} 1. e4 {0.34/19 6} e6 {
0.09/16 2} 2. d4 {0.32/19 1} d5 {0.20/16 1} 3. Nc3 {0.05/20 3} Bb4 {0.17/17 2}
4. exd5 {0.20/19 1} exd5 {0.20/17 1} 5. Bd3 {0.12/19 1} Nc6 {0.21/17 1 (Nf6)}
6. Nf3 {0.24/19 2} Nf6 {0.25/18 3} 7. O-O {0.27/16 0 (a3)} a6 {0.16/17 1 (0-0)}
8. Re1+ {0.43/18 1 (a3)} Be7 {0.32/16 1} 9. a3 {0.35/20 2 (Bf4)} O-O {0.26/16 0
} 10. h3 {0.31/17 0} Bd6 {0.21/18 1 (h6)} 11. Bg5 {0.43/18 1} Be6 {0.24/17 1}
12. Ne2 {0.46/17 0 (Qd2)} h6 {0.07/14 0} 13. Bh4 {0.78/19 4 (Bf4)} g5 {0.02/15
1 (Rb8)} 14. Nxg5 {1.20/17 1 (Bg3)} hxg5 {0.00/16 1} 15. Bxg5 {0.66/16 0} Re8 {
0.06/17 2} 16. c3 {0.87/18 1} Bd7 {0.17/16 2} 17. Ng3 {0.83/21 1} Rxe1+ {0.32/
16 1} 18. Qxe1 {0.91/20 0} Qf8 {0.33/17 1 (Qe7)} 19. Qe3 {0.41/19 2 (Bxf6)} Ne8
{0.33/16 1} 20. Nh5 {0.67/19 1 (Qf3)} f5 {0.28/16 1} 21. Bf4 {0.71/19 3 (Qf3)}
Qf7 {0.21/16 1 (Be7)} 22. Qg3+ {0.68/17 1} Ng7 {0.31/18 1} 23. Be2 {1.03/19 2
(Bxd6)} Re8 {0.00/18 1} 24. Bxd6 {0.80/22 2 (Bf3)} Rxe2 {-0.18/16 0} 25. Nf4 {
0.97/19 0} Rxb2 {0.39/16 2 (Rc2)} 26. Bxc7 {1.47/16 1} Rb3 {0.32/16 1 (Be8)}
27. Re1 {1.76/16 0} Nxd4 {0.51/17 1 (a5)} 28. Be5 {2.42/20 2} Nc6 {0.38/17 0}
29. Qg5 {2.20/18 0 (Nxd5)} Rxa3 {0.42/15 1} 30. Qh6 {2.54/20 1 (Bf6)} Nxe5 {0.
37/16 1 (Ra1)} 31. Rxe5 {3.37/16 0} Ne8 {0.43/18 1 (Bc6)} 32. Re3 {4.11/16 0
(Nxd5)} Qg7 {3.13/14 1} 33. Rg3 {4.53/15 0} Ra1+ {3.63/16 1} 34. Kh2 {4.53/1 0}
Qxg3+ {3.97/18 2} 35. fxg3 {4.88/16 0} Re1 {4.01/18 1} 36. Nxd5 {4.96/16 0} a5
{4.11/18 2 (Be6)} 37. Qg6+ {5.01/20 1} Kf8 {4.09/18 0} 38. Qh7 {5.10/19 0} Ng7
{4.36/19 0} 39. Qh8+ {5.36/19 0} Kf7 {4.36/5 0} 40. Qd8 {5.55/19 0} Bc6 {4.30/
22 1} 41. Qf6+ {5.58/17 0} Kg8 {4.20/23 0 (Ke8)} 42. Ne7+ {5.90/16 0} Rxe7 {4.
13/24 1} 43. Qxe7 {6.35/18 0} a4 {4.59/23 1} 44. h4 {6.44/16 0 (Kg1)} Kh7 {4.
58/22 1} 45. Kh3 {6.91/18 1 (Qf7)} Kg8 {4.84/17 1 (Kh6)} 46. g4 {7.88/14 0
(Qb4)} fxg4+ {6.26/15 1} 47. Kxg4 {8.36/15 0} a3 {6.59/16 1 (Be8)} 48. Qxa3 {
8.88/15 0} Kf7 {6.67/16 1} 49. h5 {9.44/14 0 (Qa2+)} Ne8 {6.85/13 0 (Ke6)} 50.
Kg5 {9.87/14 0} 1-0

[/pgn]

[d]r2q1rk1/1pp2p2/p1nbbn2/3p2B1/3P4/P2B3P/1PP1NPP1/R2QR1K1 b - - 0 15

SF 70cps white edge, Komodo 0.0.

this is quite indicative.

most SF wins are like this.
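
the eval gap between the two engines can be pulled straight out of the game comments. here is a minimal Python sketch under a few assumptions: each comment starts with "eval/depth" as in the games above, both engines report the eval from white's point of view, and the excerpt starts on a white move with every move annotated. the function names are hypothetical. line breaks are turned into spaces first, because the posted PGN wraps in the middle of some numbers.

```python
import re

# sign, whole pawns, two decimals, depth; whitespace is tolerated where the
# posted PGN was wrapped, e.g. "{-0.\n11/20 2 (Nd7)}"
EVAL = re.compile(r"\{\s*(-?)(\d+)\.\s*(\d{2})\s*/\s*(\d+)")


def evals_cp(pgn_text):
    """Return [(centipawns, depth), ...] for every annotated move, in order."""
    text = pgn_text.replace("\n", " ")
    result = []
    for sign, whole, frac, depth in EVAL.findall(text):
        cp = int(whole) * 100 + int(frac)
        result.append((-cp if sign else cp, int(depth)))
    return result


def eval_gaps(pgn_text):
    """White-move eval minus black-move eval for each full move, in centipawns."""
    scores = [cp for cp, _ in evals_cp(pgn_text)]
    return [w - b for w, b in zip(scores[0::2], scores[1::2])]


# moves 13-15 of the second game, with the line breaks as posted
excerpt = """13. Bh4 {0.78/19 4 (Bf4)} g5 {0.02/15
1 (Rb8)} 14. Nxg5 {1.20/17 1 (Bg3)} hxg5 {0.00/16 1} 15. Bxg5 {0.66/16 0} Re8 {
0.06/17 2}"""

print(eval_gaps(excerpt))  # [76, 120, 60]
```

in this game the white-move evals are SF's and the black-move evals are Komodo's, so a steadily positive gap means SF sees more for white than Komodo does.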