opposite castling armageddon

Michel · Post by **Michel** » Sun Jul 28, 2019 3:27 pm

Laskos wrote: ↑Sun Jul 28, 2019 2:48 pm
Michel wrote: ↑Sun Jul 28, 2019 2:22 pm Well I was assuming drawishness between identical or at least very similar engines. This can be done with the Texel tuning method. I think it would not be particularly difficult.
Even in these very particular conditions, how many parameters are there in the Texel tuning method, aside from the eval? Another goal of that method is to map the eval to the score for the whole game via a logistic. And it fails to do the same mapping for openings and endgames, if my experiments were not that far off. And as soon as you deviate from a "Texel engine at a particular time control and hardware", those too few parameters have to be changed. It is good for self-training and self-testing, but one can hardly generalize.

In the Texel tuning method, it is the eval that is tuned.

In an A/B engine the eval is (usually) a linear combination of features. In the Texel tuning method the coefficients are tuned so that, over a given large body of positions, the resulting eval matches as well as possible the observed outcome of those positions (win, draw, loss). The match is made via a logistic function, and there are a number of variants, depending on the loss function, the scoring of draws, etc.

Instead one can also ask for a "draw eval": a linear combination of features which matches as well as possible the draw/non-draw outcomes on the training positions. This can be tuned in the same way. Features that are likely relevant are game stage, king safety, passed pawns...
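
A minimal sketch of what tuning such a "draw eval" could look like, in Python. Everything here is an assumption for illustration: the two-element feature vector (a bias term plus a made-up game-stage value), the toy data, the cross-entropy loss, and the use of plain gradient descent (Texel's original method uses a local search on a squared-error loss; this is just one of the variants mentioned above).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def draw_probability(weights, feats):
    """Map the linear 'draw eval' to a draw probability via a logistic."""
    return sigmoid(sum(w * f for w, f in zip(weights, feats)))

def tune_draw_eval(positions, outcomes, epochs=500, lr=0.1):
    """Fit weights so that draw_probability matches draw/non-draw outcomes.

    positions: list of feature vectors (hypothetical features)
    outcomes:  list of 1 (game was drawn) or 0 (game was decisive)
    """
    weights = [0.0] * len(positions[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for feats, is_draw in zip(positions, outcomes):
            pred = draw_probability(weights, feats)
            # Gradient of the cross-entropy loss for one position
            for i, f in enumerate(feats):
                grad[i] += (pred - is_draw) * f
        weights = [w - lr * g / len(positions) for w, g in zip(weights, grad)]
    return weights

# Toy data (invented): features are [bias, game_stage], with
# game_stage 0.0 = opening and 1.0 = endgame.
positions = [[1.0, 0.0]] * 10 + [[1.0, 1.0]] * 10
# 2 of 10 opening positions drawn, 8 of 10 endgame positions drawn
outcomes = [1] * 2 + [0] * 8 + [1] * 8 + [0] * 2

weights = tune_draw_eval(positions, outcomes)
p_opening = draw_probability(weights, [1.0, 0.0])
p_endgame = draw_probability(weights, [1.0, 1.0])
print(p_opening, p_endgame)
```

With real data one would replace the toy features by the engine's own eval features and use a large set of positions labelled with the game result.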

lkaufman · Post by **lkaufman** » Sun Jul 28, 2019 5:45 pm

pohl4711 wrote: ↑Sun Jul 28, 2019 10:04 am
lkaufman wrote: ↑Sun Jul 28, 2019 5:59 am It seems that the results and opinions posted here generally indicate that my proposal does not give White enough advantage to fully offset draw odds, at least with engine play. Therefore I'll propose instead the variant I mentioned in which White has normal castling options but Black has only the queenside castling option, plus draw odds. An opening sequence to produce this would be 1.Na3 Nh6 2.Nb5 Rg8 3.Nc3 Rh8 4.Nb1 Ng8. This is obviously more favorable for White than the initial proposal, but probably not a lot more favorable, as White would usually choose to castle short anyway. Lc0 analysis has White's percentage expectancy rising a couple of points, to the 70-72% range, which is about right for draw odds. A short shootout I ran this way had White scoring 7 wins to 3 draws (= losses). It has the extra merit of allowing for both same and opposite castling, like normal chess, although opposite castling will be much more common. Perhaps it needs a different name though. "Black no short castling Armageddon" is a bit long!
I did an experiment with normal castling options for White and no castling allowed for Black. That seems quite OK. In my tests, asmFish scored 58.4% vs. Komodo 10.4 (after 205 games played).

Compared to the results below, this is not bad (but not very good either). More experiments are needed. I will continue working on this interesting idea. Another option could be a one-pawn material advantage for White (1. Na3 a6 2. Nb1 a5 3. Na3 a4 4. Nb1 a3 5. Nxa3 Na6 6. Nb1 Nb8), but when doing this, all openings have to be checked by an engine first (when the opening line moves the white queen to a4, for example, the queen can be captured by the black rook from a8...).

Test results:
(asmFish 170426 vs. Komodo 10.4, 5'+3'' time control, single core, no ponder, no endgame bases, LittleBlitzerGUI, 1000 games per test run(!) except Noomen Gambit-lines (only 246 positions, so 492 games were played) and Noomen TCEC Superfinal (only 100 positions, so 200 games were played))

Stockfish Framework standard 8 move openings: Score: 60.3% - 39.7%, draws: 63.4%
FEOBOS v20 contempt 5 top 500 openings: Score: 58.7% - 41.3%, draws: 64.1%
HERT 500 set: Score: 60.6% - 39.4%, draws: 60.4%
Noomen Gambit-Lines: Score: 59.1% - 40.9%, draws: 59.3%
4 GM-moves short book: Score: 60.5% - 39.5%, draws: 57.1%
Noomen TCEC Superfinal (Season 9+10): Score: 62.5% - 37.5%, draws: 50.0%
SALC V5 half-closed: Score: 61.6% - 38.4%, draws: 49.2%
SALC V5 full-closed 500 positions: Score: 66.5% - 33.5%, draws: 47.7%
Drawkiller (normal set): Score: 65.3% - 34.7%, draws: 33.5%
Drawkiller (tournament set): Score: 65.3% - 34.7%, draws: 33.5%
(no mistake by me: the results of Drawkiller normal and tournament were exactly the same after 1000 played games!)
Drawkiller (small 500 positions set): Score: 66.4% - 33.6%, draws: 30.5%

To tell whether it's roughly fair, I think you need to replay the asmfish vs komodo test with colors reversed and compare the results.
Regarding removing one White pawn with White getting draw odds, I did some tests on that idea recently. Removing the edge pawns is way too small a handicap; with a2 removed White is not much worse off than Black is in normal chess. Removing f2 or g2 is too clearly winning for Black. But removing b2,c2,d2, or e2 seems to be just about right for roughly even chances with White getting draw odds. But I think this solution is much less appealing than the Black-cannot-castle kingside solution; the castling solution feels much more like normal chess.

pohl4711 · Post by **pohl4711** » Mon Jul 29, 2019 5:39 am

lkaufman wrote: ↑Sun Jul 28, 2019 5:45 pm
To tell whether it's roughly fair, I think you need to replay the asmfish vs komodo test with colors reversed and compare the results.
Regarding removing one White pawn with White getting draw odds, I did some tests on that idea recently. Removing the edge pawns is way too small a handicap; with a2 removed White is not much worse off than Black is in normal chess. Removing f2 or g2 is too clearly winning for Black. But removing b2,c2,d2, or e2 seems to be just about right for roughly even chances with White getting draw odds. But I think this solution is much less appealing than the Black-cannot-castle kingside solution; the castling solution feels much more like normal chess.

In my tests of asmFish vs. Komodo 10.4, asmFish of course plays White in half of the games and Black in the other half.
In a 1000-game test run, asmFish plays White 500 times and Black 500 times. What else? (I use LittleBlitzerGUI, which always switches the engines' colors after each game in a two-engine head-to-head test run.)

A first pre-alpha test with the a7-pawn removed (combined with the SF Framework 8-move openings) looks very promising! Much better Elo spread than the "forbidden castling" solution. Let's wait and see... Work continues...
And a deleted a7-pawn of course feels much more like normal chess than Black not being allowed to castle short, or not being allowed to castle at all, because even if Black may castle long, engines often prefer not to castle.

lkaufman · Post by **lkaufman** » Mon Jul 29, 2019 5:24 pm

pohl4711 wrote: ↑Mon Jul 29, 2019 5:39 am
In my tests of asmFish vs. Komodo 10.4, asmFish of course plays White in half of the games and Black in the other half.
In a 1000-game test run, asmFish plays White 500 times and Black 500 times. What else? (I use LittleBlitzerGUI, which always switches the engines' colors after each game in a two-engine head-to-head test run.)

A first pre-alpha test with the a7-pawn removed (combined with the SF Framework 8-move openings) looks very promising! Much better Elo spread than the "forbidden castling" solution. Let's wait and see... Work continues...
And a deleted a7-pawn of course feels much more like normal chess than Black not being allowed to castle short, or not being allowed to castle at all, because even if Black may castle long, engines often prefer not to castle.

Maybe I'm missing something, but I don't see where you give the White vs. Black score for games played with my proposed castling and draw rules. Any short openings are OK, as long as both sides can still castle under normal rules at the end of the book; otherwise the castling condition makes little sense. But of course in reality opening play would be quite different with these rules.
Regarding removing a7, it feels to me too clearly winning for White, but I haven't tested this, so I could be wrong. Having White is a significant advantage in chess, and removing any Black pawn should push this advantage well past the winning margin, I would think.

pohl4711 · Post by **pohl4711** » Tue Jul 30, 2019 7:22 am

lkaufman wrote: ↑Mon Jul 29, 2019 5:24 pm
Regarding removing a7, it feels to me too clearly winning for White, but I haven't tested this, so I could be wrong. Having White is a significant advantage in chess, and removing any Black pawn should push this advantage well past the winning margin, I would think.

Yes, it seems hard to believe. But I am doing a test run right now, and after more than 400 games asmFish scores more than 71% versus Komodo 10.4 (in my so-called PawnPlus-Armageddon), which is a really impressive Elo spread. Much better than I expected (under my testing conditions, the spread of asmFish vs. Komodo in non-Armageddon play with normal openings is around 60%-40%).
Note that if White had too clear an advantage, there would be a lot of 1:1 pair results (one opening won by White both times), so that asmFish vs. Komodo would score 1:1 (50%) on that opening, because each engine plays White once. These 1:1 (= 2x White wins) openings push the score of asmFish and Komodo towards 50%-50%. So it is clear that the missing a7-pawn is not too clear an advantage for White. It seems to be perfect for Armageddon engine play; otherwise a score of more than 71% for asmFish vs. Komodo is impossible to explain.
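
The pair argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example pair counts are invented, and every game is taken to be already Armageddon-rescored (so each game is worth 1 or 0 points).

```python
def engine_score(pairs):
    """Score fraction of engine A over a list of opening pairs.

    Each pair is (points_A_as_white, points_A_as_black), where the
    points are what engine A earned in that game (1 or 0).
    """
    total = sum(as_white + as_black for as_white, as_black in pairs)
    return total / (2 * len(pairs))

def max_split_fraction(score):
    """Upper bound on the fraction of 1:1 pairs for a score >= 0.5.

    A split pair gives A half the points; any other pair gives A at
    most all of them, so score <= f * 0.5 + (1 - f) * 1.0.
    """
    return 2 * (1 - score)

# If White wins every game, each pair is split 1:1.
white_always_wins = [(1, 0)] * 100
print(engine_score(white_always_wins))  # 0.5

# Hypothetical mix: 40 split pairs, 60 pairs where A wins both games.
mixed = [(1, 0)] * 40 + [(1, 1)] * 60
print(engine_score(mixed))  # 0.8

# With a 71% score, only a limited share of pairs can be 1:1 splits.
print(max_split_fraction(0.71))
```

So a 71% score means that at most about 58% of the opening pairs can have been split 1:1, whichever color won them.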
The White score in the PawnPlus-Armageddon test run is 59.8% at the moment (with Armageddon rescoring); without Armageddon rescoring, the White score is 78.7% (draws 37.7%). Direct White wins (same value with or without Armageddon rescoring): 59.8% of all games.
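
As a quick check of how these numbers relate, here is a small Python sketch (the function name is made up for illustration). It assumes classical scoring counts a draw as half a point for White, while Armageddon rescoring counts a draw as a Black win.

```python
def white_scores(white_win_pct, draw_pct):
    """Return (classical, armageddon) White score percentages."""
    classical = white_win_pct + draw_pct / 2   # draw = half a point
    armageddon = white_win_pct                 # draw = Black win
    return classical, armageddon

classical, armageddon = white_scores(59.8, 37.7)
black_win_pct = 100 - 59.8 - 37.7

print(classical)      # about 78.65, i.e. the reported 78.7% up to rounding
print(armageddon)     # 59.8
print(black_win_pct)  # about 2.5
```

Under these assumptions the figures are consistent with each other, and direct Black wins make up only about 2.5% of the games.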

And I am using openings checked by Komodo with the pgnscanner, with an eval interval of [+0.50;+1.60], which is a very wide range, because this is a first, experimental test run. There is the opportunity to use smaller eval intervals for better results, perhaps [+0.70;+1.30] or something like that, or a smaller advantage for White (perhaps [+0.50;+1.20] or so...).
More test runs are needed... but PawnPlus-Armageddon (missing a7-pawn) seems very, very promising! And, by the way, it is much more natural chess than forbidding castling for Black and/or White. In PawnPlus-Armageddon, all castling is allowed.
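
The eval-interval selection described above amounts to a simple filter. The Python sketch below is not the pgnscanner itself; the function name, the opening labels and the eval values are all invented for illustration, and evals are assumed to be in pawns from White's point of view.

```python
def filter_openings(evaluated, lo=0.50, hi=1.60):
    """Keep openings whose engine eval lies inside [lo, hi]."""
    return [name for name, ev in evaluated if lo <= ev <= hi]

# Made-up evals, for illustration only
sample = [
    ("line A", 0.35),
    ("line B", 0.55),
    ("line C", 1.10),
    ("line D", 1.45),
    ("line E", 1.85),
]

wide = filter_openings(sample)                # ['line B', 'line C', 'line D']
narrow = filter_openings(sample, 0.70, 1.30)  # ['line C']
print(wide, narrow)
```

A narrower interval keeps fewer openings, so there is a trade-off between how tightly White's advantage is controlled and how much opening variety remains.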

mclane · Post by **mclane** » Tue Jul 30, 2019 7:54 am

The programs play draws because they are similar and have no idea what chess is and what the target of chess is: to mate the opponent.
Instead they shuffle the pieces around, calculate to depth 40, and if no material can be won they make a draw.

They also have no idea why one side should castle or why opposite-side castling could be the beginning of an attack.

We don't need to change the rules of the game or score wins and draws differently.
All that has to be done is to teach the engines chess.
Obviously today the programmers optimise search depth.

This has IMO not much to do with chess.
This concept will lose against the neural nets sooner or later. Depending on the hardware, sooner.

lkaufman · Post by **lkaufman** » Tue Jul 30, 2019 4:13 pm

mclane wrote: ↑Tue Jul 30, 2019 7:54 am The programs play draws because they are similar and have no idea what chess is and what the target of chess is: to mate the opponent.
Instead they shuffle the pieces around, calculate to depth 40, and if no material can be won they make a draw.

They also have no idea why one side should castle or why opposite-side castling could be the beginning of an attack.

We don't need to change the rules of the game or score wins and draws differently.
All that has to be done is to teach the engines chess.
Obviously today the programmers optimise search depth.

This has IMO not much to do with chess.
This concept will lose against the neural nets sooner or later. Depending on the hardware, sooner.

The flaw in this argument is that when NN engines play each other the percentage of draws is even higher than normal, almost ridiculously high. Draws are due to the rules of the game, not to the way the engines play it.

lkaufman · Post by **lkaufman** » Tue Jul 30, 2019 4:19 pm

pohl4711 wrote: ↑Tue Jul 30, 2019 7:22 am
lkaufman wrote: ↑Mon Jul 29, 2019 5:24 pm
Regarding removing a7, it feels to me too clearly winning for White, but I haven't tested this, so I could be wrong. Having White is a significant advantage in chess, and removing any Black pawn should push this advantage well past the winning margin, I would think.
Yes, it seems hard to believe. But I am doing a test run right now, and after more than 400 games asmFish scores more than 71% versus Komodo 10.4 (in my so-called PawnPlus-Armageddon), which is a really impressive Elo spread. Much better than I expected (under my testing conditions, the spread of asmFish vs. Komodo in non-Armageddon play with normal openings is around 60%-40%).
Note that if White had too clear an advantage, there would be a lot of 1:1 pair results (one opening won by White both times), so that asmFish vs. Komodo would score 1:1 (50%) on that opening, because each engine plays White once. These 1:1 (= 2x White wins) openings push the score of asmFish and Komodo towards 50%-50%. So it is clear that the missing a7-pawn is not too clear an advantage for White. It seems to be perfect for Armageddon engine play; otherwise a score of more than 71% for asmFish vs. Komodo is impossible to explain.
The White score in the PawnPlus-Armageddon test run is 59.8% at the moment (with Armageddon rescoring); without Armageddon rescoring, the White score is 78.7% (draws 37.7%). Direct White wins (same value with or without Armageddon rescoring): 59.8% of all games.

And I am using openings checked by Komodo with the pgnscanner, with an eval interval of [+0.50;+1.60], which is a very wide range, because this is a first, experimental test run. There is the opportunity to use smaller eval intervals for better results, perhaps [+0.70;+1.30] or something like that, or a smaller advantage for White (perhaps [+0.50;+1.20] or so...).
More test runs are needed... but PawnPlus-Armageddon (missing a7-pawn) seems very, very promising! And, by the way, it is much more natural chess than forbidding castling for Black and/or White. In PawnPlus-Armageddon, all castling is allowed.

Your results with no a7 are about what I expected. I consider a 60 to 40% score for White to be way too lopsided for use as Armageddon. I think if you like the pawnplus armageddon idea, you will find that removing b2,c2,d2, or e2 (with White obviously getting the draw odds) will give results much closer to 50-50%. Also it gives 4x as much variety. Although the edge pawn is clearly worth less than these other pawns, having the first move is more important.
What was the White win percentage with my no-kingside-castling-for-Black rule?

mclane · Post by **mclane** » Tue Jul 30, 2019 8:49 pm

lkaufman wrote: ↑Tue Jul 30, 2019 4:13 pm
The flaw in this argument is that when NN engines play each other the percentage of draws is even higher than normal, almost ridiculously high. Draws are due to the rules of the game, not to the way the engines play it.

In contrast to AB programs, NN+MCTS engines do something. They don't always do it in a straightforward way. But it's completely different from the way normal AB engines play against each other. There you have your draw battles. Endless battles where both sides reach search depths of 30-40 and don't find anything to gain a material advantage.

The rules are not wrong. The chess engines play the wrong chess. Instead of finding or planning for a mate, they increase their advantage.
That works. But if the opponent also searches that deep, it produces draws.

Therefore the future is to teach them plans instead of teaching them to gain material advantage.

The target is mate. Not material advantage.

Ovyron · Post by **Ovyron** » Wed Jul 31, 2019 2:55 am

mclane wrote: ↑Tue Jul 30, 2019 8:49 pm There you have your draw battles. Endless battles where both sides reach search depths of 30-40 and don't find anything to gain a material advantage.

I've very often seen engines sacrifice material advantage for positional/dynamic/attacking advantage (I think this started with Rybka beta, who gave away her pawns as if they were rubbish).
