Blackmageddon Openings released


pohl4711
Posts: 2435
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Blackmageddon Openings released

Post by pohl4711 »

The Blackmageddon Openings – the future of Computerchess

Final chapter of development (from SALC to Drawkiller to Armageddon to Blackmageddon) and my legacy...

Idea, development and testing by Stefan Pohl

On my website:
https://www.sp-cc.de/blackmageddon-openings.htm

Direct download:
https://www.sp-cc.de/files/blackmageddon.zip

Why is Blackmageddon the end of the development of my opening sets? Because the main goal was to lower the draw-rate in computerchess (which climbs more and more as the machines get faster and the engines play stronger), while keeping a good Elo-spreading of the results in engine tests and tournaments. And with Blackmageddon, I have reached that goal completely:
The number of draws is exactly 0, so the draw-rate is 0% and cannot be lowered any further. The Elo-spreading of the results is incredibly high (roughly double that of normal openings, on a level with the Drawkiller openings). And the whitescore and blackscore are much more stable and balanced than in Armageddon (that is the huge problem of the Armageddon openings), which is why I cancelled the Armageddon openings and removed them from my website.
And Blackmageddon is not a subset of chess (unlike my SALC openings, which only contain positions with the kings on opposite sides of the board, or the Armageddon concept by Larry Kaufman), and the Blackmageddon openings are not virtual (unlike Drawkiller).
So I see no way to get any better results than with Blackmageddon...

What is Blackmageddon? Blackmageddon means that the following 4-move line is placed in front of normal chess openings (5 moves (10 plies) taken from human games in the Megabase, with both players rated 2300 Elo or more):
1. a4 Nc6 2. a5 Nxa5 3. Na3 Nc6 4. Nb1 Nb8
This means that black is always one pawn ahead (white has no pawn on a2). And all draws are counted as a win for white.
Here is an example of a full Blackmageddon opening line:
1. a4 Nc6 2. a5 Nxa5 3. Na3 Nc6 4. Nb1 Nb8 5. d4 Nf6 6. c4 e6 7. Nf3 d5 8. Nc3 Be7 9. Bg5 h6
(All end positions of the lines were evaluated by Komodo 13.1 (30''/position on a quad-core).)
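To illustrate the construction, here is a minimal Python sketch that prepends the 4-move prefix to a normal opening line and renumbers the moves. It is not the tool used to build the released sets. The function name blackmageddon_line is made up for this example, and the sketch does no legality checking (a line that needs the a2-pawn would have to be filtered out separately).

```python
import re

# The fixed 4-move prefix quoted in the post.
PREFIX = "1. a4 Nc6 2. a5 Nxa5 3. Na3 Nc6 4. Nb1 Nb8"

def blackmageddon_line(opening, prefix=PREFIX):
    """Prepend the prefix to a SAN opening line and renumber its moves.

    Hypothetical helper for illustration only; no legality checks are done.
    """
    is_move_number = lambda tok: re.fullmatch(r"\d+\.", tok) is not None
    # Bare SAN moves of the normal opening, without "1.", "2.", ...
    moves = [tok for tok in opening.split() if not is_move_number(tok)]
    # Number of full moves already used by the prefix (4 here).
    offset = sum(1 for tok in prefix.split() if is_move_number(tok))
    parts = [prefix]
    for i in range(0, len(moves), 2):
        move_no = offset + i // 2 + 1
        parts.append(f"{move_no}. " + " ".join(moves[i:i + 2]))
    return " ".join(parts)

normal = "1. d4 Nf6 2. c4 e6 3. Nf3 d5 4. Nc3 Be7 5. Bg5 h6"
print(blackmageddon_line(normal))
```

Run on the normal 5-move line shown here, this reproduces the full example line from the post.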

Armageddon is the opposite: white has an advantage (black has no a7-pawn, or black is not allowed to castle, or black is not allowed to castle short), and all draws are counted as a win for black.
Why is Armageddon bad, and why is Blackmageddon better?
Because in Armageddon white has two advantages and black has none: white has the advantage of the first move and the advantage given by the Armageddon opening. The problem is that these two advantages make the whitescore climb and climb as the thinking-time gets longer (or the machine gets faster). And a whitescore that is too high (with a blackscore that is too low) means that the Elo-spreading of the results gets smaller and smaller. In the worst case, with a whitescore of 100%, the Elo-spreading is 0: white wins every game, so every engine head-to-head ends 50%-50%.
In Blackmageddon, white has the advantage of the first move, but black has the advantage of being one pawn ahead. In my tests, that makes the balance between whitescore and blackscore much more stable than in Armageddon. So Blackmageddon will keep working properly on faster machines in the future. Armageddon will not. And Blackmageddon is not a subset of chess: castling in either direction is allowed for both sides.

Of course, the engines do not know that they are playing Blackmageddon when they use Blackmageddon openings. But that is no problem. You should just set the contempt of all engines to 0 or very close to 0. Then the engine playing white effectively has a huge negative contempt, because black has one pawn more and the engine's evaluation is clearly negative. And the engine playing black effectively has a huge positive contempt, because black has one pawn more and its evaluation is clearly positive. So white will try to reach a forced draw and black will try to avoid it...

I added some tools to the Blackmageddon download: one converts a result PGN file of played games to Blackmageddon scoring (blackmageddonize_classical.bat), and one shows a live Blackmageddon scoring from a result PGN file (livescoring_classical.bat); the latter can be used while the GUI is still running a tournament. That means: all 1/2-1/2 results are changed to 1-0, and the livescoring tool starts ORDO and prints a rating list of the blackmageddonized games on the screen. If your engine games are stored in a file with a different name (not results.pgn), just change the name in the .bat files with an editor (use search & replace).
And I added two tools which double all games won by white, so that ORDO or bayeselo count every white win as 2 points, while every 1/2-1/2 is counted as a 1-0 (as in classical): blackmageddon_advanced and livescoring_advanced. The idea is that in classical Blackmageddon a draw and a white win get the same score (1-0). If every white win scores 2 points, it is "worth more" than a draw.
But in my tests, the difference between classical and advanced counting was very small. So I see no need to use advanced scoring, but feel free to use it. Note, however, that the number of games gets higher, because all games won by white are doubled (although in Blackmageddon white wins are of course pretty rare, because black is one pawn ahead when the game starts). That shrinks the error bar a little when ORDO does its calculation.
Because of this, all my Blackmageddon test results below use classical counting. And, as you can see below, the results are overwhelming.
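For readers who do not want to use the .bat files, the two counting rules can be sketched in a few lines of Python. This is not the code of the tools in the download, only an illustration of the rules as described above. The function names are made up, and the sketch assumes that every game in the PGN starts with an [Event tag and ends with its result marker.

```python
import re

def split_games(pgn_text):
    """Split PGN text into games (assumes each game starts with an [Event tag)."""
    chunks = re.split(r"^(?=\[Event )", pgn_text, flags=re.MULTILINE)
    return [chunk.strip() for chunk in chunks if chunk.strip()]

def blackmageddonize(pgn_text, advanced=False):
    """Rescore games: every draw becomes 1-0.

    With advanced=True, games that white really won are written twice,
    so a rating tool counts them double.
    """
    out = []
    for game in split_games(pgn_text):
        # Remember real white wins before draws are rescored.
        white_won = '[Result "1-0"]' in game
        game = game.replace('[Result "1/2-1/2"]', '[Result "1-0"]')
        # Also fix the result marker at the end of the movetext.
        game = re.sub(r"1/2-1/2\s*$", "1-0", game)
        out.append(game)
        if advanced and white_won:
            out.append(game)
    return "\n\n".join(out) + "\n"

if __name__ == "__main__":
    # "results.pgn" is the default file name mentioned in the post.
    with open("results.pgn", encoding="utf-8", errors="replace") as handle:
        print(blackmageddonize(handle.read()))
```

The rescored output can then be fed to ORDO or bayeselo in the usual way.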

Test results:



(asmFish 170426 vs. Komodo 10.4, 5'+3'' time-control, singlecore, no ponder, no endgame-bases, LittleBlitzerGUI, 1000 games per test run(!), except for the Noomen Gambit-Lines (only 246 positions, so 492 games were played) and the Noomen TCEC Superfinal set (only 100 positions, so 200 games were played)). First score: asmFish, second score: Komodo.

Stockfish Framework standard 8 move openings: Score: 60.3% - 39.7%, draws: 63.4%
FEOBOS v20 contempt 5 top 500 openings: Score: 58.7% - 41.3%, draws: 64.1%
HERT 500 set: Score: 60.6% - 39.4%, draws: 60.4%
Noomen Gambit-Lines: Score: 59.1% - 40.9%, draws: 59.3%
4 GM-moves short book: Score: 60.5% - 39.5%, draws: 57.1%
Noomen TCEC Superfinal (Season 9+10): Score: 62.5% - 37.5%, draws: 50.0%
SALC V5 half-closed: Score: 61.6% - 38.4%, draws: 49.2%
SALC V5 full-closed 500 positions: Score: 66.5% - 33.5%, draws: 47.7%
Drawkiller (normal set): Score: 65.3% - 34.7%, draws: 33.5%
Drawkiller (tournament set): Score: 65.3% - 34.7%, draws: 33.5%
Drawkiller (small 500 positions set): Score: 66.4% - 33.6%, draws: 30.5%
Drawkiller balanced (small 500 positions set): Score: 69.3% - 30.7%, draws: 35.2%
Drawkiller balanced: Score: 69.4% - 30.6%, draws: 36.4%
Drawkiller balanced big (15962 positions): Score: 67.4% - 32.6%, draws: 38.8%
Drawkiller EloZoom (small 500 positions set): Score: 72.0% - 28.0%, draws: 34.6%
Drawkiller EloZoom: Score: 73.2% - 26.8%, draws: 36.5%
Drawkiller EloZoom big (20043 positions): Score: 69.2% - 30.8%, draws: 40.7%

NEW:

Blackmageddon (500 positions set): Score: 70.1% - 29.9%, draws: 0% (whitescore: 52.9%)
Blackmageddon (5000 positions set): Score: 70.1% - 29.9%, draws: 0% (whitescore: 53.3%)
Blackmageddon (10000 positions set): Score: 69.5% - 30.5%, draws: 0% (whitescore: 54.5%)

Conclusion:

Blackmageddon is the end of my openings development: 0% draws, and a fantastic Elo-spreading of the results: nearly doubled (around 70%-30%) compared to classical opening sets like FEOBOS, HERT or the Stockfish Framework openings (all around 60%-40%). Note that a doubled Elo-spreading means you have to play only around 25% of the games to get the result of an engine test or engine tournament outside the error bar, because you have to play around 4x more games to halve the error bar!
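The arithmetic behind "doubled spreading, 25% of the games" can be checked with a short Python sketch. It uses the standard logistic Elo formula and assumes, as the argument above does, that the error bar in Elo for a given number of games is about the same for both opening sets; that assumption is a simplification.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction (logistic Elo model)."""
    return 400.0 * math.log10(score / (1.0 - score))

classical = elo_diff(0.60)      # classical sets: around 60%-40%
blackmageddon = elo_diff(0.70)  # Blackmageddon: around 70%-30%
ratio = blackmageddon / classical

# The error bar shrinks with 1/sqrt(games). A spread that is `ratio` times
# larger tolerates an error bar that is `ratio` times larger, which needs
# only 1/ratio**2 of the games (under the assumption stated above).
games_fraction = 1.0 / ratio ** 2

print(f"classical: {classical:.0f} Elo, Blackmageddon: {blackmageddon:.0f} Elo")
print(f"spread ratio: {ratio:.2f}, fraction of games needed: {games_fraction:.2f}")
```

With these round numbers, the spread goes from about 70 to about 147 Elo, which is where the "around 25%" figure comes from.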

To be clear here: Blackmageddon is not just about avoiding draws and killing the draw-death of computerchess. It is much more: when using Blackmageddon openings, you get the same statistical stability of the engine rankings in an engine tournament or engine rating list while playing only 25% of the games you would have to play with a classical opening set. So you need only 25% of the time on your PC for the same quality of results/rankings. Or you can play with 4x more thinking-time instead, for higher-quality chess.

And the number of draws is 0. Always. How awesome is that?

And this is why Blackmageddon is my legacy and the end of my development of opening sets for computerchess. So the journey, which started with the first SALC openings in 2015, ends here.
I want to thank Hauke Lutz, who helped me a lot in these years in building SALC and Drawkiller, and who built the Drawkiller EloZoom openings.
And I want to thank Larry Kaufman, who had the idea of building Armageddon openings for computerchess (opening lines which give a measurable advantage to white and count all draws as a win for black). Even though this idea does not work so well, because of unstable and climbing whitescores, Blackmageddon is a further development of it. And in an evolutionary process, each step on the stairway is built on the step below...

Enjoy Blackmageddon!

(C) 2019 Stefan Pohl (SPCC)


Additional 3-engine test:

3 engines played a round robin (Stockfish 10, Houdini 6 and Komodo 12), with 500 games in each head-to-head, so each engine played 1000 games. For each game, one opening line was chosen at random by the LittleBlitzerGUI.
Singlecore, 3'+1'', LittleBlitzerGUI, no ponder, no bases, 256 MB Hash, i7-6700HQ 2.6GHz Notebook (Skylake CPU), Windows 10 64bit

Here is the same result for Blackmageddon: 0% draws (of course) and the best Elo-spreading of all opening sets...

Blackmageddon (5000 positions set):

     Program                Elo    +    -   Games   Score   Av.Op.  Draws
   1 Stockfish 10 bmi2    : 3421   12   12  1000    73.7 %   3239    0.0 %
   2 Houdini 6 pext       : 3270   10   10  1000    44.1 %   3315    0.0 %
   3 Komodo 12 bmi2       : 3209   11   11  1000    32.2 %   3345    0.0 %
Elo-spreading (1st to last): 212 Elo
Draws: 0% (whitescore 54.7%)


Drawkiller balanced:

     Program                Elo    +    -   Games   Score   Av.Op.  Draws
   1 Stockfish 10 bmi2    : 3506   11   11  1000    70.9 %   3347   36.2 %
   2 Houdini 6 pext       : 3392   11   11  1000    48.5 %   3404   40.8 %
   3 Komodo 12 bmi2       : 3302   11   11  1000    30.6 %   3449   36.6 %
Elo-spreading (1st to last): 204 Elo
Draws: 37.9%


Drawkiller tournament:

     Program                Elo    +    -   Games   Score   Av.Op.  Draws
   1 Stockfish 10 bmi2    : 3494   11   11  1000    68.9 %   3353   34.2 %
   2 Houdini 6 pext       : 3387   11   11  1000    47.3 %   3407   38.2 %
   3 Komodo 12 bmi2       : 3320   11   11  1000    33.8 %   3440   36.0 %
Elo-spreading (1st to last): 174 Elo
Draws: 36.1%


GM_4moves:

     Program                Elo    +    -   Games   Score   Av.Op.  Draws
   1 Stockfish 10 bmi2    : 3475   11   11  1000    65.4 %   3363   53.2 %
   2 Houdini 6 pext       : 3381   10   10  1000    46.0 %   3410   59.9 %
   3 Komodo 12 bmi2       : 3345   10   10  1000    38.5 %   3428   55.9 %
Elo-spreading (1st to last): 130 Elo
Draws: 56.3%


Stockfish framework 8moves:

     Program                Elo    +    -   Games   Score   Av.Op.  Draws
   1 Stockfish 10 bmi2    : 3463   11   11  1000    63.0 %   3369   59.7 %
   2 Houdini 6 pext       : 3388   10   10  1000    47.5 %   3406   64.2 %
   3 Komodo 12 bmi2       : 3349   10   10  1000    39.5 %   3425   60.1 %
Elo-spreading (1st to last): 114 Elo
Draws: 61.3%
abulmo2
Posts: 433
Joined: Fri Dec 16, 2016 11:04 am
Location: France
Full name: Richard Delorme

Re: Blackmageddon Openings released

Post by abulmo2 »

Your openings do not live up to their promises in Amoeba's testing :( . I do my testing with an 8-move opening book (same as Stockfish), producing about 60-65% draws. Unfortunately I get the same draw percentages with DrawKiller_big.pgn or Blackmageddon_5mvs_10k.pgn. Usually the Elo difference measured is below 10. I wonder if other people get lower draw percentages with their engines?
Richard Delorme
pohl4711
Posts: 2435
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: Blackmageddon Openings released

Post by pohl4711 »

abulmo2 wrote: Sun Dec 08, 2019 2:29 pm Your openings do not live up to their promises in Amoeba's testing :( . I do my testing with an 8-move opening book (same as Stockfish), producing about 60-65% draws. Unfortunately I get the same draw percentages with DrawKiller_big.pgn or Blackmageddon_5mvs_10k.pgn. Usually the Elo difference measured is below 10. I wonder if other people get lower draw percentages with their engines?
Sad to hear this. All I can say is that my tests are correct. And Hauke Lutz (my partner in the development of Drawkiller) did several alpha and beta test runs too, and got the same good results as I did. You can download all played test games in the "Drawkiller" section of my website.
And note that Drawkiller was made for high-end computerchess. Amoeba is (by CEGT rating) more than 600 Elo behind Stockfish... At this level of computerchess, I see no need to use special openings like Drawkiller, because the draw-rates should not be that high, even when using a classical opening set.

But if you try Blackmageddon, you will definitely have no draws, because all draws are counted as a win for white. So each game ends with a win, for white or black. That is the main idea of this approach.
abulmo2
Posts: 433
Joined: Fri Dec 16, 2016 11:04 am
Location: France
Full name: Richard Delorme

Re: Blackmageddon Openings released

Post by abulmo2 »

pohl4711 wrote: Sun Dec 08, 2019 4:10 pm But if you try Blackmageddon, you will definitely have no draws, because all draws are counted as a win for white. So each game ends with a win, for white or black. That is the main idea of this approach.
I did not understand this part. This changes the Elo model and how one should compute variances. As I am using my own testing framework, I can probably adapt it to Blackmageddon.
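As a rough illustration of why the variance computation changes, here is a small Python sketch that compares the standard error of the mean score with and without draws. It assumes independent games, which is a simplification (games played from the same opening with reversed colours are correlated), and it plugs in figures quoted earlier in this thread. The function name is made up for this example.

```python
import math

def score_std_error(score, draw_rate, games):
    """Standard error of the mean score, assuming independent games.

    Each game scores 1, 0.5 or 0 with probabilities win, draw, loss.
    """
    win = score - draw_rate / 2.0
    # Var(X) = E[X^2] - E[X]^2, with E[X^2] = win * 1 + draw_rate * 0.25
    variance = win + draw_rate / 4.0 - score ** 2
    return math.sqrt(variance / games)

# Figures quoted earlier in the thread, 1000 games each:
with_draws = score_std_error(0.603, 0.634, 1000)  # Stockfish Framework openings
no_draws = score_std_error(0.701, 0.0, 1000)      # Blackmageddon, classical counting
print(f"with draws: {with_draws:.4f}, without draws: {no_draws:.4f}")
```

Without draws the per-game outcome is a plain win/loss coin flip, so the trinomial formula reduces to the binomial one, score * (1 - score) / games.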
Richard Delorme
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Blackmageddon Openings released

Post by lkaufman »

How were the openings selected for this project? I noticed some points about them:

1. Many of the White openings were not ones that we would see in Elite games, lots of games with 1.f4 or an early b3 or an early d3 or even King's Gambit etc. Maybe representative of strong amateur play but not in general where White is seriously playing for an opening advantage. Probably this is why White only won narrowly in your test; I think that if White plays optimal openings for the situation, or even just normal openings from 2700+ games, he would win heavily with the Armageddon rule. Perhaps you purposely selected openings that were close to even without the missing pawn?

2. Some openings don't make sense with a2 missing. For example, you include the Ruy Lopez (Spanish) with 3...a6? 4.Ba4? but both of those moves are silly with a2 missing, since a6 is pinned.

Maybe these things aren't so important for the purpose of using it just to test engines, but I don't think that the missing a2 Armageddon game is a playable version for actual tournaments (whether human or engine) as it is too favorable for White without forcing inferior openings on White.
Komodo rules!
Nordlandia
Posts: 2821
Joined: Fri Sep 25, 2015 9:38 pm
Location: Sortland, Norway

Re: Blackmageddon Openings released

Post by Nordlandia »

Removing the a2 pawn is hardly considered odds for normal engine games. I don't know whether Armageddon/Blackmageddon changes this situation.
pohl4711
Posts: 2435
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: Blackmageddon Openings released

Post by pohl4711 »

lkaufman wrote: Sun Dec 08, 2019 7:21 pm

2. Some openings don't make sense with a2 missing. For example, you include the Ruy Lopez (Spanish) with 3...a6? 4.Ba4? but both of those moves are silly with a2 missing, since a6 is pinned.

Maybe these things aren't so important for the purpose of using it just to test engines, but I don't think that the missing a2 Armageddon game is a playable version for actual tournaments (whether human or engine) as it is too favorable for White without forcing inferior openings on White.
And how many opening lines do you think are nonsense if black is not allowed to castle???
A king on the e-file should make a lot more lines wrong than a missing pawn on a2. In most openings, a castled black king is an important part of the opening.

In Blackmageddon, Komodo checked all end positions of the lines, so all these end positions are playable.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Blackmageddon Openings released

Post by lkaufman »

pohl4711 wrote: Sun Dec 08, 2019 7:38 pm
And how many opening lines do you think are nonsense if black is not allowed to castle???
A king on the e-file should make a lot more lines wrong than a missing pawn on a2. In most openings, a castled black king is an important part of the opening.

In Blackmageddon, Komodo checked all end positions of the lines, so all these end positions are playable.
What was your definition of "playable", given that these are armageddon games? I suppose something like a -0.9 to -0.5 eval perhaps? This would explain why the lines mostly feature bad white opening play, because with good White opening play the eval will be too close to even.
Regarding no-castle, yes, the normal openings are nonsense, which is why Kai made a book based on playing with these rules, not based on normal chess openings.
Komodo rules!
pohl4711
Posts: 2435
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: Blackmageddon Openings released

Post by pohl4711 »

lkaufman wrote: Sun Dec 08, 2019 8:18 pm
What was your definition of "playable", given that these are armageddon games? I suppose something like a -0.9 to -0.5 eval perhaps? This would explain why the lines mostly feature bad white opening play, because with good White opening play the eval will be too close to even.
Regarding no-castle, yes, the normal openings are nonsense, which is why Kai made a book based on playing with these rules, not based on normal chess openings.
Perhaps a look in the pgn would help. The Komodo Eval and the Eval-limits are in the TAGs.
I analyzed more than 40000 lines and then split them into 10-centipawn eval intervals from 0 to 120 (0-10, 11-20, 21-30 and so on). The lines that were finally chosen came from the middle of that range, and that is where most of the lines were. So the chosen lines are the most "normal" lines out of the 40000, in the opinion of Komodo.
Look how small the eval interval of the 10k pgn file is. And it still holds 10000 lines out of a little more than 40000: nearly 25% of all openings in such a small interval.
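The interval splitting can be sketched in Python as follows. This is not the script used for the released sets. The function names are made up, the evals are assumed to be given as non-negative centipawn magnitudes, and the sample values at the end are invented for illustration only.

```python
from collections import Counter

def eval_bin(centipawns, width=10):
    """Index of the interval 0-10, 11-20, 21-30, ... that contains the eval."""
    return max(0, (centipawns - 1) // width)

def most_populated_bin(evals, width=10):
    """Return ((low, high), count) for the interval holding the most lines."""
    counts = Counter(eval_bin(cp, width) for cp in evals)
    index, count = counts.most_common(1)[0]
    low = 0 if index == 0 else index * width + 1
    high = (index + 1) * width
    return (low, high), count

# Invented sample evals in centipawns, not taken from the real pgn files.
sample = [5, 55, 58, 60, 61, 119]
print(most_populated_bin(sample))
```

Reading the real values out of the pgn tags is left out here, because the exact tag names are not given in this thread.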
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: Blackmageddon Openings released

Post by Ovyron »

pohl4711 wrote: Mon Dec 09, 2019 5:23 am The Komodo Eval and the Eval-limits are in the TAGs.
Did Komodo know that a draw would count as a white win? Otherwise those evals can't be used: black would be fine with a 0.00 when all the other moves were in white's favor, but if the 0.00 leads to a draw that counts as a white win, the eval makes no sense.

I.e., as black, Komodo would choose a 0.00 forced draw (which loses for black) over a 0.01 move, because it doesn't know it isn't normal chess.