
Drawkiller Openings Project

Posted: Fri Nov 23, 2018 12:58 pm
by pohl4711
My Drawkiller Openings Project is finished.

No openings set before has produced such low draw rates without compressing the engines' scores towards 50%; instead, it pushes the scores away from 50%. The Drawkiller Normal and Tournament sets nearly halve the draw rate compared to FEOBOS or the Stockfish Framework 8-move openings. I would never have expected that this was possible; the Drawkiller project is really a breakthrough into another dimension of computer chess. Look at my testing results:

(asmFish 170426 vs. Komodo 10.4, 5'+3'' time control, single core, no ponder, no endgame bases, LittleBlitzerGUI, 1000 games per test run(!), except the Noomen Gambit lines (only 246 positions, so 492 games were played) and the Noomen TCEC Superfinal (only 100 positions, so 200 games were played))

Stockfish Framework standard 8-move openings: Score 60.3% - 39.7%, draws: 63.4%
FEOBOS v20 contempt 5 top 500 openings: Score 58.7% - 41.3%, draws: 64.1%
HERT 500 set: Score 60.6% - 39.4%, draws: 60.4%
Noomen Gambit lines: Score 59.1% - 40.9%, draws: 59.3%
4 GM-moves short book: Score 60.5% - 39.5%, draws: 57.1%
Noomen TCEC Superfinal (Season 9+10): Score 62.5% - 37.5%, draws: 50.0%
SALC V5 half-closed: Score 61.6% - 38.4%, draws: 49.2%
SALC V5 full-closed 500 positions: Score 66.5% - 33.5%, draws: 47.7%

NEW:

Drawkiller (Big set): Score 63.8% - 36.2%, draws: 39.5%
Drawkiller (Normal set): Score 65.3% - 34.7%, draws: 33.5%
Drawkiller (Tournament set): Score 65.3% - 34.7%, draws: 33.5%

(This is not a mistake on my part: the results of the Normal set and the Tournament set were exactly the same after 1000 played games in my test runs.)


Learn more about the Drawkiller openings in the "Drawkiller openings" section on my website and download them!

https://www.sp-cc.de

Re: Drawkiller Openings Project

Posted: Fri Nov 23, 2018 5:33 pm
by peter
Hi Stefan!
pohl4711 wrote: Fri Nov 23, 2018 12:58 pm Learn more about the Drawkiller openings in the "Drawkiller openings" section on my website and download them!

https://www.sp-cc.de
Any project that reduces the number of games needed for good statistics, and thus saves hardware time, is a good project in my eyes.

One could say your opening set is a rather special one once again, but that is not a disadvantage in itself to me, because I see every opening test set as a special one anyhow.

I see no fundamental difference between an opening test set for rating matches and any other kind of played-out positional testing. Playing out certain positions is one way of positional testing, so your test set simply serves the same aims as other opening test sets and books do; the differences lie only in the individual positions and their number.

Still, reducing the number of games needed for comparable and reproducible statistical significance is a fine achievement, if it works as you describe, and your results seem to show that it does.

Congrats and thanks for the big and good work.

Re: Drawkiller Openings Project

Posted: Fri Nov 23, 2018 6:17 pm
by pohl4711
Thanx.

I am convinced that everybody who plays a statistically large enough number of games with the Drawkiller openings will also get very, very low draw rates.

Re: Drawkiller Openings Project

Posted: Sat Nov 24, 2018 4:46 am
by lucasart
pohl4711 wrote: Fri Nov 23, 2018 12:58 pm My Drawkiller Openings Project is finished.

No openings set before has produced such low draw rates without compressing the engines' scores towards 50%; instead, it pushes the scores away from 50%. The Drawkiller Normal and Tournament sets nearly halve the draw rate compared to FEOBOS or the Stockfish Framework 8-move openings. I would never have expected that this was possible; the Drawkiller project is really a breakthrough into another dimension of computer chess. Look at my testing results:

(asmFish 170426 vs. Komodo 10.4, 5'+3'' time control, single core, no ponder, no endgame bases, LittleBlitzerGUI, 1000 games per test run(!), except the Noomen Gambit lines (only 246 positions, so 492 games were played) and the Noomen TCEC Superfinal (only 100 positions, so 200 games were played))

Stockfish Framework standard 8-move openings: Score 60.3% - 39.7%, draws: 63.4%
FEOBOS v20 contempt 5 top 500 openings: Score 58.7% - 41.3%, draws: 64.1%
HERT 500 set: Score 60.6% - 39.4%, draws: 60.4%
Noomen Gambit lines: Score 59.1% - 40.9%, draws: 59.3%
4 GM-moves short book: Score 60.5% - 39.5%, draws: 57.1%
Noomen TCEC Superfinal (Season 9+10): Score 62.5% - 37.5%, draws: 50.0%
SALC V5 half-closed: Score 61.6% - 38.4%, draws: 49.2%
SALC V5 full-closed 500 positions: Score 66.5% - 33.5%, draws: 47.7%

NEW:

Drawkiller (Big set): Score 63.8% - 36.2%, draws: 39.5%
Drawkiller (Normal set): Score 65.3% - 34.7%, draws: 33.5%
Drawkiller (Tournament set): Score 65.3% - 34.7%, draws: 33.5%

(This is not a mistake on my part: the results of the Normal set and the Tournament set were exactly the same after 1000 played games in my test runs.)


Learn more about the Drawkiller openings in the "Drawkiller openings" section on my website and download them!

https://www.sp-cc.de
Interesting. Opening sets that maximize the amount of information in this way are precious for engine development.

I use this one for developing Demolito (which I generated myself):
https://github.com/zamar/spsa/blob/master/book.epd

Would you mind giving it a spin to see how it compares with yours?

PS: For engine dev you need big sets (i.e. at least 30k positions), and you also need to check the recombination risk: how frequently games from opening X transpose into games from opening Y. Transpositions reduce the information content, and you won't notice them just by measuring scores and draw rates.
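
One rough way to measure that recombination risk is to collect the positions each game reaches after the opening and count how many pairs of book positions share at least one later position. A minimal sketch, assuming the games sit in a PGN file whose FEN header records the book position and that the python-chess library is available; the file name, ply cutoffs and the pair-overlap metric are illustrative choices, not part of the method described above:

import chess
import chess.pgn
from collections import defaultdict

def visited_positions(game, skip_plies=0, max_plies=60):
    """Positions a game reaches after the first skip_plies half-moves
    (use skip_plies > 0 if the PGN still contains the book moves)."""
    board = game.board()                      # starts from the FEN header if present
    seen = set()
    for ply, move in enumerate(game.mainline_moves()):
        board.push(move)
        if skip_plies <= ply < max_plies:
            seen.add(board.board_fen() + (" w" if board.turn else " b"))
    return seen

def recombination_rate(pgn_path="games.pgn", skip_plies=0):
    """Fraction of book-position pairs whose games transpose into a common position."""
    by_opening = defaultdict(set)             # book FEN -> union of reached positions
    with open(pgn_path) as f:
        while (game := chess.pgn.read_game(f)) is not None:
            start = game.headers.get("FEN", chess.STARTING_FEN)
            by_opening[start] |= visited_positions(game, skip_plies)
    openings = list(by_opening.values())
    pairs = shared = 0
    for i in range(len(openings)):
        for j in range(i + 1, len(openings)):
            pairs += 1
            if openings[i] & openings[j]:
                shared += 1
    return shared / pairs if pairs else 0.0

A high rate means many nominally different openings funnel into the same middlegame positions, which is exactly the information loss the PS warns about.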

Re: Drawkiller Openings Project

Posted: Sat Nov 24, 2018 5:13 am
by pohl4711
lucasart wrote: Sat Nov 24, 2018 4:46 am I use this one for developing Demolito (which I generated myself):
https://github.com/zamar/spsa/blob/master/book.epd

Would you mind giving it a spin to see how it compares with yours?

PS: For engine dev you need big sets (i.e. at least 30k positions), and you also need to check the recombination risk: how frequently games from opening X transpose into games from opening Y. Transpositions reduce the information content, and you won't notice them just by measuring scores and draw rates.
I can start a 1000-games test run with your EPD in a few days (it will run for 4 days). Could you give some more information about the positions in your file? In particular, I need to know whether the positions in the EPD are sorted or randomly shuffled.

The Drawkiller Big file has nearly 30000 different lines/end positions, and I recommend it for engine developers (everybody else should use the smaller sets (Normal, Tournament), which give a lower draw rate).

Regards - Stefan (SPCC)

Re: Drawkiller Openings Project

Posted: Sat Nov 24, 2018 9:16 am
by lucasart
pohl4711 wrote: Sat Nov 24, 2018 5:13 am
lucasart wrote: Sat Nov 24, 2018 4:46 am I use this one for developing Demolito (which I generated myself):
https://github.com/zamar/spsa/blob/master/book.epd

Would you mind giving it a spin to see how it compares with yours?

PS: For engine dev you need big sets (i.e. at least 30k positions), and you also need to check the recombination risk: how frequently games from opening X transpose into games from opening Y. Transpositions reduce the information content, and you won't notice them just by measuring scores and draw rates.
I can start a 1000-games test run with your EPD in a few days (it will run for 4 days). Could you give some more information about the positions in your file? In particular, I need to know whether the positions in the EPD are sorted or randomly shuffled.

The Drawkiller Big file has nearly 30000 different lines/end positions, and I recommend it for engine developers (everybody else should use the smaller sets (Normal, Tournament), which give a lower draw rate).

Regards - Stefan (SPCC)
It was generated as follows:

Round(X%)
  • generate all legal moves and retain the resulting positions
  • randomly weed out X% of the positions
  • filter out the bad ones: play a quick Critter 1.6 self-play blitz game from each position and keep it only if Critter's score stays within +/- 0.5 for 5 moves (10 plies)
From the starting position, apply recursively:
  • Round(0%) x 2
  • Round(75%) x 3
This results in 5-ply positions (2.5 moves), which have nice properties:
  • fairness: no position should clearly favor either side. At least it is good enough for the bullet games used in engine dev. There may be some shenanigans revealed by deeper search, but that doesn't matter for engine dev (i.e. 4"+0.04" games where tens of thousands of games are played to test each patch, and it's all about statistics, not chess).
  • coverage: due to the random nature of the positions, we get a very wide, shallow tree, which means we explore much more than the usual chess opening theory. We force the engines to go into the deep woods, which helps avoid the overfitting problem.
  • low draw rate. But this is not achieved by cheating the way TCEC does: I don't allow heavily biased positions where one side has a practically won game out of the opening. It comes from keeping the book extremely shallow and forcing the engines to explore the deep woods of unknown openings.
  • good information: recombination risk is kept to a minimum, thanks to the 3 rounds of 25% random selection.
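
The recipe translates fairly directly into code. Below is a minimal sketch using the python-chess library, with any UCI engine standing in for Critter 1.6; the engine path, the node limits and the 50-centipawn bound approximating +/- 0.5 pawns are assumptions made for illustration, not details taken from the post:

import random
import chess
import chess.engine

ENGINE_PATH = "critter"     # assumption: any UCI engine binary on the PATH
BOUND_CP = 50               # stands in for the +/- 0.5 pawn bound
LIMIT = chess.engine.Limit(nodes=50_000)   # "quick blitz" stand-in

def stays_balanced(board, engine, plies=10):
    """Self-play continuation: reject the position if the score leaves the
    bound at any point during the next plies half-moves (5 moves)."""
    b = board.copy()
    for _ in range(plies):
        if b.is_game_over():
            break
        info = engine.analyse(b, LIMIT)
        cp = info["score"].white().score(mate_score=100_000)
        if abs(cp) > BOUND_CP:
            return False
        b.push(engine.play(b, LIMIT).move)
    return True

def round_step(positions, weed_out_pct, engine):
    """One Round(X%): expand by all legal moves, randomly drop X% of the
    children, then filter out the unbalanced ones."""
    children = []
    for board in positions:
        for move in board.legal_moves:
            child = board.copy()
            child.push(move)
            children.append(child)
    kept = [b for b in children if random.random() >= weed_out_pct / 100.0]
    return [b for b in kept if stays_balanced(b, engine)]

def build_book():
    with chess.engine.SimpleEngine.popen_uci(ENGINE_PATH) as engine:
        positions = [chess.Board()]
        for pct in (0, 0, 75, 75, 75):   # Round(0%) x 2, then Round(75%) x 3
            positions = round_step(positions, pct, engine)
        return [b.epd() for b in positions]   # 5-ply book positions

Note that the two Round(0%) steps already expand to several hundred candidate positions, so the balance filter dominates the runtime; the filter here uses a short engine self-play continuation as a stand-in for the quick Critter blitz game described above.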

Re: Drawkiller Openings Project

Posted: Sat Nov 24, 2018 9:41 am
by tpoppins
Stefan, when you released the SALC books in January I tried to draw your attention to short draws I observed in practical 40/40 tests as well as a heavily biased line that appears to be a win for White straight out of the book. You chose to pick on an unrelated technical inaccuracy in one of my posts and ignored the reports.

I understand that this latest project builds on the base of SALC. Has anything been done in the intervening months to address the above issues and weed out all such subpar lines?

Re: Drawkiller Openings Project

Posted: Sat Nov 24, 2018 11:16 am
by ThatsIt
I did nearly the same thing more than 10 years ago, not for an opening book, but for starting positions. After a few days I cancelled the tests because I got far too many similar ECO codes.
What is your spread of ECO codes in your test matches?

Best wishes,
G.S.
(CEGT team)

Re: Drawkiller Openings Project

Posted: Sat Nov 24, 2018 11:22 am
by pohl4711
lucasart wrote: Sat Nov 24, 2018 9:16 am
It was generated as follows:

Round(X%)
  • generate all legal moves and retain the resulting positions
  • randomly weed out X% of the positions
  • filter out the bad ones: play a quick Critter 1.6 self-play blitz game from each position and keep it only if Critter's score stays within +/- 0.5 for 5 moves (10 plies)
From the starting position, apply recursively:
  • Round(0%) x 2
  • Round(75%) x 3
This results in 5-ply positions (2.5 moves), which have nice properties:
  • fairness: no position should clearly favor either side. At least it is good enough for the bullet games used in engine dev. There may be some shenanigans revealed by deeper search, but that doesn't matter for engine dev (i.e. 4"+0.04" games where tens of thousands of games are played to test each patch, and it's all about statistics, not chess).
  • coverage: due to the random nature of the positions, we get a very wide, shallow tree, which means we explore much more than the usual chess opening theory. We force the engines to go into the deep woods, which helps avoid the overfitting problem.
  • low draw rate. But this is not achieved by cheating the way TCEC does: I don't allow heavily biased positions where one side has a practically won game out of the opening. It comes from keeping the book extremely shallow and forcing the engines to explore the deep woods of unknown openings.
  • good information: recombination risk is kept to a minimum, thanks to the 3 rounds of 25% random selection.
Interesting approach! I have started the test run just now; it will take around 4 days. I will report the results here.

Re: Drawkiller Openings Project

Posted: Sat Nov 24, 2018 11:40 am
by pohl4711
tpoppins wrote: Sat Nov 24, 2018 9:41 am Stefan, when you released the SALC books in January I tried to draw your attention to short draws I observed in practical 40/40 tests as well as a heavily biased line that appears to be a win for White straight out of the book. You chose to pick on an unrelated technical inaccuracy in one of my posts and ignored the reports.

I understand that this latest project builds on the base of SALC. Has anything been done in the intervening months to address the above issues and weed out all such subpar lines?
I never doubted that there could be some short draws and some short wins in test runs using the SALC openings. The point is: some, not many, otherwise the testing results in my 1000-games test runs would not have been so good. And the other point: these positions cannot be filtered out, because with other engines, other hardware or other time controls, these positions could lead to normal games, and other SALC positions could lead to short draws or short wins. It is impossible to avoid that, so I did not try.


The Drawkiller openings are a complete restart. SALC was built from human games filtered out of the BigDatabase. Drawkiller is built from a few plies of pawn moves, combined with the Drawkiller lines, which walk the kings to the edge of the chessboard. Only the basic idea was carried over from SALC to Drawkiller.
In SALC, all end positions were evaluated by Komodo, and only positions with a Komodo eval in the interval [-0.6;+0.6] were allowed.
For Drawkiller, the Komodo eval interval was shrunk to [-0.49;+0.49] (and [-0.39;+0.39] for Drawkiller Tournament). In the Drawkiller openings, all non-pawn pieces are still on the back ranks (1 and 8), so the positions are very close to the normal starting position of chess. It is therefore not very likely that short wins happen, but I have no doubt that they can. Short draws can happen, too. But it is not possible to filter these positions out (see above).
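For illustration, such an eval-interval filter over a file of EPD end positions could look roughly like the following sketch, assuming the python-chess library and any UCI engine (the engine path, search depth and the default 49-centipawn bound are placeholders; the actual sets were built with Komodo):

import chess
import chess.engine

def filter_by_eval(epd_lines, engine_path="komodo", bound_cp=49, depth=18):
    """Keep only those end positions whose evaluation lies inside [-bound_cp, +bound_cp]."""
    kept = []
    with chess.engine.SimpleEngine.popen_uci(engine_path) as engine:
        for epd in epd_lines:
            board = chess.Board()
            board.set_epd(epd)                # load the opening end position
            info = engine.analyse(board, chess.engine.Limit(depth=depth))
            cp = info["score"].white().score(mate_score=100_000)
            if abs(cp) <= bound_cp:
                kept.append(epd)
    return kept

# bound_cp=49 corresponds to the Normal set, bound_cp=39 to the Tournament set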
It is important to understand: my opening files are huge (10000, 20000, 30000 end positions). What happens in a few of these end positions is meaningless; what happens in most of them is what matters, and that is what my 1000-games test runs check. What happens in 1, 2, 3 or 5 games, who cares? That is statistical noise, nothing else.