HERT - brand new openings-set by Thomas Zipproth
Posted: Mon Aug 14, 2017 10:55 am
My new gamebase for Stockfish test runs is now complete, and everything is new: a 30% faster CPU (i7-6700HQ, Intel Skylake, 2.6 GHz), a longer thinking time (180"+1000ms (= 3'+1") instead of 70"+700ms), and a 4x bigger hash (512 MB per engine). A completely new openings test set is used as well. Thomas Zipproth created a new openings set of 500 positions: the HERT chess engine test set ("H"uman and "E"ngine "R"elevant "T"heory). Because of his stellar work on the Cerebellum library (Brainfish, www.zipproth.de), he is definitely one of the best experts on engine-chess openings on the planet.
It contains openings played in high-level engine (online) chess and by human GMs. So it does not contain all ECO codes (unlike the FEOBOS project, for example), but only those openings that strong humans and engines really play in tournaments. This test set therefore probably has the highest practical relevance an openings set has ever had, and it has a very low draw rate (higher than SALC openings, of course, but measurably lower than other "normal" openings sets).
Because of this, it is strongly recommended to use it for engine testing and for engine development.
You can download the HERT set on my website:
http://spcc.beepworld.de/downloads--links.htm
Here is some further information about the HERT test set, taken from the ReadMe file:
500 test positions selected from the most played variations in Engine and Human tournaments.
What properties should a good test set have?
1.) It should test all aspects of an engine, so it must contain openings leading to different positional and tactical problems.
2.) For a good analysis of human games, some variations played mostly by humans, like special gambits, should also be included.
3.) The positions should not be too drawish with today's best engines.
4.) The games should not contain too many transpositions, so that most games are unique.
5.) The positions should reflect what is actually played most in Engine and Human Tournaments.
The last point is the most important and unique feature of Hert. Many test sets contain positions which appeared fewer than 5 times, or even never, in any engine or human tournament.
The idea of Hert is that a position which was played several thousand times in engine or human tournaments should have some desirable properties by default.
For example, it is unlikely that such a position is extremely drawish or nearly lost; otherwise it would not be played so often.
Of course, some exceptions where engines tend to make a fast draw always have to be excluded.
Additionally, the Hert set tries to reproduce the importance of all openings to some degree. That means that when a particular opening like the Giuoco Piano appeared over 300,000 times in engine games, it was split up into several variations with positions which appeared around 5,000 to 20,000 times.
So, finally, the Hert set tests the ability of engines to play all kinds of variations which commonly appear in engine and human tournaments.
The openings were mixed (not sorted by ECO code), so it is possible to use only a part of the Hert set without distorting the results. Using the full Hert set (500 positions) means 1000 games in an engine head-to-head competition. But keep in mind that the lower the number of games played, the larger the error bar of the test results!
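To get a rough feeling for how the error bar grows when only a part of the set is used, here is a minimal Python sketch. It is not taken from the ReadMe. It assumes that every position is played twice with colors reversed (which is how 500 positions give 1000 games), that the games are independent, and that a normal approximation with a 95% interval is good enough. The function names and the draw rate of 0.6 are purely hypothetical values for illustration, not measured HERT results.

```python
import math


def games_for_positions(positions: int) -> int:
    """Games in a head-to-head match, assuming each opening position
    is played twice with colors reversed (assumption)."""
    return 2 * positions


def score_to_elo(score: float) -> float:
    """Convert a score fraction in (0, 1) into an Elo difference
    using the usual logistic Elo formula."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_error_bar(games: int, draw_rate: float,
                  score: float = 0.5, z: float = 1.96) -> float:
    """Approximate half-width of the confidence interval in Elo.

    games     -- number of games played
    draw_rate -- fraction of drawn games (hypothetical input)
    score     -- overall score fraction of the first engine
    z         -- normal quantile, 1.96 for roughly 95%
    Assumes score +/- z * stderr stays strictly inside (0, 1).
    """
    wins = score - draw_rate / 2.0
    losses = 1.0 - draw_rate - wins
    if wins < 0.0 or losses < 0.0:
        raise ValueError("score and draw_rate are not consistent")
    # Variance of a single game result (1, 0.5 or 0) around the mean score.
    variance = (wins * (1.0 - score) ** 2
                + draw_rate * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2)
    stderr = math.sqrt(variance / games)
    low = score_to_elo(score - z * stderr)
    high = score_to_elo(score + z * stderr)
    return (high - low) / 2.0


if __name__ == "__main__":
    # Compare a partial set with the full 500-position set.
    for positions in (100, 250, 500):
        games = games_for_positions(positions)
        bar = elo_error_bar(games, draw_rate=0.6)
        print(f"{positions} positions -> {games} games, +/- {bar:.1f} Elo")
```

Under these assumptions the error bar shrinks roughly with the square root of the number of games, so halving the set does not double it, but the increase is still noticeable.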
No line of the Hert set includes an en passant move, so it is possible to use Hert.PGN in the LittleBlitzer GUI, which has an en passant bug: the captured pawn is not removed from the chessboard when an en passant move is in the opening-line PGN.
All work on the Hert Set was done by Thomas Zipproth.
Tests and documentation: Thomas Zipproth & Stefan Pohl