In response to the KID thread.


Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

In response to the KID thread.

Post by Laskos »

This recent long thread started by Om Prakash, http://www.talkchess.com/forum/viewtopic.php?t=64214, asks explicitly "The Best KID Engine ??". Some peculiar KID positions are then analyzed by our chess or engine specialists. I especially like Lyudmil Tsvetkov's comments, as he seems a very accomplished chess player, often surpassing all the engines in analysis, or at least he claims so. Well, this approach, for a patzer like me, seems unsatisfactory: the question "The Best KID Engine ??" remains in limbo. So I, as a patzer in chess, took a different approach.

Dann Corbit provided me with an excellent EPD file of various KID opening positions, 1245 of them. KID plans are 15-20 moves long. I used peculiar adjudication conditions:
1/ When both engines agree to 70cp advantage for one side in 2 consecutive moves, game is adjudicated as Win
2/ When engines play 20 moves from the opening without a Win, game is adjudicated as Draw

The purpose of these adjudication conditions is to evidence the serious advantage one gets from KID openings, without bothering how they convert it (that is already not related to KID openings).
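For anyone who wants to reproduce the two rules above in another tool, here is a minimal Python sketch. It is not LittleBlitzer's code: the function name, the result strings, and the reading of "2 consecutive moves" as two consecutive full moves, with both engines' scores given in centipawns from White's point of view, are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

WIN_CP = 70       # both engines must show at least this advantage (centipawns)
WIN_MOVES = 2     # ... for this many consecutive moves
DRAW_MOVES = 20   # moves from the opening position before a draw is declared


def adjudicate(evals: Sequence[Tuple[int, int]]) -> Optional[str]:
    """Return "1-0", "0-1", "1/2-1/2", or None if the game is still open.

    evals holds one (score_engine_a, score_engine_b) pair per move played
    from the opening position, both from White's point of view (assumed).
    """
    streak_side = 0   # +1 = White ahead, -1 = Black ahead, 0 = no agreement
    streak = 0
    for move_no, (a, b) in enumerate(evals, start=1):
        if a >= WIN_CP and b >= WIN_CP:
            side = 1
        elif a <= -WIN_CP and b <= -WIN_CP:
            side = -1
        else:
            side = 0

        if side != 0 and side == streak_side:
            streak += 1
        else:
            streak_side = side
            streak = 1 if side != 0 else 0

        # Rule 1: agreed advantage held for WIN_MOVES moves in a row.
        if streak >= WIN_MOVES:
            return "1-0" if side == 1 else "0-1"
        # Rule 2: no win within DRAW_MOVES moves of the opening.
        if move_no >= DRAW_MOVES:
            return "1/2-1/2"
    return None


print(adjudicate([(35, 40), (80, 75), (90, 71)]))
```

The win check is done before the draw check, so a game that reaches the threshold exactly on move 20 still counts as a win in this sketch; whether LittleBlitzer resolves that edge case the same way is not something the post states.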

These conditions I can use with LittleBlitzer, but not with Cutechess-Cli. Cutechess has a broken adjudication implementation, which works reasonably in normal conditions, but not in these peculiar ones. I used the top 7 engines at 1 second per move.

The LittleBlitzer output was the following:


Games Completed = 12600 of 12600 (Avg game length = 37.623 sec)
Settings = RR/32MB/1000ms per move/M 70cp for 2 moves, D 20 moves/EPD:C:\LittleBlitzer\greasy_kid_stuff.epd(1245)
Time = 63237 sec elapsed, 0 sec remaining
 1.  Komodo 11.01 64-bit      	2062.0/3600	861-337-2402  	(L: m=0 t=0 i=0 a=337)	(D: r=6 i=0 f=0 s=0 a=2396)	(tpm=1009.9 d=16.44 nps=992868)
 2.  Stockfish 260517 64 BMI2 	1945.5/3600	877-586-2137  	(L: m=0 t=0 i=0 a=586)	(D: r=12 i=0 f=0 s=0 a=2125)	(tpm=1010.2 d=17.11 nps=1109620)
 3.  Houdini 5.01 Pro x64-popc	1802.5/3600	783-778-2039  	(L: m=0 t=0 i=0 a=778)	(D: r=15 i=0 f=0 s=0 a=2024)	(tpm=1009.0 d=15.09 nps=1417130)
 4.  Deep Shredder 13 x64     	1720.0/3600	564-724-2312  	(L: m=0 t=0 i=0 a=724)	(D: r=18 i=0 f=0 s=0 a=2294)	(tpm=1013.8 d=17.31 nps=1173046)
 5.  Gull 3 x64               	1682.0/3600	576-812-2212  	(L: m=0 t=0 i=0 a=812)	(D: r=12 i=0 f=0 s=0 a=2200)	(tpm=1012.3 d=14.68 nps=1584731)
 6.  Andscacs 0.91b           	1682.0/3600	385-621-2594  	(L: m=1 t=0 i=0 a=620)	(D: r=16 i=0 f=0 s=0 a=2578)	(tpm=1021.2 d=15.56 nps=839012)
 7.  Fritz 15                 	1706.0/3600	478-666-2456  	(L: m=0 t=0 i=0 a=666)	(D: r=13 i=0 f=0 s=0 a=2443)	(tpm=985.5 d=14.08 nps=524156)

I then performed rating calculations. Plain ELO is a bit irrelevant here: it depends too much on the draw rate, and the draw rate, which is subject to the adjudication conditions, is not that relevant. I used WILO and Normalized ELO, which not only here, but in general, are better indications of strength.


WILO:


   # PLAYER                       : RATING  ERROR    POINTS  PLAYED     (%)   CFS(next)
   1 Komodo 11.01 64-bit          : 3140.6   21.0     861.0    1198    71.9     100    
   2 Stockfish 260517 64 BMI2     : 3068.2   18.9     877.0    1463    59.9     100    
   3 Houdini 5.01 Pro x64-popc    : 3010.9   18.1     783.0    1561    50.2     100    
   4 Deep Shredder 13 x64         : 2960.0   19.8     564.0    1288    43.8      84    
   5 Fritz 15                     : 2944.3   20.3     478.0    1144    41.8      54    
   6 Gull 3 x64                   : 2942.6   18.3     576.0    1388    41.5      72    
   7 Andscacs 0.91b               : 2933.4   22.4     385.0    1006    38.3     ---


Normalized ELO:


   # PLAYER                       : RATING  ERROR    POINTS  PLAYED
   1 Komodo 11.01 64-bit          :  0.261  0.033    2062.0    3600    
   2 Stockfish 260517 64 BMI2     :  0.128  0.033    1945.5    3600    
   3 Houdini 5.01 Pro x64-popc    :  0.002  0.033    1802.5    3600    
   4 Deep Shredder 13 x64         : -0.075  0.033    1720.0    3600    
   5 Fritz 15                     : -0.093  0.033    1706.0    3600    
   6 Gull 3 x64                   : -0.106  0.033    1682.0    3600    
   7 Andscacs 0.91b               : -0.125  0.033    1682.0    3600
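
The POINTS, PLAYED and RATING columns above can be rechecked from the LittleBlitzer W-L-D counts. The sketch below assumes that WILO scores only the decisive games (wins plus losses, draws ignored) and that Normalized ELO is the mean score above 50% divided by the per-game standard deviation, with an error of about 1.96/sqrt(games). These definitions are inferred from the numbers in the tables, which they reproduce; they are not taken from the rating tool's documentation, and the WILO rating values themselves come out of the rating fit and are not recomputed here.

```python
from math import sqrt


def wilo_fraction(wins: int, losses: int) -> float:
    """Score fraction over decisive games only (draws are ignored)."""
    return wins / (wins + losses)


def normalized_elo(wins: int, losses: int, draws: int) -> float:
    """(mean score - 0.5) / per-game standard deviation of the score."""
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n
    mean_sq = (wins + 0.25 * draws) / n     # E[x^2] with x in {1, 0.5, 0}
    return (mean - 0.5) / sqrt(mean_sq - mean * mean)


def error_95(games: int) -> float:
    """Assumed 95% error of the normalized value: 1.96 / sqrt(games)."""
    return 1.96 / sqrt(games)


# W-L-D counts copied from the LittleBlitzer output above.
results = {
    "Komodo 11.01 64-bit": (861, 337, 2402),
    "Stockfish 260517 64 BMI2": (877, 586, 2137),
    "Houdini 5.01 Pro x64-popc": (783, 778, 2039),
}

for name, (w, l, d) in results.items():
    print(f"{name:28s} WILO played={w + l:5d} "
          f"({100 * wilo_fraction(w, l):.1f}%)  "
          f"normalized={normalized_elo(w, l, d):+.3f} "
          f"+/- {error_95(w + l + d):.3f}")
```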



Komodo 11.01 came out as the clear winner in this KID contest, although in normal conditions and general openings it is significantly inferior to Stockfish dev at this time control.
For me as a patzer, this result is more convincing than the outcome of the thread started by Om Prakash's question.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: In response to the KID thread.

Post by Lyudmil Tsvetkov »

it depends on how you break down KID openings.
when I was saying SF and Houdini play the KID better than Komodo, I was referring to the mainline, human, Kasparovian fully closed game, with a twice defended storming pawn on f5/f4, c5/c4.

SF is playing that better than Komodo, because it has somewhat better connected pawns eval, which is suitable to fully closed KIDs, and somewhat better king safety.

if, however, the epd contains a large set of KID positions, including all possible KID lines (what was it E66 to E99 or something), among which there will be a substantial portion of lines exhibiting a totally different character than fully closed KIDs, then Komodo might very well be on top.

it is very difficult to generalise, there are positions which specific engine plays better, and others it plays worse. if the set has been favouring Komodo overall, then it will come on top, of course.

I wonder how did you manage to find so many KID lines, how long were they, possible to post one or 2?

not that it matters very much, as conditions were same for all, but why that 70cps restriction.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: In response to the KID thread.

Post by Laskos »

The question of Om Prakash was general. And for general KID positions Dann Corbit provided here a very useful EPD file:

Greasy Kid Stuff:

http://rybkaforum.net/cgi-bin/rybkaforu ... pid=572459

70cp is a serious advantage right from the opening. Much higher, and I will get mostly Draws by adjudication (20 moves); much lower, and the advantage is too mild to be conclusive (besides that, many openings have 30-40cp advantages to start with).
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: In response to the KID thread.

Post by Lyudmil Tsvetkov »

the question might have been general, but he posted a position, where the best line of play involves fully closing the game.
besides, humans generally refer to the KID as that black line with twice defended pawn on f4 and fully closed game. that is what captures their imagination.

as I assumed, the epd set is so wide, that it includes absolutely any opening, maybe excluding the Sicilian.
why not just E60-E90 code games there?

for example, while quickly browsing the set, I find the following:

[d]rnbqk2r/pp2ppbp/6p1/2p5/3PP3/2P2N2/P4PPP/R1BQKB1R w KQkq - 0 1

that seems to be a Gruenfeld, an open game, of very different nature

[d]rnbqkbnr/pppp1ppp/4p3/8/4P3/3P4/PPP2PPP/RNBQKBNR b KQkq - 0 1

another one. in what way is this a KID? obviously, this is a French, an open game

[d]rnbqk2r/ppp1ppbp/5np1/8/2QP4/2N2N2/PP2PPPP/R1B1KB1R b KQkq - 0 1

another Gruenfeld, open game

[d]rnbqk2r/ppppppbp/5np1/8/3P1B2/4PN2/PPP2PPP/RN1QKB1R b KQkq - 0 1

and this one is neither KID, nor Gruenfeld, just a random Queen's pawn opening

[d]rnbqkb1r/ppp2ppp/4pn2/3p4/4P3/3P1N2/PPPN1PPP/R1BQKB1R b KQkq - 0 1

another French

[d]r1bq1rk1/pppn1pbp/3p1np1/8/2PpP3/2N2NP1/PP3PBP/R1BQ1RK1 w - - 0 1

open KID

[d]r2qkb1r/pp1n1ppp/2p2n2/3pp3/4P1b1/3P1NP1/PPPN1PBP/R1BQ1RK1 b kq - 0 1

I do not know what this one is, Caro-Kann?, certainly not a KID

[d]rnbq1rk1/ppp1ppbp/3p1np1/8/2PP4/1P3NP1/PB2PP1P/RN1QKB1R b KQ - 0 1

some kind of a flank opening

etc., etc.

at a quick browse, there are many general KIDs there, but also many Gruenfelds, many French, many A code openings, many unspecified, etc.

closed KIDs are only about 2% of all positions, at a quick browse, but it is true another 20% could be closed by good engine play.

so that, when running a test suite, one should first try to establish what one is running.

otherwise, thanks for the test. :)
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: In response to the KID thread.

Post by Laskos »

Most of the 1245 openings provided by Dann Corbit in "greasy_KID_stuff" fall into the general KID category, with lines of varied depth. That you can pick out several more dubious ones is no argument here. If you want a general opening file: I picked the 8moves_GM.pgn file and earlier performed the same test with these peculiar adjudication conditions. Stockfish, as expected, came out clearly the best even under these adjudication conditions:

WILO:


   # PLAYER                       : RATING  ERROR    POINTS  PLAYED     (%)   CFS(next)
   1 Stockfish 260517 64 BMI2     : 3130.2   27.6     495.0     704    70.3      97    
   2 Komodo 11.01 64-bit          : 3089.4   28.6     322.0     525    61.3     100    
   3 Houdini 5.01 Pro x64-popc    : 3032.6   24.7     405.0     741    54.7     100    
   4 Gull 3 x64                   : 2948.5   28.9     223.0     571    39.1      55    
   5 Deep Shredder 13 x64         : 2945.6   29.3     224.0     543    41.3      75    
   6 Fritz 15                     : 2929.8   31.7     185.0     491    37.7      60    
   7 Andscacs 0.91b               : 2923.8   30.0     187.0     507    36.9     ---    

Normalized ELO:


   # PLAYER                       : RATING  ERROR    POINTS  PLAYED
   1 Stockfish 260517 64 BMI2     :  0.183  0.033    1943.0    3600  
   2 Komodo 11.01 64-bit          :  0.080  0.033    1859.5    3600      
   3 Houdini 5.01 Pro x64-popc    :  0.042  0.033    1834.5    3600    
   4 Deep Shredder 13 x64         : -0.068  0.033    1752.5    3600    
   5 Gull 3 x64                   : -0.088  0.033    1737.5    3600    
   6 Fritz 15                     : -0.091  0.033    1739.5    3600    
   7 Andscacs 0.91b               : -0.099  0.033    1733.5    3600

So, the overperformance of Komodo is mostly due to the KID lines in that opening suite of Dann Corbit.
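
To put a rough number on that, the Normalized ELO values from the KID suite (opening post) and from the 8moves_GM suite can be compared per engine. This is only a sketch: it treats the two runs as independent and adds the two 0.033 errors in quadrature, which is an assumption and not something computed by the rating tool; the dictionary and function names are made up for the example.

```python
from math import sqrt

ERR = 0.033  # error reported for every engine in both Normalized ELO tables

# Normalized ELO copied from the two tables (KID suite vs. 8moves_GM suite).
kid = {"Komodo 11.01": 0.261, "Stockfish 260517": 0.128, "Houdini 5.01": 0.002}
general = {"Komodo 11.01": 0.080, "Stockfish 260517": 0.183, "Houdini 5.01": 0.042}


def shift(engine: str):
    """Return (KID minus general, combined error) for one engine."""
    diff = kid[engine] - general[engine]
    err = sqrt(ERR ** 2 + ERR ** 2)   # assumes the two runs are independent
    return diff, err


for engine in kid:
    diff, err = shift(engine)
    print(f"{engine:18s} shift = {diff:+.3f} +/- {err:.3f}")
```

Under these assumptions Komodo's shift between the two suites is several times the combined error, while the shifts of Stockfish and Houdini stay within about twice that error.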
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: In response to the KID thread.

Post by Lyudmil Tsvetkov »

that is simply not true.

did you check all the positions?
because, even if hastily, I checked most.

and, as said, KID lines are widely interspersed with all kinds of different openings.

above positions are a very very tiny portion of non-KID ones.

I guess, if you want to do a sprt test with bounds 1-5 and 20 k games, you do not run 5 k with 1-3 bounds, the next 5 k with 0-6, the third 5 k with -1-4, and then the last 5 k with -3-0.

this simply does not make any sense at all.

Dann has provided a suite of 1000+ lines consisting of at least 15 different openings, following the rules of ECO codes.

if you want to do a real KID test, please try to get access to a KID only suite, and then research.

you can also try filtering out only the positions with KID code from above-mentioned suite, which will not be that many in the end.

and if you want real closed KID, then you should choose only positions with white pawn on f5 already/black on f4.
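
The last filter can be sketched in a few lines of Python without any chess library. The helper names are hypothetical, and the criterion is only the literal one from the sentence above (white pawn on f5 or black pawn on f4); it reads the piece-placement field of a FEN/EPD line and ignores everything else, including the forum's "[d]" diagram tag.

```python
def expand_rank(rank: str) -> str:
    """Turn one FEN rank such as '3PP3' into 8 characters, '.' for empty."""
    return "".join("." * int(ch) if ch.isdigit() else ch for ch in rank)


def piece_at(fen: str, square: str) -> str:
    """Return the piece letter on a square like 'f4', or '.' if empty."""
    placement = fen.replace("[d]", "").split()[0]
    ranks = placement.split("/")            # FEN lists rank 8 first
    rank_idx = 8 - int(square[1])
    file_idx = ord(square[0]) - ord("a")
    return expand_rank(ranks[rank_idx])[file_idx]


def has_advanced_f_pawn(fen: str) -> bool:
    """True if White has a pawn on f5 or Black has a pawn on f4."""
    return piece_at(fen, "f5") == "P" or piece_at(fen, "f4") == "p"


# One of the positions quoted earlier in the thread (called a Gruenfeld there).
sample = "[d]rnbqk2r/pp2ppbp/6p1/2p5/3PP3/2P2N2/P4PPP/R1BQKB1R w KQkq - 0 1"
print(has_advanced_f_pawn(sample))

# Filtering a whole suite would then be a one-liner over its lines, e.g.
# closed = [line for line in epd_lines if has_advanced_f_pawn(line)]
```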

one person that can certainly confirm SF outplays by far Komodo in similar structures is Bram (Mourik), he has been running his matches and regularly commenting some games, seems to have quite a good understanding of the matter.

still, there seems to be something fishy in your conditions, SF leading Komodo by too little even in wide-spread openings, with this extremely low TC and adjudication rules, maybe it is the way SF would like to draw most of the games with repetitions.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: In response to the KID thread.

Post by Lyudmil Tsvetkov »

again, you are talking of mostly KID lines. what kind of mostly KID lines are straight French, Gruenfeld, Caro-Kann, Sokolsky, etc?

there is a massive quantity of those in the suite, please check.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: In response to the KID thread.

Post by Laskos »

Lyudmil Tsvetkov wrote:again, you are talking of mostly KID lines. what kind of mostly KID lines are straight French, Gruenfeld, Caro-Kann, Sokolsky, etc?

there is a massive quantity of those in the suite, please check.
Most lines are derived from KID and transpose to KID. You seem to nitpick on irrelevant issues, without observing the bulk when it's inconvenient to you. What do you say about this KID EPD opening suite I myself built:
http://s000.tinyupload.com/?file_id=403 ... 6946166575

Preliminary results are almost identical to what I posted in the opening post. Again not KID? KID is only what you analyzed "better than engines" (your own claim)?

About "Greasy_KID_Stuff.epd" Dann Corbit provided and I used, better ask Dann. This seems to me a valid opening suite to test engines on KID positions.
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: In response to the KID thread.

Post by carldaman »

Laskos wrote:
70cp is serious advantage right from the opening. Much higher, and I will get mostly Draws by adjudication (20 moves), much lower, the advantage is too mild to be conclusive (besides that, many openings have 30-40cp advantages to start with).
70cp is a serious advantage IF that is a correct engine evaluation. We should not assume the engines are correct, particularly in closed games, as we're dealing with the weakest phase of their game (opening + closed game).

In the classical KID, that evaluation score is likely not correct. Strong human players who know this and understand the KID have continued to use this opening even in the 'computer age', whether in corr chess or OTB. Even a top engine like SF can score very well from the Black side, in spite of its misevaluation [overcoming its own bad eval of the opening].

Regards,
CL
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: In response to the KID thread.

Post by Laskos »

Lyudmil Tsvetkov wrote:
that is simply not true.

did you check all the positions?
because, even if hastily, I checked most.

and, as said, KID lines are widely interspersed with all kinds of different openings.

above positions are a very very tiny portion of non-KID ones.
I counted 823 out of 1245 positions as KID (ECO E60-E99), and 200+ related or transposable to KID. So, the file of Dann Corbit is very relevant for testing on KID positions.