In response to Uri's thread about positional understanding


Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

In response to Uri's thread about positional understanding

Post by Laskos »

In this thread:
http://talkchess.com/forum/viewtopic.php?t=66530
Uri showed some Stockfish results in the opening/early middlegame, but only for one position.

In the past, I tried to build a positional test suite for openings, with no tactical shots. I used human and engine databases (CB MegaBase 2017, CB Live Book, the online Chess Tempo database), the Noomen.ctg opening book, and analysis from several engines.
My latest version, Openings200beta07.epd, is listed here:
http://talkchess.com/forum/viewtopic.ph ... t&start=49

Although Dann Corbit showed that dozens of the positions are not solved even by corroborated engines at long time control, I am fairly satisfied with this positional opening suite of 200 positions. I estimate that, of the 200, about 15 have wrong solutions and about 15 are too hard for any current engine at any time control.

I tested in Polyglot with these settings:

Code: Select all

-min-depth 3 -max-depth 99 -min-time 0.1 -max-time 5 -depth-delta 2
So the maximum allowed time is 5 s per position. The engines run on 4 threads with 512 MB hash, on a modern 3.8 GHz i7.
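
Roughly, epd-test counts a position as solved once the engine's best move has matched the EPD's "bm" move for depth-delta consecutive iterations, within the depth and time bounds above. Here is a minimal Python sketch of that acceptance rule as I understand it (an assumption on my part, not PolyGlot's actual code):

Code: Select all

def solved(iterations, bm, min_depth=3, max_depth=99,
           min_time=0.1, max_time=5.0, depth_delta=2):
    # iterations: (depth, elapsed_seconds, best_move) tuples from the
    # engine's iterative deepening, in order
    streak = 0
    for depth, elapsed, best in iterations:
        if depth > max_depth or elapsed > max_time:
            break  # bounds exceeded, position counts as failed
        streak = streak + 1 if best == bm else 0
        # accept only past the minimum depth/time, with a stable best move
        if depth >= min_depth and elapsed >= min_time and streak >= depth_delta:
            return True
    return False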
I used a 1000-position suite (these 200 openings repeated 5 times) to dampen statistical noise. By jackknifing, I get that one standard deviation of the difference in results is about 9-10 points out of 1000.
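
For the curious, the jackknife estimate itself takes only a few lines. A minimal sketch, assuming per-position 0/1 results for two engines over the same 1000 positions (the function name is mine):

Code: Select all

import math

def jackknife_sd(a, b):
    # a, b: lists of per-position results, 1 = solved, 0 = failed
    n = len(a)
    d = [x - y for x, y in zip(a, b)]   # per-position score differences
    total = sum(d)
    loo = [(total - di) / (n - 1) for di in d]   # leave-one-out means
    m = sum(loo) / n
    var_mean = (n - 1) / n * sum((x - m) ** 2 for x in loo)
    return n * math.sqrt(var_mean)   # scale SD of the mean to points out of n

Note that the two engines' results are correlated (both solve the same easy positions and fail the same hard ones), which is what brings the SD down to 9-10 points instead of the roughly 20 that an independence assumption would give at these solve rates.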

The results:

Code: Select all

Komodo 11.2.2
score=666/1000 [averages on correct positions: depth=13.2 time=0.71 nodes=3853146]

Houdini 6.03
score=656/1000 [averages on correct positions: depth=14.3 time=0.87 nodes=6708492]

Stockfish 9
score=641/1000 [averages on correct positions: depth=14.0 time=0.74 nodes=4793389]

Andscacs 0.93
score=598/1000 [averages on correct positions: depth=12.2 time=0.69 nodes=3202933]

Shredder 13
score=573/1000 [averages on correct positions: depth=14.3 time=0.79 nodes=4678511]
Although the battle for first place is tight, Komodo 11.2.2 almost certainly performs better here than Stockfish 9. Also, Andscacs 0.93 performs significantly better than Shredder 13.
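
To put a number on "almost certainly": the Komodo vs Stockfish 9 gap is 25 points against a standard deviation of about 9-10, i.e. roughly 2.6 sigma. A quick back-of-the-envelope check under a normal approximation (9.5 is just the midpoint of my 9-10 estimate):

Code: Select all

import math

diff = 666 - 641   # Komodo 11.2.2 vs Stockfish 9, in points
sd = 9.5           # jackknifed SD of the difference
z = diff / sd
p = 0.5 * (1 - math.erf(z / math.sqrt(2)))   # one-sided p-value
print(f"z = {z:.2f}, one-sided p = {p:.4f}")   # z = 2.63, p = 0.0043

The Andscacs vs Shredder gap is also 25 points, hence similarly significant.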
Werewolf
Posts: 1796
Joined: Thu Sep 18, 2008 10:24 pm

Re: In response to Uri's thread about positional understanding

Post by Werewolf »

What was the score of Stockfish 8?

I'd love to see results from others, such as Giraffe, HIARCS, Junior, etc., but SF8 would be very interesting.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: In response to Uri's thread about positional understanding

Post by Laskos »

Werewolf wrote:What was the score of Stockfish 8?

I'd love to see results from others, such as Giraffe, HIARCS, Junior, etc., but SF8 would be very interesting.
Here it is with Stockfish 8 included. I might test some other engines, if they comply with Polyglot's commands.

Code: Select all

Komodo 11.2.2 
score=666/1000 [averages on correct positions: depth=13.2 time=0.71 nodes=3853146] 

Houdini 6.03 
score=656/1000 [averages on correct positions: depth=14.3 time=0.87 nodes=6708492] 

Stockfish 9 
score=641/1000 [averages on correct positions: depth=14.0 time=0.74 nodes=4793389] 

Stockfish 8
score=628/1000 [averages on correct positions: depth=13.9 time=0.80 nodes=4802227]

Andscacs 0.93 
score=598/1000 [averages on correct positions: depth=12.2 time=0.69 nodes=3202933] 

Shredder 13 
score=573/1000 [averages on correct positions: depth=14.3 time=0.79 nodes=4678511]
Dicaste
Posts: 142
Joined: Mon Apr 16, 2012 7:23 pm
Location: Istanbul, TURKEY

Re: In response to Uri's thread about positional understanding

Post by Dicaste »

RomiChess would be cool too.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: In response to Uri's thread about positional understanding

Post by Laskos »

Dicaste wrote:RomiChess would be cool too.
I got what seem to be the latest RomiChess and Giraffe. I am not sure they follow the Polyglot commands to the letter, and they run on only one thread, so their results are deflated relative to the other engines using 4 threads. I also included the latest Texel.

Code: Select all

Komodo 11.2.2 
score=666/1000 [averages on correct positions: depth=13.2 time=0.71 nodes=3853146] 

Houdini 6.03 
score=656/1000 [averages on correct positions: depth=14.3 time=0.87 nodes=6708492] 

Stockfish 9 
score=641/1000 [averages on correct positions: depth=14.0 time=0.74 nodes=4793389] 

Stockfish 8 
score=628/1000 [averages on correct positions: depth=13.9 time=0.80 nodes=4802227] 

Andscacs 0.93 
score=598/1000 [averages on correct positions: depth=12.2 time=0.69 nodes=3202933] 

Shredder 13 
score=573/1000 [averages on correct positions: depth=14.3 time=0.79 nodes=4678511]

Texel 1.08a8
score=489/1000 [averages on correct positions: depth=10.3 time=0.53 nodes=3053861]

Giraffe
score=410/1000 [averages on correct positions: depth=10.0 time=0.68 nodes=167994]

RomiChessP3n default
score=392/1000 [averages on correct positions: depth=11.7 time=0.88 nodes=4934412]
matejst
Posts: 364
Joined: Mon May 14, 2007 8:20 pm
Full name: Boban Stanojević

Re: In response to Uri's thread about positional understanding

Post by matejst »

Thanks, Kai. Very interesting testing, as always.

Could you say which version of Giraffe you used?

Also, there are a few engines that I believe play a good positional brand of chess; if you could test engines like Wasp and iCE, I would be very grateful.

Finally, did you test the engines' understanding of endgames? I have noticed a trend of removing endgame knowledge lately, and I would be very interested in your findings.
zenpawn
Posts: 349
Joined: Sat Aug 06, 2016 8:31 pm
Location: United States

Re: In response to Uri's thread about positional understanding

Post by zenpawn »

Did you end up taking any of Dann's alternate (better?) best moves into account? (The other thread ends after his findings.) If so, do you have an updated suite? Thanks.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: In response to Uri's thread about positional understanding

Post by Laskos »

zenpawn wrote:Did you end up taking any of Dann's alternate (better?) best moves into account? (The other thread ends after his findings.) If so, do you have an updated suite? Thanks.
I believe I have a beta08 suite somewhere that takes some inspiration from Dann's analysis, but I didn't like it. I believe 69 is far too high a number of wrong proposed solutions: each of the top 3 engines solves 160+ of the 200 positions at some 5 min/move (not all the same positions for each engine). Generally, I do not trust computer analysis too much in openings; for the suite I mostly used large databases of human and computer games, which gave a reasonable statistic of outcomes. That seems more reliable in this case, so I have kept beta07 as my reference for now.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: In response to Uri's thread about positional understanding

Post by Laskos »

matejst wrote:Thanks, Kai. Very interesting testing, as always.

Could you say which version of Giraffe you used?

Also, there are a few engines that I believe play a good positional brand of chess; if you could test engines like Wasp and iCE, I would be very grateful.

Finally, did you test the engines' understanding of endgames? I have noticed a trend of removing endgame knowledge lately, and I would be very interested in your findings.
Giraffe_161023_x64. Is this the latest one? I don't know of a newer one.

I will try to add some other results.

I had some endgame results in the past, but I believe I lost them. Some tests would be easy to perform, as I have the 6-men Syzygy tablebases on an SSD and many easy and hard suites of 5- and 6-men positions.
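
The scoring for such endgame suites would be straightforward. A minimal sketch using the python-chess Syzygy bindings, assuming the tables sit in ./syzygy and that a move counts as correct when it preserves the tablebase win/draw/loss value (both assumptions are mine, not how the past tests were actually run):

Code: Select all

import chess
import chess.syzygy

def move_is_optimal(fen, move_uci, tb_path="./syzygy"):
    # score a move as correct if it preserves the WDL value (5-/6-men)
    board = chess.Board(fen)
    with chess.syzygy.open_tablebase(tb_path) as tb:
        before = tb.probe_wdl(board)    # from the mover's point of view
        board.push(chess.Move.from_uci(move_uci))
        after = -tb.probe_wdl(board)    # flip back to the mover's view
    return after == before

A stricter variant could demand DTZ-optimal moves via probe_dtz instead.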
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: In response to Uri's thread about positional understanding

Post by carldaman »

Laskos wrote:
matejst wrote:Thanks, Kai. Very interesting testing, as always.

Could you say which version of Giraffe you used?

Also, there are a few engines that I believe play a good positional brand of chess; if you could test engines like Wasp and iCE, I would be very grateful.

Finally, did you test the engines' understanding of endgames? I have noticed a trend of removing endgame knowledge lately, and I would be very interested in your findings.
Giraffe_161023_x64. Is this the latest one? I don't know of a newer one.

I will try to add some other results.

I had some endgame results in the past, but I believe I lost them. Some tests would be easy to perform, as I have the 6-men Syzygy tablebases on an SSD and many easy and hard suites of 5- and 6-men positions.
Giraffe_161023_x64, although the latest release, is known to be weak/buggy. The strongest Giraffe is the one from 20150908.

CL