Chessprogams with the most chessknowing

bob · Post by **bob** » Fri Dec 19, 2014 11:06 pm

Adam Hair wrote:

Laskos wrote:
Laskos wrote:
Henryval wrote: Which chess program(s) have the most chess knowledge implemented? And is there more than a little evidence for it out there in the chess community?

I wonder why we don't have any ranking about this issue.
I am not sure how to define this. Fixed depth play at depth=1 shows Komodo as the most knowledgeable engine. Generally, modern engines seem to have better knowledge than older ones.
I did some fast testing with fixed depth=1 for several top modern engines and engines reputed to have much knowledge. I included the ancient Shredder 6PB (year 2002 IIRC), which also claimed to have lots of knowledge at that time.
RR in cutechess-cli:
Rank Name                        ELO   Games   Score   Draws
   1 Komodo 8                     92    1000     63%     18%
   2 Houdini 4                    78    1000     61%     26%
   3 Hannibal 1.4                 56    1000     58%     23%
   4 SF 14122014                  43    1000     56%     20%
   5 Hiarcs 14                   -22    1000     47%     16%
   6 Shredder 6PB (2002)        -302    1000     15%     14%
Finished match
Modern engines seem to prevail, especially those whose eval passed at some time through Larry's hands.
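
The ELO column above is consistent with the score percentages under the usual logistic rating model. Here is a minimal sketch of that conversion, assuming the standard formula; the function name `elo_from_score` is made up for illustration, and rating tools may use different models or anchoring, so small deviations from the table are expected.

```python
import math

def elo_from_score(score):
    """Elo difference implied by an expected score in (0, 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

# Score fractions taken from the round-robin table above.
for name, pct in [("Komodo 8", 0.63), ("Houdini 4", 0.61), ("Shredder 6PB", 0.15)]:
    print(f"{name}: {elo_from_score(pct):+.0f}")
```

Because the percentages in the table are rounded, the recomputed values can be off by a point or so.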

I think that at least part of the difference is due to Shredder not searching as many nodes as Komodo and Houdini.

I tried the same experiment this afternoon, primarily because I have been thinking about how to measure which engines handle various material imbalances better. Here are my results:


   # PLAYER              : RATING    POINTS  PLAYED    (%)
   1 Houdini 4           :   78.0    1865.5    3029   61.6%
   2 Equinox 3.3         :   76.7    1862.0    3040   61.2%
   3 Komodo 8            :   52.1    1727.0    3000   57.6%
   4 Critter 1.6a        :   37.9    1671.0    3000   55.7%
   5 Gull 3              :   18.9    1589.5    3040   52.3%
   6 Stockfish 141213    :   14.8    1564.0    3000   52.1%
   7 Gaviota 1.0         :  -89.3    1093.0    3000   36.4%
   8 Fire 4              : -189.1     697.0    3029   23.0%

Then I examined what Houdini and Fire do when made to search 1 ply. Houdini searches many more nodes on average. Here is an example:

Houdini


1419023568.383 GUI->Adapter: position startpos moves e2e4 e7e5 g1f3 g8f6 d2d4 f6e4 f1d3 d7d5 f3e5 b8d7 e5d7 c8d7 e1g1 f8d6 c2c4 c7c6 c4d5 c6d5 b1c3 e4c3 b2c3 e8g8 d1h5 f7f5 h5f3 g8h8 c1d2 d8h4 g2g3 h4g4 g1g2 a8c8 a1b1 b7b6 f1e1 f5f4 f3g4 d7g4 a2a4 g7g6 g2g1 g4d7 d3a6 c8b8 a6b5 b8d8 b5d7 d8d7 b1b5 d6c7 d2f4 c7f4 g3f4 f8f4 e1e5
1419023568.383 Adapter->Engine: position startpos moves e2e4 e7e5 g1f3 g8f6 d2d4 f6e4 f1d3 d7d5 f3e5 b8d7 e5d7 c8d7 e1g1 f8d6 c2c4 c7c6 c4d5 c6d5 b1c3 e4c3 b2c3 e8g8 d1h5 f7f5 h5f3 g8h8 c1d2 d8h4 g2g3 h4g4 g1g2 a8c8 a1b1 b7b6 f1e1 f5f4 f3g4 d7g4 a2a4 g7g6 g2g1 g4d7 d3a6 c8b8 a6b5 b8d8 b5d7 d8d7 b1b5 d6c7 d2f4 c7f4 g3f4 f8f4 e1e5
1419023568.383 GUI->Adapter: go depth 1
1419023568.383 Adapter->Engine: go depth 1
1419023568.393 Engine->Adapter: info multipv 1 depth 1 seldepth 8 score cp 8 time 0 nodes 565 nps 0 tbhits 0 hashfull 0 pv d7c7
1419023568.393 Adapter->GUI: info multipv 1 depth 1 seldepth 8 score cp 8 time 0 nodes 565 nps 0 tbhits 0 hashfull 0 pv d7c7
1419023568.393 Engine->Adapter: bestmove d7c7 ponder 0000
1419023568.393 Adapter->GUI: bestmove d7c7 ponder 0000

Fire


1419023696.997 GUI->Adapter: position startpos moves e2e4 e7e5 g1f3 g8f6 d2d4 f6e4 f1d3 d7d5 f3e5 b8d7 e5d7 c8d7 e1g1 f8d6 c2c4 c7c6 c4d5 c6d5 b1c3 e4c3 b2c3 e8g8 d1h5 f7f5 h5f3 g8h8 c1d2 d8h4 g2g3 h4g4 g1g2 a8c8 a1b1 b7b6 f1e1 f5f4 f3g4 d7g4 a2a4 g7g6 g2g1 g4d7 d3a6 c8b8 a6b5 b8d8 b5d7 d8d7 b1b5 d6c7 d2f4 c7f4 g3f4 f8f4 e1e5
1419023696.997 Adapter->Engine: position startpos moves e2e4 e7e5 g1f3 g8f6 d2d4 f6e4 f1d3 d7d5 f3e5 b8d7 e5d7 c8d7 e1g1 f8d6 c2c4 c7c6 c4d5 c6d5 b1c3 e4c3 b2c3 e8g8 d1h5 f7f5 h5f3 g8h8 c1d2 d8h4 g2g3 h4g4 g1g2 a8c8 a1b1 b7b6 f1e1 f5f4 f3g4 d7g4 a2a4 g7g6 g2g1 g4d7 d3a6 c8b8 a6b5 b8d8 b5d7 d8d7 b1b5 d6c7 d2f4 c7f4 g3f4 f8f4 e1e5
1419023696.997 GUI->Adapter: go depth 1
1419023696.997 Adapter->Engine: go depth 1
1419023696.997 Engine->Adapter: info time 1 nodes 0 nps 0 tbhits 0 depth 1 score cp 0 pv h8g8
1419023696.997 Adapter->GUI: info time 1 nodes 0 nps 0 tbhits 0 depth 1 score cp 0 pv h8g8
1419023696.997 Engine->Adapter: info time 1 nodes 0 nps 0 tbhits 0 depth 1 score cp 4 pv h8g7
1419023696.997 Adapter->GUI: info time 1 nodes 0 nps 0 tbhits 0 depth 1 score cp 4 pv h8g7
1419023696.998 Engine->Adapter: info time 1 nodes 41 nps 41000 tbhits 0 depth 1 score cp 8 pv f4f3
1419023696.998 Adapter->GUI: info time 1 nodes 41 nps 41000 tbhits 0 depth 1 score cp 8 pv f4f3
1419023696.998 Engine->Adapter: bestmove f4f3
1419023696.998 Adapter->GUI: bestmove f4f3

I suspect that part of the reason that Shredder is scoring much worse in your test is because it is not searching as many nodes.
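
For anyone wanting to repeat the node-count comparison over a whole log, here is a rough sketch of pulling the numeric fields out of UCI `info` lines like the ones above. The helper name `parse_uci_info` and the set of keys it recognizes are my own choices for illustration, not part of any GUI or adapter.

```python
# Numeric fields that appear in the log excerpts above.
NUMERIC_KEYS = {"multipv", "depth", "seldepth", "time", "nodes", "nps", "tbhits", "hashfull"}

def parse_uci_info(line):
    """Return the numeric fields of a UCI 'info' line as a dict (empty if not an info line)."""
    tokens = line.split()
    if "info" not in tokens:
        return {}
    tokens = tokens[tokens.index("info") + 1:]
    fields = {}
    i = 0
    while i < len(tokens):
        key = tokens[i]
        if key == "pv":
            break  # everything after 'pv' is a move list
        if key in NUMERIC_KEYS and i + 1 < len(tokens):
            fields[key] = int(tokens[i + 1])
            i += 2
        else:
            i += 1  # skip tokens we do not handle, e.g. 'score cp 8'
    return fields

houdini = "1419023568.393 Engine->Adapter: info multipv 1 depth 1 seldepth 8 score cp 8 time 0 nodes 565 nps 0 tbhits 0 hashfull 0 pv d7c7"
fire = "1419023696.998 Engine->Adapter: info time 1 nodes 41 nps 41000 tbhits 0 depth 1 score cp 8 pv f4f3"

print(parse_uci_info(houdini)["nodes"], parse_uci_info(fire)["nodes"])
```

Taking the last `info` line before each `bestmove` and averaging `nodes` per engine would give the per-move node counts being compared here.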

Searching more or fewer nodes is the likely explanation, but perhaps not for the reason you suspect. Programs treat a single ply differently. I have already listed much of this in another post in this thread, but some programs extend when giving check rather than when evading check. In a one-ply search, the former will search 2 plies deep while the latter will not. Some programs include checks in the q-search, some do not. Some have a sort of "threat search" between search and q-search, some do not. Some do singular extensions at the root, some do not. It is not just a matter of "set depth = 1" and play a game; there are too many potential differences. See my other post for more details.
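
To make the point concrete, here is a toy sketch of how the same "depth 1" request visits a different number of nodes depending on two of the conventions mentioned above. Everything here is hypothetical: the `Node` class, the hand-built tree, and the two switches (`extend_checks`, `checks_in_q`) are illustrative only and do not reproduce any real engine's search.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    gives_check: bool = False  # the move leading to this node gave check
    is_capture: bool = False   # the move leading to this node was a capture
    children: list = field(default_factory=list)

def qsearch(node, checks_in_q):
    """Toy quiescence: follow captures, and checks only if checks_in_q is set."""
    visited = 1
    for child in node.children:
        if child.is_capture or (checks_in_q and child.gives_check):
            visited += qsearch(child, checks_in_q)
    return visited

def search(node, depth, extend_checks, checks_in_q):
    """Toy fixed-depth search that returns the number of nodes visited."""
    if depth <= 0:
        return qsearch(node, checks_in_q)
    visited = 1
    for child in node.children:
        child_depth = depth - 1
        if extend_checks and child.gives_check:
            child_depth += 1  # a checking move costs no depth
        visited += search(child, child_depth, extend_checks, checks_in_q)
    return visited

# Hand-built tree: three root moves (quiet, capture, check), each with a few replies.
quiet = Node(children=[Node(is_capture=True), Node(), Node(gives_check=True)])
capture = Node(is_capture=True, children=[Node(is_capture=True, children=[Node()])])
check = Node(gives_check=True,
             children=[Node(children=[Node(is_capture=True), Node()]),
                       Node(is_capture=True)])
root = Node(children=[quiet, capture, check])

for extend in (False, True):
    for q_checks in (False, True):
        nodes = search(root, 1, extend, q_checks)
        print(f"extend_checks={extend} checks_in_q={q_checks}: {nodes} nodes")
```

Even on this tiny tree the four combinations give four different node counts, which is the same effect seen in the Houdini and Fire logs on a much larger scale.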

Steve Maughan · Post by **Steve Maughan** » Fri Dec 19, 2014 11:32 pm

It is an interesting question.

I think we can peel the onion a little more and ask the following questions:

1. Which engine has the most knowledge of special positions (like the ones on this thread)

2. Which engine has the most parameters / factors in its evaluation function

3. Which engine has the most accurate static evaluation function

Note that an engine which tops the league in question 2 will not necessarily rank highly in question 3, since the coefficients applied to each factor may be incorrect. And as the number of factors increases and they evaluate more subtle elements of a position, it becomes more difficult to establish the right value for each parameter.

And of course the engine which tops the league in question 3 may not be the strongest engine. Apart from search efficiency, the time required to perform the evaluation may be so long that the engine isn't able to search as deeply as others.
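
A small sketch of the point about questions 2 and 3, assuming a purely linear evaluation (weight times feature, summed). The feature names, weights, and the reference score are all invented for illustration; no real engine's terms or values are implied.

```python
def static_eval(features, weights):
    """Linear evaluation in centipawns: sum of weight * feature value."""
    return sum(weights.get(name, 0) * value for name, value in features.items())

# Hypothetical feature values for one position (white minus black).
features = {"material": 1, "passed_pawn": 1, "king_safety": -1}

# Engine A: one factor, sensibly tuned. Engine B: more factors, one badly tuned.
weights_a = {"material": 100}
weights_b = {"material": 100, "passed_pawn": 20, "king_safety": 150}

reference = 110  # assumed "true" value of the position, for illustration only

for name, weights in [("A (1 factor)", weights_a), ("B (3 factors)", weights_b)]:
    score = static_eval(features, weights)
    print(f"{name}: eval {score}, error {abs(score - reference)}")
```

With these made-up numbers, the engine with more factors ends up further from the reference because a single mistuned coefficient dominates the sum.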

Steve

Laskos · Post by **Laskos** » Sat Dec 20, 2014 12:19 am

Adam Hair wrote:
I suspect that part of the reason that Shredder is scoring much worse in your test is because it is not searching as many nodes.

I fed the same startpos to Shredder 6 PB, and it searched 63 nodes. That only partly explains the difference, since at fixed nodes one would also have to account for the fact that Shredder 6 is ~3 times slower than Houdini (or Fire). It's a pity that I can't seem to run fixed-node games with many engines.

Laskos · Post by **Laskos** » Sat Dec 20, 2014 12:23 am

bob wrote:
This doesn't work very well. depth=1 does NOT mean the same thing to everyone. To some it is simply a 1 ply search followed by q-search. Some allow check extensions before going to q-search. Some allow check extensions AFTER q-search. Some have a couple of plies of threats between basic search and start of q-search. Some extend at the root if in check, some don't. Some extend if giving check at the root, some don't. Etc. I played a match Crafty vs Stockfish a while back to compare evaluations. It took quite a bit of work to make them search the same basic tree space.

I ended up having to modify code in both programs to reach a consistent meaning for "set depth = 1". After all the work was done, the net conclusion from playing 100K games was "no statistical difference".

Yes, correct, but it seems hard to measure the static eval comprehensively without messing with the source, where available.

Ferdy · Post by **Ferdy** » Sat Dec 20, 2014 12:45 am

Henryval wrote: Which chess program(s) have the most chess knowledge implemented? And is there more than a little evidence for it out there in the chess community?

I wonder why we don't have any ranking about this issue.

Another position: White can only draw, so a score closer to zero is better. Fire 4 is impressive so far.
Hash 64 MB, 1 thread, time 1 sec.
[d]7k/R7/7P/6K1/8/8/2b5/8 w - - 0 1


 1 id name Fire 4 x64 (time 749, score 23)
 2 id name Critter 1.6a 64-bit (time 860, score 51)
 3 id name Stockfish 131214 64 POPCNT (time 859, score 54)
 4 id name Houdini 4 x64 (time 1000, score 84)
 5 id name Strelka 6 w32 (time 1000, score 103)
 6 id name Texel 1.04 64-bit (time 998, score 216)
 7 id name Komodo 6 64-bit  (time 826, score 279)
 8 id name Booot 5.2.0(64) (time 16, score 319)
 9 id name HIARCS 14 WCSC (time 874, score 332)
10 id name Equinox 3.30 x64mp (time 890, score 371)
11 id name Naum 4.6 (time 811, score 387)
12 id name spark-1.0 (time 1014, score 392)
13 id name Nebula 2.0 (time 812, score 395)
14 id name cheng4 0.36c (time 914, score 417)
15 id name Maverick 0.51 x64 (time 702, score 437)
16 id name DiscoCheck 5.2.1 (time 795, score 442)
17 id name Atlas 3.70em x64 (time 795, score 447)
18 id name Fruit reloaded 2.1 (time 920, score 450)
19 id name Andscacs 0.71 (time 459, score 459)
20 id name Rodent 1.6 (build 6) (time 702, score 484)
21 id name Rybka 2.3.2a mp  (time 671, score 488)
22 id name Senpai 1.0 (time 967, score 491)
23 id name Arasan 17.4 (time 1014, score 496)
24 id name Protector 1.7.0 (time 936, score 500)
25 id name Ruffian 1.0.5 (time 820, score 503)
26 id name Bobcat 3.25 (time 812, score 505)
27 id name GreKo 12.1 (time 687, score 510)
28 id name Gaviota v1.0 (time 936, score 514)
29 id name Vajolet2 1.45 (time 877, score 526)
30 id name Hannibal 1.4x64 (time 1014, score 529)
31 id name Octochess revision 5190 (time 577, score 541)
32 id name Deuterium v14.3.34.130 (time 765, score 561)
33 id name DisasterArea-1.54 (time 999, score 561)
34 id name GNU Chess 5.60-64 (time 873, score 575)
35 id name Quazar 0.4 x64 (time 1046, score 593)
36 id name Rhetoric 1.4.1 x64 (time 639, score 676)
37 id name iCE 2.0 v2240 x64/popcnt (time 203, score 790)
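
A ranking like the one above can be reproduced from raw result lines by sorting on the absolute score, since the position is a draw and closer to zero is better. This is only a sketch: the regular expression assumes the exact "id name ... (time T, score S)" layout shown here, and the names `parse_result` and `rank_by_closeness_to_zero` are made up for the example.

```python
import re

# Matches e.g. " 8 id name Booot 5.2.0(64) (time 16, score 319)".
LINE_RE = re.compile(
    r"id name (?P<name>.+?)\s*\(time (?P<time>\d+), score (?P<score>-?\d+)\)\s*$"
)

def parse_result(line):
    """Return (engine name, time, score) or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return match.group("name"), int(match.group("time")), int(match.group("score"))

def rank_by_closeness_to_zero(lines):
    """Sort parsed results so the score nearest to zero comes first."""
    results = [r for r in map(parse_result, lines) if r is not None]
    return sorted(results, key=lambda r: abs(r[2]))

sample = [
    " 8 id name Booot 5.2.0(64) (time 16, score 319)",
    " 4 id name Houdini 4 x64 (time 1000, score 84)",
    " 1 id name Fire 4 x64 (time 749, score 23)",
]

for rank, (name, time_used, score) in enumerate(rank_by_closeness_to_zero(sample), start=1):
    print(rank, name, score)
```

Using `abs()` means the same function works for positions where the reported scores are negative.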

I am using a very nice book.
100 Endgames You Must Know (3rd edition) by Jesus de la Villa

kranium · Post by **kranium** » Sat Dec 20, 2014 12:55 am

I think a fast/efficient search and high NPS can often overcome a weak or minimal eval function at ultra-fast/very fast TCs.
But as TCs increase, engines with expensive (large) eval functions will reach sufficient depth to equalize...
and eventually, at a certain point (as TC increases) any previous advantage in NPS and depth is mitigated or nullified.

At that point, the program with the best (most accurate) eval should prove strongest.
This has been clearly shown by Komodo's development and evolution over the past few years.

Larry Kaufman is the expert in data-mining and quantifying statistical results into chess programming friendly terms and tables.

If we look at his early 1999 publication on material imbalances:
https://home.comcast.net/~danheisman/Ar ... alance.htm
his current contributions to Komodo are surely broader, vastly matured, and superior.

It makes sense to me that today, the strongest chess engine at LTC contains the most (complete and accurate) 'chess knowledge'.

Can there be any answer to this question other than Komodo 8 at this time?
http://tcec.chessdom.com/live.php

cdani · Post by **cdani** » Sat Dec 20, 2014 1:02 am

kranium wrote: I think a fast/efficient search and high NPS can often overcome a weak or minimal eval function at ultra-fast/very fast TCs.
But as TCs increase, engines with expensive (large) eval functions will reach sufficient depth to equalize...
and eventually, at a certain point (as TC increases) any previous advantage in NPS and depth is mitigated or nullified.

At that point, the program with the best (most accurate) eval should prove strongest.
This has been clearly shown by Komodo's development and evolution over the past few years.

Larry Kaufman is the expert in data-mining and quantifying statistical results into chess programming friendly terms and tables.

If we look at his early 1999 publication on material imbalances:
https://home.comcast.net/~danheisman/Ar ... alance.htm
his current contributions to Komodo are surely broader, vastly matured, and superior.

It makes sense to me that today, the strongest chess engine at LTC contains the most (complete and accurate) 'chess knowledge'.

Can there be any answer to this question other than Komodo 8 at this time?
http://tcec.chessdom.com/live.php

I'm sure that's the case too. Specialized evals are the future.

Ferdy · Post by **Ferdy** » Sat Dec 20, 2014 1:11 am

kranium wrote: I think a fast/efficient search and high NPS can often overcome a weak or minimal eval function at ultra-fast/very fast TCs.
But as TCs increase, engines with expensive (large) eval functions will reach sufficient depth to equalize...
and eventually, at a certain point (as TC increases) any previous advantage in NPS and depth is mitigated or nullified.

At that point, the program with the best (most accurate) eval should prove strongest.
This has been clearly shown by Komodo's development and evolution over the past few years.

Larry Kaufman is the expert in data-mining and quantifying statistical results into chess programming friendly terms and tables.

If we look at his early 1999 publication on material imbalances:
https://home.comcast.net/~danheisman/Ar ... alance.htm
his current contributions to Komodo are surely broader, vastly matured, and superior.

It makes sense to me that today, the strongest chess engine at LTC contains the most (complete and accurate) 'chess knowledge'.

Can there be any answer to this question other than Komodo 8 at this time?
http://tcec.chessdom.com/live.php

There are positions that an engine will not be able to solve even if given more time. Engines that have good estimation of the positions are usually search-efficient.

kranium · Post by **kranium** » Sat Dec 20, 2014 1:21 am

Ferdy wrote: There are positions that an engine will not be able to solve even if given more time. Engines that have good estimation of the positions are usually search-efficient.

Agreed Ferdinand...
I just believe that what separates Komodo from the rest of the pack at this date and time is its eval (and it's just getting better and better).

When Komodo wins, it's probably not out-searching SF.

Ferdy · Post by **Ferdy** » Sat Dec 20, 2014 1:23 am

Henryval wrote: Which chess program(s) have the most chess knowledge implemented? And is there more than a little evidence for it out there in the chess community?

I wonder why we don't have any ranking about this issue.

One more: White is fine here, so again a score closer to zero is better. Fire tops the list again.
[d]8/8/5k2/8/8/4qBB1/6K1/8 w - - 0 1


 1 id name Fire 4 x64 (time 842, score -12)
 2 id name Komodo 6 64-bit  (time 748, score -32)
 3 id name Texel 1.04 64-bit (time 936, score -72)
 4 id name Booot 5.2.0(64) (time 15, score -147)
 5 id name Octochess revision 5190 (time 639, score -162)
 6 id name Atlas 3.70em x64 (time 859, score -178)
 7 id name DiscoCheck 5.2.1 (time 951, score -182)
 8 id name Maverick 0.51 x64 (time 717, score -198)
 9 id name Ruffian 1.0.5 (time 780, score -207)
10 id name Andscacs 0.71 (time 942, score -221)
11 id name Houdini 4 x64 (time 1000, score -222)
12 id name Strelka 6 w32 (time 1000, score -225)
13 id name Equinox 3.30 x64mp (time 951, score -230)
14 id name Critter 1.6a 64-bit (time 807, score -236)
15 id name Naum 4.6 (time 796, score -256)
16 id name Fruit reloaded 2.1 (time 998, score -285)
17 id name Arasan 17.4 (time 1031, score -296)
18 id name Nebula 2.0 (time 951, score -307)
19 id name Rodent 1.6 (build 6) (time 827, score -308)
20 id name HIARCS 14 WCSC (time 889, score -311)
21 id name Rybka 2.3.2a mp  (time 859, score -324)
22 id name Deuterium v14.3.34.130 (time 905, score -335)
23 id name Stockfish 131214 64 POPCNT (time 921, score -337)
24 id name Protector 1.7.0 (time 936, score -349)
25 id name DisasterArea-1.54 (time 936, score -362)
26 id name cheng4 0.36c (time 724, score -369)
27 id name Hannibal 1.4x64 (time 655, score -379)
28 id name Vajolet2 1.45 (time 784, score -390)
29 id name Senpai 1.0 (time 889, score -391)
30 id name GreKo 12.1 (time 983, score -397)
31 id name Gaviota v1.0 (time 764, score -400)
32 id name Bobcat 3.25 (time 905, score -413)
33 id name GNU Chess 5.60-64 (time 876, score -430)
34 id name iCE 2.0 v2240 x64/popcnt (time 343, score -438)
35 id name Rhetoric 1.4.1 x64 (time 359, score -475)
36 id name Quazar 0.4 x64 (time 1061, score -484)
37 id name spark-1.0 (time 1014, score -568)
