Ten years of Computer Chess revisited

Laskos
Posts: 8741
Joined: Wed Jul 26, 2006 8:21 pm

Post by Laskos » Fri Mar 13, 2015 2:49 pm

Paralleling a two-and-a-half-year-old thread:
http://www.talkchess.com/forum/viewtopic.php?t=45902
I took Shredder 9 UCI, which appeared almost exactly 10 years ago, as the baseline. Chessbase article of the 1st of March 2005 on the new Shredder:
Shredder 9 on top of the world
http://en.chessbase.com/post/shredder-9 ... -the-world

The SSDF Rating List - July 29, 2005
http://www.chessusa.com/about/ratings/ssdflist.html

Code: Select all

                                              Rating    +    -  Games  Won  Oppo
                                              ------  ---  ---  -----  ---  ----
   1 Shredder 9.0 UCI  256MB Athlon 1200 MHz    2821   28  -27    704  67%  2697
   2 Shredder 8.0 CB  256MB Athlon 1200 MHz     2805   23  -22   1115  71%  2648
   3 Shredder 7.04 UCI 256MB Athlon 1200 MHz    2804   22  -21   1133  69%  2663
   4 Junior 9.0  256MB Athlon 1200 MHz          2789   27  -26    745  67%  2666
   5 Deep Fritz 8.0  256MB Athlon 1200 MHz      2783   24  -23    942  70%  2633
   6 Junior 8.0  256MB Athlon 1200 MHz          2766   24  -24    888  65%  2660
   7 Shredder 7.0  256MB Athlon 1200 MHz        2765   26  -25    841  69%  2629
   8 Deep Fritz 7.0  256MB Athlon 1200 MHz      2764   24  -23    938  65%  2654
   9 Fritz 8.0  256MB Athlon 1200 MHz           2753   21  -20   1206  63%  2659
  10 Deep Junior 8.0  256MB Athlon 1200 MHz     2750   30  -29    567  63%  2659

I picked the latest March 2015 Stockfish dev version and pitted it against Shredder 9 at 5s + 0.05s time control, 1000 games:

Score of Stockfish 12.03.2015 vs Shredder 9: 967 - 3 - 30 [0.982] 1000
ELO difference: 695
Finished match

The average game was ~43 moves long, compared to ~60 in Stockfish self-tests, so there were many fast wins by Stockfish against Shredder 9.
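
For reference, the "ELO difference" figure follows from the match score under the usual logistic Elo model. A minimal sketch, assuming the result line is ordered wins - losses - draws (which is consistent with the reported score of 0.982); the function name `elo_diff` is only for illustration:

```python
import math


def elo_diff(wins, losses, draws):
    """Elo difference implied by a match result under the logistic Elo model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    return -400.0 * math.log10(1.0 / score - 1.0)


# Bullet match: 967 - 3 - 30 gives a score of 0.982
print(round(elo_diff(967, 3, 30)))
```

Note that at a score this close to 100% the estimate is very sensitive to the handful of losses and draws, so the figure should be read as approximate.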

Then, to quantify the eval and search improvements, I did the following:
1/ Shredder 9 UCI literally follows the UCI command "go nodes X", even for small X. Stockfish does not for small X.
2/ Stockfish follows the "go depth N" command well, so to test the eval I used "go depth 1" for Stockfish.
3/ The observed node counts of Stockfish at depth=1 on 20 positions are: endgame ~30 nodes, opening ~80 nodes. I took an average of 60 nodes.
4/ I pitted Stockfish depth=1 against Shredder 9 nodes=60, and the mostly-eval result is the following:

Score of Stockfish 12.03.2015 vs Shredder 9: 704 - 158 - 138 [0.773] 1000
ELO difference: 213
Finished match
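
The node matching in steps 3/ and 4/ can be sketched roughly as follows. This is only an illustration, not the script actually used: the sample `info` lines are made up, and the helper names `nodes_from_info` and `go_nodes_command` are hypothetical. It relies only on the standard UCI `info ... nodes <n>` field and the `go nodes <n>` command.

```python
def nodes_from_info(line):
    """Return the node count from a UCI 'info' line, or None if it has none."""
    tokens = line.split()
    if not tokens or tokens[0] != "info" or "nodes" not in tokens:
        return None
    idx = tokens.index("nodes")
    if idx + 1 >= len(tokens):
        return None
    return int(tokens[idx + 1])


def go_nodes_command(node_counts):
    """Build a 'go nodes N' command from the mean of the observed node counts."""
    mean = sum(node_counts) / len(node_counts)
    return "go nodes %d" % round(mean)


# Hypothetical depth-1 output lines, one per test position (not real engine logs).
sample_lines = [
    "info depth 1 seldepth 1 score cp 25 nodes 80 nps 40000 time 2 pv e2e4",
    "info depth 1 seldepth 1 score cp 10 nodes 70 nps 35000 time 2 pv d2d4",
    "info depth 1 seldepth 1 score cp -5 nodes 30 nps 30000 time 1 pv e1e2",
]
counts = [nodes_from_info(line) for line in sample_lines]
print(go_nodes_command(counts))
```

The resulting command would be sent to the engine that honors small node limits, while the other engine gets `go depth 1`.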

If this has some meaning and the node count is total, including QS nodes for both engines, the breakdown of improvement during the last 10 years would be:

~400 Elo points from improved search.
~200 Elo points from improved eval.
~100 Elo points from hardware: the hardware was equal for both engines, but it is not suited to Shredder 9 (64 bit, new instructions, compiler optimizations due to new hardware).
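
As a quick check on the arithmetic, here is one way to read the breakdown. It assumes that the search share is simply what remains of the total bullet gap after subtracting the measured mostly-eval gap and the ~100 Elo hardware estimate; that residual reading is an assumption, and the variable names are only illustrative.

```python
total_gap = 695      # Elo difference from the 5s + 0.05s match
eval_gap = 213       # Elo difference from the depth=1 vs nodes=60 match
hardware_gap = 100   # estimate for 64 bit, new instructions, compiler gains

# Assumption: whatever is left over is attributed to search.
search_gap = total_gap - eval_gap - hardware_gap
print(search_gap)  # compare with the rounded ~400 in the breakdown
```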

Vinvin
Posts: 4186
Joined: Thu Mar 09, 2006 8:40 am
Full name: Vincent Lejeune

Re: Ten years of Computer Chess revisited

Post by Vinvin » Fri Mar 13, 2015 3:05 pm

Nice post! 10 years doesn't seem so long ago, but what big improvements! :-)

Steve Maughan
Posts: 1048
Joined: Wed Mar 08, 2006 7:28 pm
Location: Florida, USA

Re: Ten years of Computer Chess revisited

Post by Steve Maughan » Fri Mar 13, 2015 3:44 pm

Kai -

Great research!

So there has been an average gain of 70 ELO per year for the last 10 years!!!

If you'd asked me in 2005 to predict the average software improvement per year until 2015, I think I might have said 20 ELO tops, i.e. maybe we'd have a 3000 ELO engine.

Steve
http://www.chessprogramming.net - Maverick Chess Engine

Laskos
Posts: 8741
Joined: Wed Jul 26, 2006 8:21 pm

Re: Ten years of Computer Chess revisited

Post by Laskos » Fri Mar 13, 2015 4:16 pm

Steve Maughan wrote:Kai -

Great research!

So there has been an average gain of 70 ELO per year for the last 10 years!!!

If you'd asked me in 2005 to predict the average software improvement per year until 2015, I think I might have said 20 ELO tops, i.e. maybe we'd have a 3000 ELO engine.

Steve
Me too :)
The last yearly releases of Shredder, Fritz and Junior, the top dogs of those times, came with gains of about 20 Elo points each on the same hardware. It seemed that there was a wall around 3000 SSDF Elo.

Note that the typical good hardware in those days was a 3GHz single-core Intel or AMD, which is only ~8-10 times slower (~3 doublings) than a typical good 4-core i7 of today. The purely hardware progress in the last 10 years was no more than 300 Elo points (at bullet TC), which is less than the gain from software search improvements alone.
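
The doubling count is just log2 of the speed ratio. A small sketch; the figure of roughly 100 Elo per doubling is not measured here, it is only what "~3 doublings" and "no more than 300 Elo points" imply when taken together:

```python
import math


def doublings(speed_ratio):
    """Number of speed doublings corresponding to a given speed ratio."""
    return math.log2(speed_ratio)


for ratio in (8, 10):
    print("%dx speedup = %.2f doublings" % (ratio, doublings(ratio)))

# Implied upper bound at bullet TC, using the figures quoted in the post.
elo_per_doubling = 300 / 3
print(elo_per_doubling)
```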

IWB
Posts: 1539
Joined: Thu Mar 09, 2006 1:02 pm

Re: Ten years of Computer Chess revisited

Post by IWB » Fri Mar 13, 2015 7:14 pm

Laskos wrote: ...
The last yearly releases of Shredder, Fritz and Junior, the top dogs of those times, came with gains of about 20 Elo points each on the same hardware....
Interesting post; however, the quoted point isn't quite true, at least not for Shredder. I think we tend to underestimate the progress back then.

I didn't bother to test S9, only S10, 11 and 12.

IPON results:

Code: Select all

69 Deep Shredder 12             :   2799      4   8228.5   18500   44.5%
106 Deep Shredder 11             :   2681     11   1412.0    2700   52.3%
144 Deep Shredder 10             :   2584      9   1754.0    4400   39.9%
That is for 5m+3s, Ponder ON, 1 thread and the same hardware.

CEGT 40/4 (to stay at short time controls) Ponder OFF 1CPU

Code: Select all

270 	Deep Shredder 12 x64 1CPU 	2800 	6 	6 	16800 	46.3% 	2831 	34.5%
444 	Deep Shredder 11 x64 1CPU 	2684 	9 	9 	3770 	55.1% 	2646 	32.7%
647 	Deep Shredder 10 x64 1CPU 	2589 	16 	16 	980 	43.7% 	2636 	33.5%
717 	Shredder 9.1 	             2541 	7 	7 	6230 	54.5% 	2506 	25.8%
S9 was released in early 2005; S10 was compiled on the 12th of May 2006, S11 on the 20th of October 2007 and S12 on the 30th of September 2009.
Between S9 and S12 there were ~4.5 years, which gives 86 Elo per release or 58 Elo per year on average. Your 20 Elo impression seems to be too small.
(If you just extrapolate that 58 CEGT Elo to today, Shredder X would be at (5.5x58)+2800 = 3119 compared to 3208 for Stockfish 6, but that is just for fun and somewhat "rubbish" :-) )
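
The per-release and per-year figures can be reproduced from the CEGT ratings in the table. A short sketch, assuming Shredder 9.1 stands in for S9 and that three releases and ~4.5 years separate it from Deep Shredder 12; the variable names are only illustrative:

```python
# CEGT 40/4 ratings copied from the table above
cegt = {
    "Shredder 9.1": 2541,
    "Deep Shredder 10": 2589,
    "Deep Shredder 11": 2684,
    "Deep Shredder 12": 2800,
}

total_gain = cegt["Deep Shredder 12"] - cegt["Shredder 9.1"]
per_release = total_gain / 3    # S9 -> S10 -> S11 -> S12
per_year = total_gain / 4.5     # early 2005 to September 2009

# Naive linear extrapolation over the following ~5.5 years
extrapolated = cegt["Deep Shredder 12"] + 5.5 * round(per_year)

print(round(per_release), round(per_year), extrapolated)
```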

However, I didn't check the other "old school" engines - someone might be interested ...

Bye
Ingo

APassionForCriminalJustic
Posts: 415
Joined: Sat May 24, 2014 7:16 am

Re: Ten years of Computer Chess revisited

Post by APassionForCriminalJustic » Fri Mar 13, 2015 7:54 pm

Laskos wrote: ...
Hahahahahahaha. I am surprised that Shredder achieved three wins there in your first result.

Laskos
Posts: 8741
Joined: Wed Jul 26, 2006 8:21 pm

Re: Ten years of Computer Chess revisited

Post by Laskos » Fri Mar 13, 2015 8:02 pm

IWB wrote:
Laskos wrote: ...
The last yearly releases of Shredder, Fritz and Junior, the top dogs of those times, came with gains of about 20 Elo points each on the same hardware....
Interesting post; however, the quoted point isn't quite true, at least not for Shredder. I think we tend to underestimate the progress back then.
I should have been more specific: I was referring only to the 2003-2005 interval, when SSDF was the only reliable rating list and the progress on that list was close to some 20 points yearly. Being a long time control list, it also compresses ratings. Elostat (if that is what they use) doesn't help either; it compresses the ratings a bit further. So, for the period 2003-2005, it would be some 30-40 Elo points yearly in Blitz using Ordo. Later came Rybka, CEGT, CCRL, IPON, FastGM, and other new things worth mentioning. But you are right, the progress after 2005 was faster for some time, at least for Shredder.

fern
Posts: 8755
Joined: Sun Feb 26, 2006 3:07 pm

Re: Ten years of Computer Chess revisited

Post by fern » Fri Mar 13, 2015 8:15 pm

Great research, Kai.
At last I have a full understanding of why it takes me so much effort to crush Stockfish...

Terry McCracken
Posts: 15844
Joined: Wed Aug 01, 2007 2:16 am
Location: Canada

Re: Ten years of Computer Chess revisited

Post by Terry McCracken » Fri Mar 13, 2015 8:22 pm

fern wrote:Great research, Kai.
At last I have a full understanding of why it takes me so much effort to crush Stockfish...
:lol: :lol: :lol:

IMO chess players have faced the first wave of A.I. Regardless of the arguments, it simulates chess play, and it simulates it pretty damn well!

The world will wake up to a revolution in a few decades and may find itself checkmated!
Terry McCracken

IWB
Posts: 1539
Joined: Thu Mar 09, 2006 1:02 pm

Re: Ten years of Computer Chess revisited

Post by IWB » Fri Mar 13, 2015 8:30 pm

Laskos wrote: I should have been more specific: I was referring only to the 2003-2005 interval, when SSDF was the only reliable rating list and the progress on that list was close to some 20 points yearly. Being a long time control list, it also compresses ratings. Elostat (if that is what they use) doesn't help either; it compresses the ratings a bit further. So, for the period 2003-2005, it would be some 30-40 Elo points yearly in Blitz using Ordo. Later came Rybka, CEGT, CCRL, IPON, FastGM, and other new things worth mentioning. But you are right, the progress after 2005 was faster for some time, at least for Shredder.

You are most likely right for 2003 to 2005 (at least for Shredder 7, 8 and 9). But no one looks for the reasons anymore. S7 had a bugfix release, S7.04, with a huge Elo jump. S8 was new and different but with no Elo increase over 7.04 at all, and S9.1 was a bit stronger again. So, looking back is difficult.

Btw, did you use S9 or S9.1? S9 had a bug and a fix, 9.1, was released soon after. But that is really a LONG time ago and I don't remember it 100%. Elo-wise it doesn't matter too much anyhow ...

And CEGT and IPON are using ORDO (afaik there is only one regularly updated rating list using something inferior, for whatever reason, no one knows :-) ), so there is no compression in the given ratings.

Bye
Ingo
