Computer chess progress over the last 20 years!

fierz · Post by **fierz** » Sun Mar 13, 2016 7:39 am

Thanks for all the comments and suggestions on my earlier post; I (nearly) have the answer that I was looking for. Here is my list to start with, and the methods are explained below. I will be the first to admit that they are slightly sketchy, but much better than no methods at all.

Before you start blasting this list, please note that I am trying to show the *approximate* progress by software alone, without hardware speed increase. Whether I really picked the strongest engine for a given year, or the second best with 10 or 20 elo less, is irrelevant. Whether the methods for adjusting ratings between lists are +-10 elo off is also irrelevant, and so are small changes due to time controls. Please only bicker if you think this is off by, say, 100 elo.

The result is: over the past 18 years (ok, I didn't make it 20...), software progress alone - without hardware! - has given a nearly 750 elo rating increase (which is why I don't care about +-10 or 20 elo...). All ratings in the list have been adjusted to today's CCRL 40/40 ratings.

Year best engine elo
2015 Komodo 9.2 3267 (CCRL 40/40)
2014 Komodo 8 3214
2013 Houdini 3 3168
2012 Houdini 3 3168
2011 Houdini 1.5a 3129
2010 Rybka 4 3087
2009 Rybka 3 3051
2008 Rybka 3 3051
2007 Rybka 2.3.2a 2966
2006 Rybka 1.2 2899
2005 Shredder 8 2792 (SSDF Athlon 1200)
2004 Shredder 7 2758
2003 Shredder 7 2758
2002 Shredder 6 2706
2001 Deep Fritz 2703
2000 Fritz 6 2665 (SSDF K6-2 450)
1999 Fritz 5.32 2611
1998 Hiarcs 6 2533 (SSDF P200 MMX)
1997 Rebel 8 2524

I would be happy to hear your thoughts on the list, and on the reasons for this large amount of progress. Can we figure out what the main improvements in algorithms have been since then? Can someone suggest numbers, such as "Null move gives XXX elo"? How much is due to better testing and less buggy software (I'm thinking of Fruit, and its influence on the community)? How much is due to open source and collaborative development and testing (Stockfish etc)?

best regards
Martin

--------------------------- Methods: ---------------------------------------

Ratings 2006-2015 are taken from the CCRL 40/40 (http://www.computerchess.org.uk/ccrl/40 ... t_all.html);
I found the top engine for each year using the wayback machine and took the last snapshot of each year (varying from December till about September).

Top engines by year for 2005-1997 are taken from Vincent Lejeune's post http://www.talkchess.com/forum/viewtopi ... 801#532801

Ratings 2005-2001 are taken from the SSDF list (http://ssdf.bosjo.net/long.txt), testing on an Athlon 1200.
To adapt to the CCRL elo, I compared Rybka 1.2 on both of them. Rybka 1.2 is 7 elo higher on the SSDF list, so I subtract 7 elo from these ratings (statistically totally insignificant, I know!).

Ratings 1999+2000 are taken from the SSDF list, testing on a K6-2 450MHz. Since engines are weaker on the slower machine, I compared the Chess Tiger 15 and Shredder 7 ratings, which are available for both machines. Tiger 15 is 50 points lower on the slower machine and Shredder 7 is 75 points lower, so I add the average of 62 points to these ratings. Keeping in mind the -7 from above, the overall adjustment is +55.

Ratings 1997+1998 are taken from the SSDF list, testing on a P200 MMX, slower yet again. Fritz 5.32 is available on both P200 MMX and K6-2, and it's 57 points weaker on the slower machine, so I add these 57, plus the 55 of the K6-2 for 112.
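The chain of adjustments above can be written out as a small sketch. The names (`OFFSETS`, `to_ccrl`) and the hardware keys are only illustrative labels, not anything from SSDF or CCRL, and the only raw rating used is the SSDF value of 2906 for Rybka 1.2 quoted later in this thread.

```python
# Offsets (in Elo) from the Methods text, relative to today's CCRL 40/40.
# All names below are illustrative; they are not part of any rating-list tool.

ATHLON_OFFSET = -7                           # Rybka 1.2 is 7 Elo higher on SSDF than on CCRL
K6_TO_ATHLON = (50 + 75) // 2                # Tiger 15: 50, Shredder 7: 75; 62.5 rounded down to 62
K6_OFFSET = K6_TO_ATHLON + ATHLON_OFFSET     # 62 - 7 = 55
P200_TO_K6 = 57                              # Fritz 5.32 difference between P200 MMX and K6-2
P200_OFFSET = P200_TO_K6 + K6_OFFSET         # 57 + 55 = 112

OFFSETS = {
    "athlon_1200": ATHLON_OFFSET,
    "k6_2_450": K6_OFFSET,
    "p200_mmx": P200_OFFSET,
}


def to_ccrl(ssdf_rating, hardware):
    """Shift a raw SSDF rating onto the CCRL 40/40 scale using the offsets above."""
    return ssdf_rating + OFFSETS[hardware]


# Rybka 1.2: SSDF 2906 on the Athlon 1200 maps to the 2899 shown in the list.
print(to_ccrl(2906, "athlon_1200"))
```

Each step relies on one or two shared engines, so any error in one offset carries through to every older hardware generation.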

Joost Buijs · Post by **Joost Buijs** » Sun Mar 13, 2016 8:37 am

If your claim of 750 Elo for software alone is true that would mean that Komodo on a single core P200MMX would reach ~3288 Elo.
Somewhat earlier you were talking about an additional 500 Elo for the improvements in hardware, that means that Komodo would reach ~3788 Elo on modern hardware, this is something I can't believe.
If this were true that would mean that Komodo on modern hardware will always score 100% in a match against a 3000 Elo player, this seems very unlikely to me.

You can't compare different isolated rating pools calculated with different algorithms and determined with different hardware by just using 1 or 2 engines as a calibration point, the error is simply too large.

I agree that the improvement due to software is substantial, but my feeling is that it is at max somewhere in the 450 Elo range.
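To put a number on the "always score 100%" remark, here is a minimal sketch using the standard logistic Elo expectation formula. It assumes that formula applies to these lists, which is not guaranteed, since rating lists may be computed with other tools and models; `expected_score` is just an illustrative name.

```python
def expected_score(rating_diff):
    """Expected score (0..1) for the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


# The hypothetical 3788 Elo engine against a 3000 Elo opponent.
diff = 3788 - 3000
print(f"rating difference: {diff}")
print(f"expected score: {expected_score(diff):.3f}")
```

Under this model a 788 point gap gives an expected score of roughly 0.99, so close to, but not exactly, 100%.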

Ozymandias · Post by **Ozymandias** » Sun Mar 13, 2016 10:47 am

Joost Buijs wrote:If your claim of 750 Elo for software alone is true that would mean that Komodo on a single core P200MMX would reach ~3288 Elo.
Somewhat earlier you were talking about an additional 500 Elo for the improvements in hardware, that means that Komodo would reach ~3788 Elo on modern hardware, this is something I can't believe.
If this were true that would mean that Komodo on modern hardware will always score 100% in a match against a 3000 Elo player, this seems very unlikely to me.

You can't compare different isolated rating pools calculated with different algorithms and determined with different hardware by just using 1 or 2 engines as a calibration point, the error is simply too large.

I agree that the improvement due to software is substantial, but my feeling is that it is at max somewhere in the 450 Elo range.

Hi Joost,

I agree with you on the point that mixing two different lists, using just one engine as an anchor point, isn't very reliable. Furthermore, SSDF played engines with their own books, and used very uneven HW. Last but not least, for the 2005-2001 period, 32-bit machines were used to run Rybka, which greatly benefited from 64-bit.

All that being said, if we look at the last ten years, we can see a 350+ ELO gain, on the same HW and under the same conditions. It is my understanding that progress was even bigger in the old days. 750 ELO? Could be, but hard proof would be welcome.

fierz · Post by **fierz** » Sun Mar 13, 2016 11:31 am

Joost Buijs wrote:If your claim of 750 Elo for software alone is true that would mean that Komodo on a single core P200MMX would reach ~3288 Elo.
Somewhat earlier you were talking about an additional 500 Elo for the improvements in hardware, that means that Komodo would reach ~3788 Elo on modern hardware, this is something I can't believe.

That's not at all what I am saying. The ratings correspond to the current CCRL rating, not to P200MMX. I adjusted the ratings of P200MMX upwards to "correct" for that. The CCRL ratings seem to correspond practically 1:1 to the SSDF Athlon 1200 ratings, and are more than 100 points higher than the P200 ratings. So I would be claiming based on this post that Komodo 8 on a P200 MMX is around 3150. That does seem a bit high, and could have something to do with the 32->64 bit + fast intrinsics on new CPUs.

Also, the 500 Elo for hardware gain has nothing to do with this. The ratings quoted are already for fairly modern hardware. Please also note that all ratings quoted are for the single-CPU version so as not to confuse the issue further. Part of the speedup in recent years has come from multiple CPUs; single CPUs have not become that much faster.

I am very aware that I didn't take multiple engines to correct between lists, but that would be a +-20 elo difference at best which is not what I'm interested in.

In any case, it would be interesting to see a bullet match of Stockfish or Komodo against one of those older engines of around year 2K, to see whether the estimated rating differences here hold. The bullet match would address the objection of Bob who says that the old engines couldn't do some things that are possible now thanks to faster hardware. I don't have copies of those old programs, but perhaps someone here does?

BTW, the fastest I seem to be able to set in Arena is 1s/move, is it possible somehow to run at 0.1s/move?

Modern Times · Post by **Modern Times** » Sun Mar 13, 2016 1:28 pm

fierz wrote: The CCRL ratings seem to correspond practically 1:1 to the SSDF Athlon 1200 ratings,

That is correct - we initially calibrated our list to a selected basket of SSDF engines in 2006. However, we have since reduced our ratings by 100 because they were starting to look high to some people. You should be able to see that on wayback if you look at an old engine. I think we did it about halfway through 2012, but I could be wrong on that.

fierz · Post by **fierz** » Sun Mar 13, 2016 10:07 pm

Dear Ray,

Yes, I noticed the recalibration (and it was also pointed out in the comments to my original post on this topic). However, the SSDF Rybka 1.2 elo is 2906, and your elo *at that time* (2006) was 2994, which is about those 100 elo difference.

In the most current CCRL, Rybka is around 2900 elo, so actually, at least for Rybka, your current list agrees with the SSDF list as it is currently published. Perhaps they also recalibrated at some point?

best regards
Martin

fierz · Post by **fierz** » Sun Mar 13, 2016 10:12 pm

I made a graph out of that list for a better view

[Graph: adjusted elo of the best engine by year, from the list in the first post]

It's pretty amazing how the progress seems to be quite relentless and steady. Perhaps there is a bit of a burst with Rybka, but apart from that the strength increase doesn't change a lot year over year.
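For anyone who wants to check the "steady progress" impression against the numbers, this is a rough sketch that computes the year-over-year gains directly from the list in the first post. The variable names are illustrative, and the ratings are the adjusted values from that list, with all their caveats.

```python
# Adjusted ratings (CCRL 40/40 scale) of the best engine per year, from the list above.
ratings = {
    1997: 2524, 1998: 2533, 1999: 2611, 2000: 2665, 2001: 2703,
    2002: 2706, 2003: 2758, 2004: 2758, 2005: 2792, 2006: 2899,
    2007: 2966, 2008: 3051, 2009: 3051, 2010: 3087, 2011: 3129,
    2012: 3168, 2013: 3168, 2014: 3214, 2015: 3267,
}

years = sorted(ratings)

# Gain in each year relative to the previous year.
gains = {y: ratings[y] - ratings[y - 1] for y in years[1:]}

total_gain = ratings[years[-1]] - ratings[years[0]]
avg_gain = total_gain / (years[-1] - years[0])
biggest_year = max(gains, key=gains.get)

for y in years[1:]:
    print(f"{y}: {gains[y]:+d}")
print(f"total gain: {total_gain} elo over {years[-1] - years[0]} years")
print(f"average: {avg_gain:.1f} elo per year")
print(f"largest single-year jump: {gains[biggest_year]} elo in {biggest_year}")
```

The largest single step is the one from 2005 to 2006, which is also where the list switches from SSDF to CCRL ratings, so part of that jump may come from the calibration rather than from Rybka alone.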

Modern Times · Post by **Modern Times** » Sun Mar 13, 2016 11:26 pm

fierz wrote:Dear Ray,

Yes, I noticed the recalibration (and it was also pointed out in the comments to my original post on this topic). However, the SSDF Rybka 1.2 elo is 2906, and your elo *at that time* (2006) was 2994, which is about those 100 elo difference.

In the most current CCRL, Rybka is around 2900 elo, so actually, at least for Rybka, your current list agrees with the SSDF list as it is currently published. Perhaps they also recalibrated at some point?

best regards
Martin

No I don't think SSDF have recalibrated. I checked the current SSDF list for the basket of engines we used (which did not include Rybka) with the values we have on file at 24/11/2006 for them, and they are quite close. For that same basket of engines, our 40/40 list has them approx 100 Elo below those values.

fierz · Post by **fierz** » Mon Mar 14, 2016 11:00 am

Thanks for the clarification Ray!

In any case, since I'm using your recalibrated (= lowered) values and the old SSDF values, then if anything I'm underestimating the progress made. I would also think that the "750 elo gain in 18 years" has a +-100 elo uncertainty on it anyway, which is also not that important to me. The question I'm wondering about is how much progress has been made in computer chess by better algorithms alone since Deep Blue "solved" the game (in public perception). It's pretty clear that by any measure this progress is enormous, and what I also find interesting is that it is quite steady.

I wonder if someone knowledgeable could suggest what the main drivers for this relentless progress are?

cheers
Martin

Joost Buijs · Post by **Joost Buijs** » Tue Mar 15, 2016 12:50 pm

fierz wrote: That's not at all what I am saying. The ratings correspond to the current CCRL rating, not to P200MMX. I adjusted the ratings of P200MMX upwards to "correct" for that. The CCRL ratings seem to correspond practically 1:1 to the SSDF Athlon 1200 ratings, and are more than 100 points higher than the P200 ratings. So I would be claiming based on this post that Komodo 8 on a P200 MMX is around 3150. That does seem a bit high, and could have something to do with the 32->64 bit + fast intrinsics on new CPUs.

Maybe I'm just dumb, but when I look at the SSDF rating list of May 1997, Hiarcs 6 has a rating of 2587 Elo on a P200 MMX.
You said that you added 112 Elo to the P200 ratings to compensate for the difference in speed with the Athlon 1200 you use as baseline, so 2587 + 112 = 2699 Elo.
You take Rebel 8 as the strongest engine in 1997 with an (adjusted?) rating of 2524 on a P200.
In the SSDF rating list of September 1997 there is Rebel 8 at 2519 Elo on a P200, so it looks as if you didn't adjust the early ratings for the difference in hardware at all.
In 1997 Hiarcs 6 was clearly stronger than Rebel 8, so I wonder why you didn't take that engine for comparison.

Also, the difference of only 112 (119) Elo for the more than 6x speed difference between the P200 and the Athlon 1200 seems way too low.
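One way to make this objection concrete is to convert the adjustment into an implied Elo gain per doubling of speed. The sketch below only rearranges the two numbers from the paragraph above; the 6x speed ratio is taken as a lower bound, and `elo_per_doubling` is an illustrative helper, not an established formula from any rating list.

```python
import math


def elo_per_doubling(elo_gain, speed_ratio):
    """Elo gain per doubling of speed implied by a total gain over a given speed ratio."""
    return elo_gain / math.log2(speed_ratio)


# 112 (or 119 without the -7 correction) Elo for an assumed 6x speed ratio.
for gain in (112, 119):
    print(f"{gain} elo over 6x speed: {elo_per_doubling(gain, 6):.1f} elo per doubling")
```

With these inputs the adjustment works out to somewhere in the mid 40s of Elo per doubling, and a larger true speed ratio would push that figure lower still.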

I think the data is so unreliable that you can get everything out of it by looking at it in different ways.
