Hardware vs Software
Moderator: Ras

Re: Hardware vs Software
Posts: 5106 | Joined: Tue Apr 29, 2008 4:27 pm

Yes, I agree completely on this. In the minds of many outsiders chess is a "brute force" hardware thing.

Dirt wrote:
I'm not saying, and I think Uri has already disclaimed, that software improvements have always outpaced those of hardware. That this may be true mostly because of Rybka I don't deny. I do think that many people outside of the computer chess community underestimate the magnitude of the software improvements that have taken place over the years, which makes me a bit defensive on behalf of the software authors.

Don wrote:
I'm sure the only 10 year time period that THEY would be interested in would have to include Rybka. I think most of their confidence is pinned on this single program, which is also their only hope, slight as it is.

To me it's more reasonable to include a more representative 10 year period, not one that saw the introduction of an uncharacteristically strong program. For them, a bad 10 year period might just barely include the introduction of brand new hardware from Intel.

So the precise time period you settle on does matter - however, I think 10 years is long enough to smooth out the bumps enough not to matter.
Re: Hardware vs Software
Posts: 6401 | Joined: Thu Mar 09, 2006 8:30 pm | Location: Chicago, Illinois, USA
Rybka + today's hardware vs. Fritz 5 (top machine Dec 1998) + today's hardware means an Elo difference of 622 points, to come back to something Uri mentioned or suggested.

Don wrote:
Yes, I agree completely on this. In the minds of many outsiders chess is a "brute force" hardware thing. [...]

That is 622 Elo points in exactly 10 years purely due to software. Can we say that hardware is granting the same amount?

If doubling the speed is worth 75 Elo points, 600 Elo points means 2^8 = 256 times faster hardware is needed to account for it. If Bob is right about 1:200 (which means doubling the speed every 1 year 3-4 months, beating Moore's Law), hardware is close to that. We can discuss 50 or 100 Elo points more or less here and there, but the conclusion will be that the improvements due to software and hardware are equal or very close, going hand in hand.
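To make the conversion explicit, here is a minimal sketch of the arithmetic used throughout this thread, in Python (the 75-Elo-per-doubling figure is the thread's working assumption, not a measured constant):

import math

ELO_PER_DOUBLING = 75  # working assumption used in this thread

def elo_from_speedup(speedup: float) -> float:
    """Elo expected from a pure speed increase, at 75 Elo per doubling."""
    return ELO_PER_DOUBLING * math.log2(speedup)

def speedup_for_elo(elo: float) -> float:
    """Speed factor needed to account for a given Elo difference."""
    return 2 ** (elo / ELO_PER_DOUBLING)

print(speedup_for_elo(600))   # 256.0, i.e. 2^8 as stated above
print(elo_from_speedup(200))  # ~573 Elo for a 200:1 hardware ratio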
Miguel
Re: Hardware vs Software
Posts: 5106 | Joined: Tue Apr 29, 2008 4:27 pm
You guys always find a way to give Rybka a big advantage. I really don't believe a program developed 10 years ago should be expected to run well on hardware that would not exist for another 10 years. Even if you just consider that it was compiled with a 10 year old compiler using the wrong optimizations for this platform, and that it wasn't designed for 64 bit computers because they were not common enough, that's a pretty big hit. (I suppose you could consider compiler quality a software issue, but not compiler-based optimizations.)

michiguel wrote:
Rybka + today's hardware vs. Fritz 5 (top machine Dec 1998) + today's hardware means an Elo difference of 622 points, to come back to something Uri mentioned or suggested. [...]

I seriously doubt Fritz was tuned to run at a time control of 1 move every 10 hours either. You might claim this is a software issue, but it's really a tuning issue. Even with these factors I have no doubt that Rybka would still be much stronger.

And despite the math you are using, I do not see Rybka at 1 second beating Fritz at 200 seconds. I would definitely have to see this.
Re: Hardware vs Software
Posts: 20943 | Joined: Mon Feb 27, 2006 7:30 pm | Location: Birmingham, AL
I don't see why. I used the same parallel search 10 years ago that I use today; the overhead has not changed.

Dirt wrote:
Correct me if I'm wrong, but in moving to a time handicap you seem to be ignoring the parallel search inefficiency we were both just explaining to Louis Zulli. Shouldn't that be taken into account?

bob wrote:
First let's settle on a 10 year hardware period. The Q6600 is two years old. If you want to use that as a basis, we need to return to early 1997 to choose the older hardware. The Pentium II (Klamath) came out around the middle of 1997, which probably means the best was the Pentium Pro 200. I suspect we are _still_ talking about 200:1.

This is not about simple clock frequency improvements; more modern architectures are faster for other reasons such as better speculative execution, more pipelines, register renaming, etc...

The main point both Don and I have _tried_ to make is that given a certain class of hardware, one is willing or able to do things that are not possible on slower hardware. In Cray Blitz, we specifically made use of vectors, and that gave us some features we could use that would be too costly on a normal scalar architecture. So there are three kinds of improvements over the past 10 years:

1. pure hardware
2. pure software
3. hybrid improvements, where improved hardware gave us the ability to do things in software we could not do with previous generations of hardware due to speed issues...
Re: Hardware vs Software
Posts: 20943 | Joined: Mon Feb 27, 2006 7:30 pm | Location: Birmingham, AL
I do not see where this is coming from. We had parallel search in 1997 already, so the overhead is already factored in.

Dirt wrote:
If the parallel search overhead means that the ratio should really be, say, 150:1, then I don't think Rybka losing really proves your point. Whether there should be such a reduction, and how large it should be, is the question I am asking.

Don wrote:
None of this will matter unless it's really a close match - so I would be prepared to simply test single-processor Rybka vs. whatever and see what happens. If Rybka loses we have a "beta cut-off" and can stop; otherwise we must test something a little more fair and raise alpha.

Dirt wrote:
Correct me if I'm wrong, but in moving to a time handicap you seem to be ignoring the parallel search inefficiency we were both just explaining to Louis Zulli. [...]
Re: Hardware vs Software
Posts: 5106 | Joined: Tue Apr 29, 2008 4:27 pm
So if Rybka loses with, say, a 32 to 1 handicap, you are saying that we should give her even less time to see if she still loses?

Dirt wrote:
If the parallel search overhead means that the ratio should really be, say, 150:1, then I don't think Rybka losing really proves your point. [...]
Re: Hardware vs Software
Posts: 6401 | Joined: Thu Mar 09, 2006 8:30 pm | Location: Chicago, Illinois, USA
Tuning, optimizations, etc. are not hardware, so I guess I can include them in "software". Let's suppose that we allow Fritz to be recompiled with optimizations. Say it becomes twice as fast => 75 Elo. Then the difference is 622 - 75 = 547 points. My conclusion does not change. Can hardware alone justify ~550 Elo points in 10 years? Yes, maybe around that number, but not much higher than that.

Don wrote:
You guys always find a way to give Rybka a big advantage. [...] I seriously doubt Fritz was tuned to run at a time control of 1 move every 10 hours either. You might claim this is a software issue, but it's really a tuning issue. Even with these factors I have no doubt that Rybka would still be much stronger.
I gave you before a 1:200 speedup by hardware, but that is way above the 1:32 predicted by Moore's Law. By Moore's Law, 550 points would be justified in about 15 years, not ten.

Moore's Law can account for 375 Elo points in 10 years. Well, that is about the difference between Fritz 1999 and Fritz 2009 on current hardware. We do not need Rybka; we could use Naum.

I can't see that the numbers we have in hand support the claim that hardware improvement was more important than software. In fact, the opposite may make more sense, IMHO. At best, both contributed very much, in the same ballpark.
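To spell out that estimate (same 75-Elo-per-doubling working assumption, with Moore's Law read as one doubling every 2 years; a sketch, not a measurement):

ELO_PER_DOUBLING = 75   # working assumption used in this thread
YEARS_PER_DOUBLING = 2  # Moore's Law doubling period

def hardware_elo(years: float) -> float:
    """Elo attributable to hardware if speed doubles every 2 years."""
    return ELO_PER_DOUBLING * years / YEARS_PER_DOUBLING

print(hardware_elo(10))                             # 375.0 Elo in 10 years (1:32)
print(550 / ELO_PER_DOUBLING * YEARS_PER_DOUBLING)  # ~14.7 years to justify 550 Elo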
Miguel
Don wrote:
And despite the math you are using, I do not see Rybka at 1 second beating Fritz at 200 seconds. I would definitely have to see this.
Re: Hardware vs Software
Posts: 20943 | Joined: Mon Feb 27, 2006 7:30 pm | Location: Birmingham, AL
This is going around in circles. It is easy to quantify the hardware. I'd suggest taking the best of today, the Intel Core i7, and the best of late 1998. Limit it to a single chip for simplicity, but no limit on how many cores per chip. I believe this is going to be about a 200:1 time handicap to emulate the difference between the 4-core i7 from Intel and the best of 1998, which was the PII/300 processor.

Don wrote:
So if Rybka loses with, say, a 32 to 1 handicap, you are saying that we should give her even less time to see if she still loses? [...]
For comparison, Crafty on a quad-core i7 runs at 20M nodes per second, while on the single-CPU PII/300 it ran at not quite 100K nodes per second: a clean and simple factor of 200x faster hardware over that period. (And again, those quoting Moore's Law are quoting it incorrectly; it does _not_ say processor speed doubles every 2 years, it says _density_ doubles every 2 years, which is a different thing entirely.) Clock speeds have gone steadily upward, but internal processor design has improved even more. Just compare a 2.0GHz Core 2 CPU against a 4.0GHz older processor to see what I mean.

So that fixes the speed differential over the past ten years with high accuracy. Forget the discussions about 50:1 or the stuff about 200:1 being too high. As Bill Clinton would say, "It is what it is." And what it is is 200x.
That is almost 8 doublings, which is in the range of +600 Elo. That is going to be a great "equalizer" in this comparison. 200x is a daunting advantage to overcome. And if someone really thinks software has produced that kind of improvement, we need to test it and put it to rest once and for all...
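The 200x figure and the "almost 8 doublings" follow directly from the node counts quoted above (a quick check using the numbers from this post):

import math

nps_i7 = 20_000_000  # quad-core i7 (from the post above)
nps_pii = 100_000    # single-CPU PII/300 (from the post above)

ratio = nps_i7 / nps_pii      # 200.0
doublings = math.log2(ratio)  # ~7.64, "almost 8 doublings"
print(doublings * 75)         # ~573 Elo, i.e. in the range of +600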
I will accept that a program today running on 4 cores will see some overhead due to the parallel search. But I don't think it is worth arguing about whether we should scale back the speed because of the overhead. That is simply a software issue as well, as it is theoretically possible to have very little overhead. If the software can't quite use the computing power available, that is a software problem, not a hardware limit.
I have some Crafty versions that should be right for that time frame. Crafty 15.0 was the first parallel search version. I suspect something in the 16.x versions or possibly 17.x versions was used at the end of 1998. Crafty ran on a quad Pentium Pro early in 1998 when version 15.0 was done...
Re: Hardware vs Software
Posts: 1154 | Joined: Fri Jun 23, 2006 5:18 am
Hmmm. I think there is a big difference between a 50x speedup on 4 processors and 200x on 1. Blaming software for not overcoming alpha-beta's inefficiencies in utilizing multiple processors seems tangential. The fact is, if you take an old program and put it on new hardware, it does not get 200x faster, because it also cannot take advantage of the extra processors perfectly.

bob wrote:
I will accept that a program today running on 4 cores will see some overhead due to the parallel search. But I don't think it is worth arguing about whether we should scale back the speed because of the overhead. [...]
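To put rough numbers on that 50x-vs-200x distinction (a sketch; the 3.1x figure for a real 4-core parallel search speedup is an assumed typical value, not something measured in this thread):

import math

ELO_PER_DOUBLING = 75       # working assumption used in this thread
single_core_speedup = 50.0  # 200x total / 4 cores, if the cores counted fully

# An old single-threaded program gets only the 1-core row; a modern
# parallel searcher gets something between the "real" and "ideal" rows.
for label, par in (("1 core", 1.0), ("4 cores, real", 3.1), ("4 cores, ideal", 4.0)):
    total = single_core_speedup * par
    print(f"{label}: {total:5.0f}x -> {ELO_PER_DOUBLING * math.log2(total):3.0f} Elo")

# 1 core: 50x -> 423 Elo; ideal 4 cores: 200x -> 573 Elo (a ~150 Elo gap).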
-Sam
Re: Hardware vs Software
Posts: 6401 | Joined: Thu Mar 09, 2006 8:30 pm | Location: Chicago, Illinois, USA
Then you have to accept that Fritz 5 is 622 Elo points below Rybka on current hardware. That is a bit more than the 600 points you estimate hardware provided in 10 years.

bob wrote:
This is going around in circles. It is easy to quantify the hardware. [...] That is almost 8 doublings, which is in the range of +600 Elo. [...]

Miguel