Hardware vs Software

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw, Ras

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Hardware vs Software

Post by bob »

Mike S. wrote: OK; of course I would expect Rybka to win this, too. But I thought this discussion was about new software being less superior on old hardware than on new hardware. But if nobody expects Rybka to be relatively worse on old hardware anyway, then this point is void and I don't see the sense of it.

But thanks for the remark; now I can save time and efforts by not doing a useless test.

If anyone thinks new software is worse on old hardware: It can be tested. New and old hard- and software is here, it still works, and it can be tested. That would provide facts instead of "talking"...
The discussion is about the increase in strength over the past N (in this case 10) years. Part of it is software, part is hardware. To measure hardware improvement, you have to keep the software fixed throughout the test, and vice versa to measure software improvement...

You need some sort of static opponent, and play a 2000 program on 2000 hardware against this opponent. Then play the 2000 program on 2009 hardware against the same opponent. Then take a 2009 program on 2000 hardware against same opponent, and 2009 program on 2009 hardware against same opponent. Now you will _know_ how much of the gain was hardware vs software. Only thing you need is the top 2000 program and that same program in 2009...

Rybka is NFG for this test as it has not been around long enough, and its roots have never been clearly defined...
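
The four-way test against a static opponent can be sketched in a few lines of Python. This is only an illustration of the bookkeeping: the function name `decompose_gain` and the dictionary layout are assumptions made for this sketch, and the ratings in the example are invented placeholders, not measurements.

```python
def decompose_gain(elo):
    """Split the total gain into hardware and software parts.

    elo maps (program_year, hardware_year) to the rating measured
    against the same fixed opponent. All four cells must be present.
    """
    old_old = elo[(2000, 2000)]
    old_new = elo[(2000, 2009)]   # old program, new hardware
    new_old = elo[(2009, 2000)]   # new program, old hardware
    new_new = elo[(2009, 2009)]
    return {
        "total": new_new - old_old,
        "hardware_gain_old_program": old_new - old_old,
        "hardware_gain_new_program": new_new - new_old,
        "software_gain_old_hardware": new_old - old_old,
        "software_gain_new_hardware": new_new - old_new,
        # Non-zero when new software profits more (or less) from fast hardware.
        "interaction": (new_new - new_old) - (old_new - old_old),
    }


# Placeholder ratings, invented purely to show the call.
example = decompose_gain({
    (2000, 2000): 0,
    (2000, 2009): 150,
    (2009, 2000): 100,
    (2009, 2009): 280,
})
for name, value in example.items():
    print(name, value)
```

If the interaction term comes out near zero, hardware and software gains simply add; a large value would mean the split depends on which baseline you pick.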
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Hardware vs Software

Post by Don »

Dirt wrote:
Don wrote:This match would not be close.

The reason I say Fritz vs Fritz is that it is a program that has been under active development the whole time and most smoothly reflects the state of the art. Rybka didn't exist back then, and in AI terminology it's a bit of a local optimum. A bump in the landscape which is not smooth.

If the best program is used (Rybka) then the best program of 10 years ago should also be used, whatever it is. Perhaps it is Fritz? I don't really know but I don't think it matters as even Rybka would lose with those odds against it.
The reason I say not Fritz is that it is not the best representative of chess software today; that is Rybka. You can't measure software advance by comparing a program that is not the most advanced. Restricting it to just one program, unless it was the best at both the start and end points, is fundamentally wrong.

Ten years ago the best program might have been Hiarcs 7.0, or possibly some version of Shredder or Fritz.
I would like to see such a match materialize even though I doubt it will.

It's pretty obvious that for me, it's better if Rybka is used whether I win or lose. Here are the scenarios:

1. If I win against Rybka the results are the most convincing.
2. If I win against Fritz it's discounted.
3. If I lose against Rybka at least I can make the argument that Rybka isn't a typical program. A Rybka doesn't come along very often.
4. If I lose against Fritz, I have been disgraced!

So all the downside for me is if we don't use Rybka.

For you, it is more complicated. You will probably lose no matter what but you clearly have the best chance with Rybka, so it's understandable that you would insist on using Rybka.

Of course if you really had any confidence in your assertion that "most" of the improvement was due to software, then you would want to hammer this point home by using Fritz. But it's clear that you cannot do that, and that I would not let you do that anyway due to reasons 1 and 2.

Otherwise, I can save face with point 3 above and make the point that Rybka is freakishly strong and is an anomaly. After all, what program of 10 years ago was 200 Elo stronger than all the others?

It's easy to argue that Rybka doesn't fit in with the landscape but it will be much easier to make this argument in 5 years (hindsight is 20/20) if/when other programs have caught up and there has not been substantial improvement.

Right now it's awkward because by chance we happen to be just a few weeks in front of a major breakthrough improvement, and human nature being what it is, we don't see the forest, just this big freakish tree in front of us. Since we are not looking around, we think it represents all the trees in the forest.
Dirt
Posts: 2851
Joined: Wed Mar 08, 2006 10:01 pm
Location: Irvine, CA, USA

Re: Hardware vs Software

Post by Dirt »

bob wrote:Rybka is NFG for this test as it has not been around long enough, and its roots have never been clearly defined...
Comparing a top program from the past, like Fritz 5.32, with a current program that is well down the list, like the current Fritz, is completely useless. That wouldn't show how much software has advanced, just Fritz. The only correct comparison is with Fritz or Hiarcs from then with Rybka 3.

While I see no reason to do it, if you want to use the same program it should be equally far down both the old and new rating lists. A candidate would be Shredder 2.0 from 1999 with the current Shredder, although I think it's much too highly rated in the 1999 list to be a fair test.

January 1999 SSDF rating list
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Hardware vs Software

Post by Don »

Dirt wrote:
Comparing a top program from the past, like Fritz 5.32, with a current program that is well down the list, like the current Fritz, is completely useless. That wouldn't show how much software has advanced, just Fritz. The only correct comparison is with Fritz or Hiarcs from then with Rybka 3.

While I see no reason to do it, if you want to use the same program it should be equally far down both the old and new rating lists. A candidate would be Shredder 2.0 from 1999 with the current Shredder, although I think it's much too highly rated in the 1999 list to be a fair test.

January 1999 SSDF rating list
It's ok with me to use Rybka as I believe this is good enough to make the point without any dispute.

I don't really remember what was the most common cutting edge hardware. I believe we should use the best hardware of the time that would have been used to evaluate the program on the rating lists. I guess today that would be 4 core machines but I don't know what it was back then. It may be that someone actually has that hardware still and we could rig some test.

- Don
Uri Blass
Posts: 10661
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Hardware vs Software

Post by Uri Blass »

Don wrote:
It's ok with me to use Rybka as I believe this is good enough to make the point without any dispute.

I don't really remember what was the most common cutting edge hardware. I believe we should use the best hardware of the time that would have been used to evaluate the program on the rating lists. I guess today that would be 4 core machines but I don't know what it was back then. It may be that someone actually has that hardware still and we could rig some test.

- Don
It seems that Rybka uses a significant amount of memory for its internal arrays (something like 72 MB), so Rybka needs a computer with at least 128 MB to run.

Fortunately 128 MB was available even 10 years ago, but it means that Rybka probably cannot use more than 32 MB of hash (this is not very important, because the difference between 32 MB of hash and the largest hash that fits in 128 MB is small).
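
As a rough sketch of that memory arithmetic, assuming hash tables are sized in powers of two (true for many engines, but an assumption here) and using a hypothetical `os_reserve_mb` figure that is not from this thread:

```python
def max_hash_mb(total_ram_mb, engine_fixed_mb, os_reserve_mb=0):
    """Largest power-of-two hash size (in MB) that fits in the remaining RAM.

    os_reserve_mb is a hypothetical allowance for the operating system;
    pick whatever matches the machine being tested.
    """
    free = total_ram_mb - engine_fixed_mb - os_reserve_mb
    if free < 1:
        return 0
    size = 1
    while size * 2 <= free:
        size *= 2
    return size


# 128 MB machine, roughly 72 MB of fixed engine data, 16 MB assumed for the OS.
print(max_hash_mb(128, 72, os_reserve_mb=16))
```

With these inputs the remaining memory is below 64 MB whether or not anything is reserved for the OS, which is where the 32 MB ceiling comes from.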

10 years ago the SSDF used a P200, and they upgraded to a K6-450 in the middle of 1999.

I can agree to SSDF hardware, which means a P200 against a quad.
The problem now is that the commercial versions of Fritz or Hiarcs from the beginning of 1999 did not support SMP.

Uri

Edit: The best software of 1999 on today's hardware is unknown, and I think it would be good to run some tests to decide on Rybka's opponent.

The SSDF list only shows which program was best on old hardware, and it is quite possible that Junior5 or another program that was not the leader on old hardware is better on new hardware.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Hardware vs Software

Post by Don »

Uri Blass wrote:
It seems that Rybka uses a significant amount of memory for its internal arrays (something like 72 MB), so Rybka needs a computer with at least 128 MB to run.

Fortunately 128 MB was available even 10 years ago, but it means that Rybka probably cannot use more than 32 MB of hash (this is not very important, because the difference between 32 MB of hash and the largest hash that fits in 128 MB is small).

10 years ago the SSDF used a P200, and they upgraded to a K6-450 in the middle of 1999.

I can agree to SSDF hardware, which means a P200 against a quad.
The problem now is that the commercial versions of Fritz or Hiarcs from the beginning of 1999 did not support SMP.

Uri
I remember several programs running on dual core machines at the tournament in Paderborn - I forget which year. Are you sure nothing supported this 10 years ago?

If not, that is a problem. When it comes to software/hardware synergy you can make the argument in either direction. I could say that it's not fair because hardware isn't being utilized (and I claim most of the advance is hardware based) but you could claim that writing a parallel program is a software advance. That argument is wrong of course, because parallel programs have been around a very long time.

So this might come down to an odds match - can an older program that is crippled beat a modern uncommonly strong (and I believe not very representative) program? I think with this kind of unfair handicap this could be a relatively close match.

It would be real easy to run this match remotely via a shell, but Rybka is a Windows program and windows is not as flexible about this kind of stuff, so it would have to be conducted on a local setup somehow.
Uri Blass
Posts: 10661
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Hardware vs Software

Post by Uri Blass »

Don wrote:
I remember several programs running on dual core machines at the tournament in Paderborn - I forget which year. Are you sure nothing supported this 10 years ago?

If not, that is a problem. When it comes to software/hardware synergy you can make the argument in either direction. I could say that it's not fair because hardware isn't being utilized (and I claim most of the advance is hardware based) but you could claim that writing a parallel program is a software advance. That argument is wrong of course, because parallel programs have been around a very long time.

So this might come down to an odds match - can an older program that is crippled beat a modern uncommonly strong (and I believe not very representative) program? I think with this kind of unfair handicap this could be a relatively close match.

It would be real easy to run this match remotely via a shell, but Rybka is a Windows program and windows is not as flexible about this kind of stuff, so it would have to be conducted on a local setup somehow.
I also remember programs that used SMP in tournaments, but I am afraid the commercial programs of that time did not.

It is possible to give a time handicap to emulate the SMP factor, and in that case we need to play with ponder off.

How much of a time advantage does the Q6600 have relative to the P200?

SSDF has used the following hardware:

P200
K6-450
A1200
Q6600

Uri
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Hardware vs Software

Post by Don »

Uri Blass wrote:
I also remember programs that used SMP in tournaments, but I am afraid the commercial programs of that time did not.

It is possible to give a time handicap to emulate the SMP factor, and in that case we need to play with ponder off.

How much of a time advantage does the Q6600 have relative to the P200?

SSDF has used the following hardware:

P200
K6-450
A1200
Q6600

Uri
For that matter, we could turn ponder off and I can see just how much handicap Rybka can tolerate - maybe the result would be obvious enough to give us an answer and we could extrapolate to estimate what might happen in an actual match.

I run Linux, but I have my own tester, which is pretty flexible and allows handicaps at any time control, any depth, whatever, and I have the 64-bit Rybka that runs on Linux with the hack published in the Rybka forum. I'm not sure I have a 10-year-old program, however, and we would need to agree on how much hash to give Rybka. Of course anyone else would be free to try to duplicate this test. Is there a 10-year-old public domain program that represents the very best of 10 years ago?

What I would do is just run matches - when one program gets ahead of another I would modify the time control slightly until I reach some kind of equilibrium. I think anyone could independently verify this result on most reasonable hardware and we would have an "H factor" - a multiplier that approximately gives us the handicap required in time.
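
A minimal sketch of that equilibrium search, assuming the handicap is a time multiplier `h` given to the older program. Everything here is hypothetical scaffolding: `find_h_factor` and `toy_match` are names made up for this sketch, and the toy model's Elo edge and Elo-per-doubling figures are placeholders, not measured values. A real run would replace `toy_match` with actual matches.

```python
import math


def toy_match(h, edge_elo, elo_per_doubling):
    """Stand-in for a real match: expected score of the stronger program
    when the weaker one gets h times as much time (toy model only)."""
    diff = edge_elo - elo_per_doubling * math.log2(h)
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def find_h_factor(play_match, start=1.0, step=1.25, shrink=0.9, rounds=60):
    """Nudge the time multiplier until neither side is ahead.

    play_match(h) must return the stronger program's score in [0, 1].
    The adjustment step shrinks each round so the search settles down.
    """
    h = start
    for _ in range(rounds):
        score = play_match(h)
        if score > 0.5:
            h *= step      # stronger side still ahead: give the other more time
        elif score < 0.5:
            h /= step      # handicap overshot: take some time back
        step = 1.0 + (step - 1.0) * shrink
    return h


# Placeholder model: 200 Elo edge, 100 Elo per doubling of time.
print(round(find_h_factor(lambda h: toy_match(h, 200, 100)), 2))
```

With real games the score is noisy, so each call to `play_match` would need enough games for the sign of `score - 0.5` to be trustworthy.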

I test with about 8000 shallow depth openings from a large games collection - each player plays both sides of each opening for a given opponent and we don't go any deeper than a few ply. And the openings are not tuned to any program. The opening had to occur a few times to make it into my shallow book.
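
The book-building and pairing described above might look like the following. This is a sketch, not my actual tester: the function names, the `max_ply` and `min_count` defaults, and the list-of-moves game format are all assumptions made for illustration.

```python
from collections import Counter


def build_shallow_book(games, max_ply=6, min_count=3):
    """Keep opening lines (first max_ply half-moves) that occur at least
    min_count times in the collection. games is a list of move lists."""
    counts = Counter(tuple(g[:max_ply]) for g in games if len(g) >= max_ply)
    return [line for line, n in counts.items() if n >= min_count]


def paired_schedule(openings, engine_a, engine_b):
    """Each opening is played twice, with colours reversed, so neither
    engine benefits from a lopsided line."""
    games = []
    for opening in openings:
        games.append({"opening": opening, "white": engine_a, "black": engine_b})
        games.append({"opening": opening, "white": engine_b, "black": engine_a})
    return games


# Tiny made-up collection just to show the calls.
collection = [["e4", "e5", "Nf3"]] * 3 + [["d4", "d5", "c4"]]
book = build_shallow_book(collection, max_ply=2, min_count=3)
for game in paired_schedule(book, "old_program", "new_program"):
    print(game)
```

Because the book is drawn from a general games collection and filtered only by frequency, nothing in it is tuned to either program.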

- Don
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Hardware vs Software

Post by bob »

Dirt wrote:
Comparing a top program from the past, like Fritz 5.32, with a current program that is well down the list, like the current Fritz, is completely useless. That wouldn't show how much software has advanced, just Fritz. The only correct comparison is with Fritz or Hiarcs from then with Rybka 3.

While I see no reason to do it, if you want to use the same program it should be equally far down both the old and new rating lists. A candidate would be Shredder 2.0 from 1999 with the current Shredder, although I think it's much too highly rated in the 1999 list to be a fair test.

January 1999 SSDF rating list
Here is a less accurate test, but one that would be enlightening.

First, you need best hardware today. Probably a good 8-core box although 16 and 32 are available. Going to 16 or 32 might distort results since a 2000 program might not be able to use that many. Then you need best hardware of 2000. There were certainly 4 processor boxes as I had one in 1996.

Now you pick the best program of 2000 and R3 of today.

Match 1: both programs on 2000 hardware.

Match 2: both programs on current hardware

This will show whether the current software has problems on much slower hardware due to things like more aggressive null-move, LMR, other types of forward pruning, etc.

Now R3 on 2000 hardware vs the 2000 program on 2009 hardware. This will show how much the faster hardware helps the old program.

Now R3 on 2009 hardware vs old program on 2000 hardware. This will show how much faster hardware helps Rybka. Since we have a baseline for both on 2000 hardware we know how much better Rybka is in terms of software. And we just discovered how much the new hardware helps either. Old on new hardware will be the most interesting as I would not be surprised at all to see the old program beat R3 with R3 on old hardware and old program on new hardware...
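
To compare the four matches on a common scale, each result can be turned into a rating difference with the standard logistic Elo formula. The helper names below are made up for this sketch, and the win/draw/loss counts are placeholders, not results.

```python
import math


def match_score(wins, draws, losses):
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_diff(score):
    """Rating difference implied by a score fraction (logistic Elo model)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Placeholder result for one of the four matches: 50 wins, 30 draws, 20 losses.
s = match_score(50, 30, 20)
print(round(elo_diff(s), 1))
```

A 100% or 0% score has no finite Elo equivalent, which is why the function rejects it; a lopsided match needs more games or a handicap before it says anything useful.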
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Hardware vs Software

Post by bob »

Don wrote:
I remember several programs running on dual core machines at the tournament in Paderborn - I forget which year. Are you sure nothing supported this 10 years ago?

If not, that is a problem. When it comes to software/hardware synergy you can make the argument in either direction. I could say that it's not fair because hardware isn't being utilized (and I claim most of the advance is hardware based) but you could claim that writing a parallel program is a software advance. That argument is wrong of course, because parallel programs have been around a very long time.

So this might come down to an odds match - can an older program that is crippled beat a modern uncommonly strong (and I believe not very representative) program? I think with this kind of unfair handicap this could be a relatively close match.

It would be real easy to run this match remotely via a shell, but Rybka is a Windows program and windows is not as flexible about this kind of stuff, so it would have to be conducted on a local setup somehow.
Quad-processor machines were common in 1999. I had a quad-processor Pentium Pro 200 sometime in late 1996 or 1997, and a quad Xeon a couple of years later.