Deep Blue vs Rybka

Don · Post by **Don** » Mon Sep 13, 2010 3:06 am

I think one of the most powerful pieces of evidence of software progress is the case of Deep Blue in 1997, doing 200 MILLION nodes per second, compared to the OLD Rybka 3, even ignoring the cutting-edge i7.

We don't have a precise estimate of the strength of Deep Blue, but we know that in 1997 it beat Garry Kasparov in a close match, scoring 3.5 - 2.5.

I don't believe any reasonable person thinks Deep Blue is stronger than Rybka and about 2 or 3 years ago people were saying that any of the top programs would beat Deep Blue in a match pretty easily.

Joel Benjamin, a grandmaster, estimated Deep Blue to be about 2750 ELO in strength at the time of the 1997 match and Rybka 3 (this was 3 years ago) to be over 3000 ELO on state-of-the-art hardware of even 3 years ago.

These are both very "fuzzy" numbers (according to George Bush), but to put things into perspective, 3 years ago Rybka 3 was giving strong Grandmasters pawn odds in matches and winning those matches. It was already a foregone conclusion that computers now have to be handicapped. This is a very strong indication that Deep Blue is a few hundred ELO weaker than Rybka even with far superior hardware.

From comments Bob Hyatt has made over the years, Deep Blue represented the state of the art not just in hardware, but also in software. I remember very clearly an assertion he made that no PC program compares to Deep Blue even at the same nodes per second. He gave anecdotal evidence that Deep Blue couldn't be tested against the best programs even when the top programs were turned way up (or Deep Blue reduced like crazy) to match the NPS, and also talked about how Deep Blue had an evaluation function far more sophisticated than any of the best programs, made possible by hardware.

So we can take it that Deep Blue represents the very best in state-of-the-art software in 1997 and that Rybka represents the very best in 2007, which is just a 10 year span.

A good round number is 100 to 1. Rybka on 1 core of an i7 is about 2 million nodes per second and Deep Blue is 200 million. This is 100 to 1. I'm not sure Rybka is actually hitting 2 million nodes per second, so this is just an estimate.

So Rybka with 100 to 1 time odds handicap is surpassing 1997 state of the art software by something like 200-300 ELO.
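
To make the arithmetic explicit, here is a rough sketch in Python. The NPS figures are just the estimates quoted in this post, not measurements, and the function name is made up for illustration. The "doublings" line only restates the 100 to 1 ratio as a power of two; it makes no claim about how many ELO a doubling is worth.

```python
import math

# Estimates quoted in the post, not measured values.
DEEP_BLUE_NPS = 200_000_000
RYBKA_NPS = 2_000_000


def speed_ratio(fast_nps: float, slow_nps: float) -> float:
    """How many times more nodes per second the faster searcher visits."""
    return fast_nps / slow_nps


ratio = speed_ratio(DEEP_BLUE_NPS, RYBKA_NPS)
doublings = math.log2(ratio)  # the same ratio expressed as speed doublings

print(f"speed ratio: {ratio:.0f} to 1")
print(f"equivalent speed doublings: {doublings:.2f}")
```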

bob · Post by **bob** » Mon Sep 13, 2010 5:13 am

OK, what's up with you nowadays? You simply want to pick a subject and start something that always ends up in a flame war?

DB was basically 1995 hardware. 15 years old. Yet it was faster than anything produced since. Could the software be improved over the intervening 13 years since it beat Kasparov? Of course.

So, what's the point of this particular post? Are you still hung up on the hardware vs software argument, where my data did not match what you expected (or even what you were claiming it showed)?

You do realize the following:

DB could not use:

(1) killer moves
(2) hash table moves (except in software, which was a tiny fraction of the total tree).
(3) a hash table in hardware. Hsu had the capability in the chess chips, but did not have time to build a multi-ported memory unit to make it work.
(4) a shared hash between SP nodes.
(5) null-move

Etc. It was a fairly simplistic search built around incredibly fast search speed. I.e., DB did not represent the state-of-the-art with respect to software algorithms, except for the (software only, not in chess chips) singular extension code. So what does comparing Rybka to that mean, exactly? The DT/DB search was a mid-80's search. It was never changed significantly, except for SE. They were trying to tackle this thru blazingly fast hardware speeds. They pretty well succeeded, since in 1997 no other program could touch Kasparov in long tournament-chess games.

If you want to start a discussion like this, at least provide some technical accuracy when talking about deep blue. Your explanation above would get a zero on a test about deep blue.

I also like the "time compression" you use. If I make a statement in 1997, it has to hold for all time, even though time marches on? Extremely shady way of trying to make a point. Certainly in 1997 no program could touch the thing. I don't believe anyone could touch them 7-8 years ago. Maybe even 5 years ago. But look around in 1997 and find how many computers had produced true FIDE GM level ratings by actually playing just GMs in tournaments. One.

rbarreira · Post by **rbarreira** » Mon Sep 13, 2010 10:02 am

Hsu said that Deep Blue's parallel efficiency was about 10%, so those 200m - 1bn nodes per second should be scaled down by a factor of about 10 to get something similar to the effective NPS of a single-threaded search. This was due to the massive parallelism it employed, I guess.
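
A quick sketch of that scaling, assuming the 10% efficiency figure applies uniformly across the quoted 200m - 1bn range (the function name is hypothetical, and the inputs are the numbers from this post rather than measurements):

```python
def effective_nps(raw_nps: float, parallel_efficiency: float) -> float:
    """Scale raw parallel NPS down to a rough single-threaded equivalent."""
    return raw_nps * parallel_efficiency


# Figures quoted in the post: 200 million to 1 billion raw NPS, ~10% efficiency.
low = effective_nps(200_000_000, 0.10)
high = effective_nps(1_000_000_000, 0.10)

print(f"effective NPS: roughly {low / 1e6:.0f}M to {high / 1e6:.0f}M")
```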

bob · Post by **bob** » Mon Sep 13, 2010 3:44 pm

And the other issues I brought up. It was a very sub-optimal solution, running at super-optimal speeds (for the day). No large shared hash table is a major issue. Then there are the limited capabilities for normal practices in the hardware chips (no hashing at all, so no hash best move, no killers, no SE, just a very fast vanilla A-B search), and the two-level hardware (30 SP processors connected via message-passing rather than shared memory, each SP node having 16 chess processors that could not communicate with each other and which had very primitive (but _very_ fast) searches).

And it was _very_ strong, dominating computer chess from 1987 to 1997. I recall them losing one game due to a power outage, and one game due to comm problems (Fritz in Hong Kong) which is an incredible streak over 11 years.

uaf · Post by **uaf** » Mon Sep 13, 2010 4:57 pm

bob wrote: I recall them losing one game due to a power outage, and one game due to comm problems (Fritz in Hong Kong) which is an incredible streak over 11 years.

And IIRC it was Deep Thought II that lost to Fritz and not Deep Blue as always advertised by Chessbase. Deep Blue was not yet ready.

bob · Post by **bob** » Mon Sep 13, 2010 5:43 pm

uaf wrote:
And IIRC it was Deep Thought II that lost to Fritz and not Deep Blue as always advertised by Chessbase. Deep Blue was not yet ready.

Confusion caused by IBM. That was "deep blue prototype" which Hsu/Campbell had said was "deep blue software running on deep thought hardware". So you are correct. I was lumping them all together. Chiptest first played in 1986 with serious bugs. It won the ACM event in 1987 and every year after that, only losing the two games I mentioned to the best of my recollection, one on time due to a power failure at the Watson center, one in Hong Kong primarily caused by a comm failure...

Don · Post by **Don** » Mon Sep 13, 2010 6:27 pm

rbarreira wrote:Hsu said that Deep Blue's parallel efficiency was about 10%, so those 200m - 1bn nodes per second should be scaled down by a factor of about 10 to get something similar to the effective NPS of a single-threaded search. This was due to the massive parallelism it employed, I guess.

I think I believed that Deep Blue was much better than it actually was. It was impressive in 1997, but it seems that Deep Blue was inferior in every chess-specific way except for raw nodes per second, due to the constraints of hardware.

So using Deep Blue as a reference point is not going to work.

Gerd Isenberg · Post by **Gerd Isenberg** » Mon Sep 13, 2010 6:52 pm

bob wrote:
Confusion caused by IBM. That was "deep blue prototype" which Hsu/Campbell had said was "deep blue software running on deep thought hardware". So you are correct. I was lumping them all together. Chiptest first played in 1986 with serious bugs. It won the ACM event in 1987 and every year after that, only losing the two games I mentioned to the best of my recollection, one on time due to a power failure at the Watson center, one in Hong Kong primarily caused by a comm failure...

Was the game against Mephisto from ACM 1989 the one with the power failure?

Don · Post by **Don** » Mon Sep 13, 2010 7:43 pm

Gerd Isenberg wrote:
Was the game against Mephisto from ACM 1989 the one with the power failure?

Deep Blue was remarkably strong for 1997 but it was far from being unbeatable. It was rare but it suffered draws and losses. I think we can estimate that it was about 4-5 years ahead of the PC programs. By 2002 a lot of very smart people believed that Junior or Fritz would beat it in a match. No point arguing about it because we can never know for sure.

I read somewhere (and I'll try to find it) that if you consider the various incarnations of Deep Blue that actually played in tournaments, and performance rate its total results, it is not particularly impressive because it only indicates something like a 200 ELO superiority over the best - but I think of all the games it lost, a lot of them were due to unfortunate issues, so this is probably far from a fair metric (also considering that so few games were played.) I think in reality it was stronger than this. A crude calculation is that if it took programs 5 years to catch it, you can guesstimate its superiority, and I think that puts it as more like 400 ELO better than anyone else.
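
To give a feel for what a 200 versus 400 ELO edge means per game, here is a small sketch using the standard Elo expected-score formula. The function name is mine, and the 200 and 400 gaps are only the rough guesses from this post, so treat the output as illustration rather than a measurement of Deep Blue.

```python
def expected_score(elo_advantage: float) -> float:
    """Standard Elo expected score (win = 1, draw = 0.5) for the stronger side."""
    return 1.0 / (1.0 + 10.0 ** (-elo_advantage / 400.0))


# Rough gaps mentioned in the post, not measured ratings.
for gap in (200, 400):
    print(f"{gap} ELO edge -> expected score {expected_score(gap):.3f}")
```

Under that formula a 200 point edge is an expected score of roughly 76%, and a 400 point edge is roughly 91%.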

The Deep Blue team was very humble and were a joy to talk with. At the Hong Kong tournament Murray told me that they estimated their winning chances to be right around 50%. That sounds incredible at first unless you do the math. To survive a 5 round tournament with 24 players and have a 50% chance to be the winner you must not only be the best player, but best by a good margin. If their chances were 50%, the chances of the 23 other contestants were divided up among the remaining 50% so that is pretty impressive.
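
Doing the math literally, as a minimal sketch: if the favorite has a 50% chance, the other 23 entrants split the remaining 50%. The even split among rivals is my simplifying assumption (in practice a few strong rivals would hold most of it), and the function name is made up.

```python
def average_rival_chance(favorite_chance: float, num_players: int) -> float:
    """Average win chance per rival if the remainder were split evenly."""
    return (1.0 - favorite_chance) / (num_players - 1)


FAVORITE_CHANCE = 0.5  # the team's own estimate, as quoted in the post
NUM_PLAYERS = 24

rival = average_rival_chance(FAVORITE_CHANCE, NUM_PLAYERS)
edge = FAVORITE_CHANCE / rival  # favorite's chance relative to an average rival

print(f"average rival chance: {rival:.1%}")
print(f"favorite is about {edge:.0f} times likelier to win than an average rival")
```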

But this tells you that even the Deep Blue team expected to lose games relatively frequently, just much less frequently than anyone else! When it's all soberly analyzed and all the hype removed, Deep Blue stands out as the most outstanding program of its day, but no more. (I am not sure if some early programs stand out even more, such as Belle or, even before that, the Chess 4.7 program; they were also seemingly unbeatable, so this deserves a fact check.)

bob · Post by **bob** » Mon Sep 13, 2010 7:48 pm

Gerd Isenberg wrote:
Was the game against Mephisto from ACM 1989 the one with the power failure?

I actually do not remember. Did they play them twice? It seems that whoever beat them during the power failure was actually paired against them again in the last round. Lots of grumbling by that particular programmer, since they had to play them twice. I will plead ignorance as to who it was. Mike Valvo was the TD, however. I will try to scrounge thru my ACM folder (hard-copy stuff) to see if I can find a reference somewhere in one of the old tournament booklets that were printed each year.

The one thing I do remember was that DB was up and running using the emergency power at the Watson, but their PBX (private phone system) was dead due to no power so they could not contact the machine. They waited for over an hour before everyone agreed that the machine was not going to be accessible (this was a night game) until the next morning most likely. Remarkable thing was they _still_ won the tournament clearly, no tie-breaks.
