Deep Blue vs Rybka

Discussion of chess software programming and technical issues.

Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Deep Blue vs Rybka

Post by Milos »

Dann Corbit wrote:1.5 GHz FPGA:
http://www.achronix.com/
Again this is a special case of very targeted FPGAs that you could not use to program chess things like a move generator (or you could do it, but certainly not running at 1.5 GHz).
Wrong again. Deep Blue was an ASIC. The article you cited is much later and describes the things from Deep Blue that they implemented using FPGAs.
Maybe it would help if you'd learn basic things like what's the difference between ASIC and FPGA.

Btw. I know the author of the article personally ;).
Last edited by Milos on Tue Sep 14, 2010 7:26 am, edited 1 time in total.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Deep Blue vs Rybka

Post by Milos »

Btw. for those who would like to know more about Deep Blue, including both software and hardware, there is a great article:
http://ieeexplore.ieee.org/xpls/abs_all ... ber=625299

Unfortunately you need access to IEEE/ACM to read it in full. I could upload it somewhere, but I'm not sure whether that would be a violation of the copyright.
Uri Blass
Posts: 10280
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Deep Blue vs Rybka

Post by Uri Blass »

bob wrote:
Uri Blass wrote:
bob wrote:
Don wrote:
Gerd Isenberg wrote:
bob wrote:
uaf wrote:
bob wrote: I recall them losing one game due to a power outage, and one game due to comm problems (Fritz in Hong Kong) which is an incredible streak over 11 years.
And IIRC it was Deep Thought II that lost to Fritz and not Deep Blue as always advertised by Chessbase. Deep Blue was not yet ready.
Confusion caused by IBM. That was "deep blue prototype" which Hsu/Campbell had said was "deep blue software running on deep thought hardware". So you are correct. I was lumping them all together. Chiptest first played in 1986 with serious bugs. It won the ACM event in 1987 and every year after that, only losing the two games I mentioned to the best of my recollection, one on time due to a power failure at the Watson center, one in Hong Kong primarily caused by a comm failure...
Was the game against Mephisto from ACM 1989 that with the power failure?
Deep Blue was remarkably strong for 1997 but it was far from being unbeatable. It was rare but it suffered draws and losses. I think we can estimate that it was about 4-5 years ahead of the PC programs. By 2002 a lot of very smart people believed that Junior or Fritz would beat it in a match. No point arguing about it because we can never know for sure.

I read somewhere (and I'll try to find it) that if you consider the various incarnations of Deep Blue that actually played in tournaments, and performance-rate their total results, it is not particularly impressive because it only indicates something like a 200 ELO superiority over the best - but I think that of all the games it lost, a lot were due to unfortunate issues, so this is probably far from a fair metric (also considering that so few games were played.) I think in reality it was stronger than this. A crude calculation is that if it took programs 5 years to catch it, you can guesstimate its superiority, and I think that puts it at more like 400 ELO better than anyone else.

The Deep Blue team was very humble and were a joy to talk with. At the Hong Kong tournament Murray told me that they estimated their winning chances to be right around 50%. That sounds incredible at first unless you do the math. To survive a 5 round tournament with 24 players and have a 50% chance to be the winner you must not only be the best player, but best by a good margin. If their chances were 50%, the chances of the 23 other contestants were divided up among the remaining 50% so that is pretty impressive.

But this tells you that even the Deep Blue team expected to lose games relatively frequently, just much less frequently than anyone else! When it's all soberly analyzed and all the hype removed, Deep Blue stands out as the most outstanding program of its day, but no more. (I am not sure if some early programs stand out even more, such as Belle or, even before that, the Chess 4.7 program; they were also seemingly unbeatable, so this deserves a fact check.)
If you go back to 1997 when they won the Fredkin prize, they had a FIDE equivalent rating of 2650+. I don't remember the exact number but it was _well_ beyond the Fredkin prize requirement... What micro was close to that in long games? A couple of micros had beaten GM players in blitz (Cray Blitz defeated GM players all over the place in the 90's, as a reference). So they were very strong, and based on Deep Thought vs everyone else thru 1994 ACM-sponsored events, they were clearly well "above and beyond."

How far is debatable. But I would not use that 200 number myself since we have no data for Micros playing super-GM players at 40 moves in 2 hours.
We clearly have data about computers who played humans in 120/40 time control

I remember reading that Fritz3 on P90 could get the IM norm in tournament time control games so it is not correct to say that programs did not play long time games.

No micro was close to 2650, but I am sure that micros were at least at the 2400 level at that time, so a 200 elo difference between Deep Thought or the Deep Blue prototype and the best micros of the same time is not illogical.

Uri
When did Fritz play in such tournaments in the 1987-1988 time frame? The DB project had a pretty daunting task to win the Fredkin stage 3 prize. And my dates were wrong.

DB produced that 2650+ rating in 1988. Not 1997. 1997 was for the final stage of the Fredkin prize, beating the world champion in a match.

So, more correctly, do you believe Fritz in 1988 was within 200 points of a program that had just earned a rating of 2600+ playing 24 games against only GM-level competition? IMHO, not a chance in hades.... Most micros were jokes in 1988...
Fritz3 did it in 1994 or 1995, based on my memory; I was not thinking about 1988.

I agree that the gap between Deep Thought and the micros was more than 200 elo in 1988.

I also find it hard to believe the 2650 rating of Deep Thought in 1988, because I clearly remember results worse than 2650 for Deep Thought after 1988.

These worse results include 2 losses against Kasparov in 1989 and a loss against Karpov in 1990 (I expect a 2650 player to score 0.5/2 against Kasparov, and I also remember that the newspaper estimate of Deep Thought's level before the games against Kasparov was 2550 and not 2650).
They also include a tournament in 1991 where Deep Thought got a performance only slightly above 2400 and scored 2.5/7 against GM opponents rated 2480-2560.

I remember Deep Thought had a good tournament in 1988, when it scored 6.5/8 and won first place, but I do not remember other tournaments where Deep Thought got a performance above 2600.

I would like to see the list of the 24 opponents, together with their ratings, the results and the time of the games, to understand the basis for the claim that DT got a performance higher than 2650 in 1988.
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Deep Blue vs Rybka

Post by Dann Corbit »

It is plain that you know more about FPGAs than I do. I am somewhat surprised that they do not obey Moore's law.

Is it possible to use the VHDL for an FPGA to create an ordinary CMOS device (like a general purpose CPU)?
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Deep Blue vs Rybka

Post by Milos »

Dann Corbit wrote:It is plain that you know more about FPGAs than I do. I am somewhat surprised that they do not obey Moore's law.
The quick answer would be the density of the connection networks on an FPGA chip.
On a regular chip most of the area (and also the delay and power penalty) is taken by logic gates, and a smaller percentage by connection networks.
However, with FPGA designs there is a big area (delay/power) overhead due to connectivity (the majority of the design area is used for the connection network, a smaller part for logic gates).
While logic gate area (transistor size) scales quadratically with technology dimensions (e.g. roughly a 2 times reduction when going from the 90 to the 65nm node), wire widths scale only linearly with technology dimensions (e.g. only a 1.4 times reduction when going from 90 to 65nm).
That explains why the improvement in performance between technology nodes is smaller for FPGAs than for ASICs.
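The two scaling figures quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming ideal scaling where gate area follows the square of the feature size and wire width follows it linearly; the function name shrink_factors is made up for illustration, and real processes deviate from this.

```python
def shrink_factors(old_nm: float, new_nm: float) -> tuple[float, float]:
    """Return (gate_area_shrink, wire_width_shrink) for a process node change.

    Assumes ideal scaling: area goes with the square of the feature size,
    wire width goes linearly with it.
    """
    linear = old_nm / new_nm
    return linear ** 2, linear


area_shrink, width_shrink = shrink_factors(90, 65)
print(f"gate area shrinks by about {area_shrink:.2f}x")
print(f"wire width shrinks by about {width_shrink:.2f}x")
```

Under these assumptions the 90 to 65nm step gives roughly 1.9x for gate area and 1.4x for wire width, which matches the "2 times" and "1.4 times" in the post. A design dominated by wiring therefore gains less per node than one dominated by gates.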
Is it possible to use the VHDL for an FPGA to create an ordinary CMOS device (like a general purpose CPU)?
With a sufficient number of logic functions, an FPGA can be used to directly map general purpose CPU cores (very simple ones, of course).
However, their main purpose today is so-called rapid prototyping, where you write your digital design in VHDL, first map it onto an FPGA to perform functional verification, and then map it into the given technology (following the standard design flow) to create an ASIC.
Gian-Carlo Pascutto
Posts: 1243
Joined: Sat Dec 13, 2008 7:00 pm

Re: Deep Blue vs Rybka

Post by Gian-Carlo Pascutto »

Dann Corbit wrote:It is plain that you know more about FPGAs than I do. I am somewhat surprised that they do not obey Moore's law.
Moore's law is about transistor count, not clockspeed. Clockspeed creeps up slowly on FPGAs for the reasons mentioned above, but the amount of logic you can get at a certain pricepoint has increased much more.

You could probably fit a bunch of DB/Hydra style "cores" into a single "cheap" FPGA now, but then you'd still be stuck with the problem of the parallelization loss.
Is it possible to use the VHDL for an FPGA to create an ordinary CMOS device (like a general purpose CPU)?
Sure (MicroBlaze, Nios, Cortex-M0, etc). This will get you a slow CPU at a relatively large logic cost. There are some uses, but performance is not one of them.

Donninger published enough about Hydra that you can get some rough estimates. They had a 50MHz design (Virtex2) running at 9 cycles per node and occupying about 18000 LE, meaning a chip did about 5.5Mnps.

You can then make some assumptions, like that a Virtex-6 would attain 100MHz for the same design, and maybe a Spartan-6 50MHz (those are of course rough guesses). Then go hunt on Xilinx's site and AvNet for the size of FPGA that is most cost efficient for fitting as many of those cores on it as possible (make sure to take a conservative estimate, because you won't be able to "fill" the FPGA over, say, 75% without compromising the clockspeed).
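As a rough sketch of that estimate in Python: the 9 cycles per node, 18000 LE per core, 75% fill limit and the 50/100MHz clocks come from the post; the 150000 LE device is purely hypothetical (not a real part number), the function names are made up, and the sketch ignores that logic elements are not directly comparable across FPGA families.

```python
def nodes_per_second(clock_hz: float, cycles_per_node: int = 9) -> float:
    """Search speed of one core: one node every `cycles_per_node` clock cycles."""
    return clock_hz / cycles_per_node


def cores_that_fit(fpga_le: int, core_le: int = 18_000, max_fill: float = 0.75) -> int:
    """Whole cores that fit when only `max_fill` of the logic may be used."""
    return int(fpga_le * max_fill) // core_le


# Hydra figure from the post: 50 MHz at 9 cycles per node.
hydra_nps = nodes_per_second(50e6)
print(f"Hydra core: {hydra_nps / 1e6:.2f} Mnps")

# Hypothetical newer device: 150,000 LE, guessed 100 MHz for the same design.
HYPOTHETICAL_LE = 150_000
cores = cores_that_fit(HYPOTHETICAL_LE)
total_nps = cores * nodes_per_second(100e6)
print(f"{cores} cores, about {total_nps / 1e6:.1f} Mnps before any parallelization loss")
```

The raw node count is only the starting point: the parallelization loss and the Elo cost of the missing search features still have to be taken off it.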

Now, divide everything by the ELO loss caused by loss of:
- hashtables last few ply
- no killers, history or SEE in move ordering

And compare to what cluster you can build using commodity hardware. I think it's already clear what side of this comparison I'm on.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Deep Blue vs Rybka

Post by bob »

Uri Blass wrote:
Fritz3 did it in 1994 or 1995 based on my memory and I did not think about 1988.

I agree that the gap between Deep thought and the micros was more than 200 elo in 1988

I also find it hard to believe the 2650 rating of Deep thought in 1988 because I remember clearly worse results than 2650 for Deep thought after 1988.
You don't have to "believe" it, you can "confirm" it with a quick google search. DB won the fredkin stage 2 prize in 1988. This required a 2550+ rating over 24 consecutive games against only GM-level players. They finished up somewhere in the 2650 area. I don't _ever_ remember worse than 2650 results for DB unless you pick an event like hong-kong where they lost one game out of 5.


These worse results include 2 losses against kasparov in 1989 and a loss against karpov in 1990(I expect 2650 player to score 0.5/2 against kasparov and I also remember that the estimate in the newspaper for Deep thought's level before the games against kasparov was 2550 and not 2650).
These worse result include a tournament in 1991 when Deep thought got performance only slightly more than 2400 and scored 2.5/7 against GM opponents with rating 2480-2560.

I remember deep thought had a good tournament when it scored 6.5/8 and won first place in 1988 but I do not remember other tournaments when Deep thought got performance above 2600

I would like to see list of 24 opponents together with their rating and results and time of the game to understand the basis for the claim that DT got performance that is higher than 2650 in 1988.
Look up the Fredkin prize results. It was discussed at length back then and was quite convincing. No micro 5 years later could approach that. Maybe by 2000 it was barely becoming possible...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Deep Blue vs Rybka

Post by bob »

Milos wrote:
mhull wrote:Will a "cluster crafty" be able to reach those depths any time soon?
What's the point when "cluster crafty" at today's Bob's Uni cluster would fell short of even SF1.7 on i7, not to mention Rybka 4 or Ivanhoe.
Maybe if you increased current cluster node count tenfold you would get a competitive match.
Current tests show Crafty is 200 below stockfish. Our clusters have a total of over 750 nodes. A doubling is 70 Elo. Do you not think that, even _conservatively_, 750 nodes would give 8x the performance? Worst case? Work on that math a bit and think before posting. :)
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Deep Blue vs Rybka

Post by bob »

Milos wrote:
Dann Corbit wrote:Today, we could build the very same FPGAs and get a much higher clock rate. In addition, we could do a recompile of Ivanhoe or some such source code and get a branching factor of 2.

So without much effort, I guess that the Deep Blue team today could get +1000 Elo or so.
That an incredible BS.
FPGAs are slooow. Implementing branching in them is even slower. This is a same kind of statement as saying we could in today's (general purpose) FPGA build a DSP faster than state-of-the-art DSPs from 1997.

But ok, not knowing much about hardware can be an excuse but saying that with today's technology you could build a machine with 3800 elo is just ridiculous (I assume you meant 1000+ elo from original DB, since 1000+ elo from today's state-of-the-art would mean 4300 which we will not see in our lives).
For 500 elo stronger from current best existing software on i7 you would need at least 5000 times more nodes than i7, meaning 40000 nodes hardware with the same software efficiency as software on i7.
And that's nothing but pure SF.
What is the basis for your mathematical model? 500 Elo requires 500 / 70 doublings of speed. Roughly 7 doublings, or 128 times faster. DB hardware could provide that today with little trouble, re-implemented in a much smaller die size. Or even on an FPGA and then replicated 480 times as they did for the 1997 box.
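The arithmetic behind the "Elo per doubling" rule of thumb can be sketched as follows. The 70 Elo per doubling is the figure used in this post and is itself disputed in the thread, so it is left as a parameter; the function names are made up for illustration, and the model assumes every doubling of speed is worth the same amount at any scale.

```python
import math

ELO_PER_DOUBLING = 70  # rule of thumb from the post, not a measured constant


def speedup_for_elo(elo_gain: float, elo_per_doubling: float = ELO_PER_DOUBLING) -> float:
    """Speed multiplier needed for a given Elo gain under a constant-gain model."""
    return 2 ** (elo_gain / elo_per_doubling)


def elo_for_speedup(speedup: float, elo_per_doubling: float = ELO_PER_DOUBLING) -> float:
    """Elo gain expected from a given speed multiplier under the same model."""
    return elo_per_doubling * math.log2(speedup)


print(f"500 Elo needs {500 / ELO_PER_DOUBLING:.2f} doublings, "
      f"about {speedup_for_elo(500):.0f}x")
print(f"200 Elo needs about {speedup_for_elo(200):.1f}x")
print(f"8x effective speed is worth about {elo_for_speedup(8):.0f} Elo")
```

Rounding 500 / 70 down to exactly 7 doublings gives the 128x quoted above; without rounding the model asks for somewhat more. Passing a smaller elo_per_doubling shows how strongly the conclusion depends on that one number.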
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Deep Blue vs Rybka

Post by Milos »

bob wrote:Current tests show Crafty is 200 below stockfish. Our clusters have a total of over 750 nodes. A doubling is 70 Elo. Do you not think that even _conservatively_ that 750 codes would give 8x the performance. Worst case? Work on that math a bit and think before posting. :)
I wrote SF on i7; that's 6 real cores, each of them twice the strength of your cluster node.

It's 200 elo in your measurements. All other "official" lists show more than 250. Sorry, but in this case I really don't believe your 200.
You think you can gain 250 elo with 60 times more computing power (6 doublings)??? LOL
You can dream of 70 elo.
Going from 1 to 4 cores Crafty 23.2 gains 95 elo (CCRL data in 40/40, huuuge error margins; realistically the gain is much smaller).
Going from 2 to 4 cores Crafty 23.0 gains only 22 elo (CCRL data in 40/4, much smaller error margins, more realistic data).
Going from, for example, 256 to 512, Crafty 23.3 would not gain more than 20 elo in the best case.
Be realistic, we are not kids.