Vintage .... Rating List Winboard from June 1999 (16:42)


Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Vintage .... Rating List Winboard from June 1999 (16:42)

Post by Frank Quisinsky »

Hello Carldaman,

Voyager ...
A clone of Crafty from Madame Pompadour, sorry ...
Later the "Lady" wrote that he was only interested in improving Crafty's learn files.

LaPetite and LaGrande came next after Voyager.
All that was missing was a "LaCraftclony" from Switzerland's Dr. Müller!

That was my first bad experience in testing engines, because back then I couldn't see that it was really a Crafty running.

Best
Frank
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: Vintage .... Rating List Winboard from June 1999 (16:42)

Post by carldaman »

Voyager = Crafty clone, I see, kind of disappointing, but maybe not all that surprising - in those days they were cloning the top open source engine, and that was Crafty...
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Vintage .... Rating List Winboard from June 1999 (16:42)

Post by Frank Quisinsky »

Hi Larry,

in the past we thought that older Crafty versions were around 2500 Elo if the strongest commercial programs, like Hiarcs, Junior and Fritz, were playing at around 2650 Elo. Two strong IMs I know, rated around 2400 Elo, said: no, no, Crafty is not more than 2300 Elo. Its endgame play isn't at IM level, and Crafty's middlegame style is too passive, with problems in king safety.

The commercial engines were overrated at that time, and stronger correspondence players thought that the best chess software of 1998-2000 was no stronger than 2200 Elo (compared to correspondence players).

The fact is that in blitz games the best chess software was very dangerous for grandmasters.
That was my opinion.

Let us go further back in computer chess history.

The topic is DOS Rexchess and DOS Socrates, surely programs you like.
Stronger in endgames, not good at tactics, but with a nice positional playing style.

:-)

DOS Rexchess 2.1 / 2.3 on a 486/33 MHz lost 2:8 vs. an IM rated 2425 Elo.
Two years later, DOS Socrates 3 on a Pentium 60 MHz lost 2:8 (same result) vs. the same IM.
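
As a rough cross-check of what a 2:8 result means in rating terms, here is a minimal sketch that converts a match score into a performance rating. It assumes the standard logistic Elo curve; the function name performance_rating is only illustrative, and with just ten games the error bars are very wide.

```python
import math


def performance_rating(opponent_elo: float, points: float, games: int) -> float:
    """Performance rating against a single opponent (logistic Elo curve assumed)."""
    p = points / games
    if not 0.0 < p < 1.0:
        # The logistic formula has no finite answer for 0% or 100% scores.
        raise ValueError("performance rating is undefined for 0% or 100% scores")
    # Rating difference that would make p the expected score.
    return opponent_elo - 400.0 * math.log10(1.0 / p - 1.0)


# 2 points from 10 games against the 2425-rated IM:
# a 20% score is about 240 Elo below the opponent.
print(round(performance_rating(2425, 2, 10)))
```

On this reading both programs performed somewhere around 2180-2190 in those matches, which is a small-sample estimate and not a rating.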

In 1994 or 1995 I organized a chess computer team for a chess open in Düsseldorf (Germany):
Tasc R30, Mephisto Risc 1MB, a PC with Genius 2 and a third chess computer, I believe a Mach III.

The strongest team was the one from Düsseldorf (second league in Germany), and a team from St. Petersburg also took part.
IM Donchenko played there, along with very strong young players from St. Petersburg.
One of the young players, Novikov (2280 Elo), won 23.0 : 18.0 vs. Genius 2 on my 486/33 MHz.
Petrov (around 2325 Elo) did not play many games vs. chess computers but won them all.

St. Petersburg is the twin town of Neuss (near Düsseldorf).
Once a year we hosted the club from St. Petersburg.
We had many nice chess nights with the players from St. Petersburg.

Open-Results:

1. Düsseldorf
2. ChessComputers
3. St. Petersburg
4. ... followed by many other teams

This is one example that the ratings aren't overrated (your theory).

The topic is really not easy!
The spread from good results to bad results is very large for the first chess programs and, before that, for the older chess computers. I think the reason is the endgame. After a handful of games, strong players try to reach the endgame sooner, and there 1:0 for the human is standard.

The players from St. Petersburg liked Fritz 1/2 the most. More tactics = more problems for the humans. Genius was more or less a nice training unit ... bad in middlegames and stronger in endgames.

Best
Frank
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: Vintage .... Rating List Winboard from June 1999 (16:42)

Post by carldaman »

lkaufman wrote: Mon Aug 03, 2020 11:29 pm
Frank Quisinsky wrote: Mon Aug 03, 2020 1:43 pm
Hi there,

I believe this was the last version of the "Winboard rating list", a long time before CEGT or CCRL started!

:-)

The good old times!
Djordje and I contacted a lot of programmers in the background to make Winboard more and more interesting!
With Kai and Christian Koch we played many nice tournaments, and Tim Mann had a hard time
adding all the information about new engines to his site.

I will add the games to my download section later today.
Surely millions of chess people are interested in downloading this material for Stockfish tuning??!

Bob Hyatt was not a big fan of Bionic, if I remember!
That's going too far.

Best
Frank



*****************************************************************************
WB KSQ                  --> 3041 Games <--                     16:42 09.06.99
*****************************************************************************
   Kai Skibbe (Hamburg), Christian Koch (Hamburg), Frank Quisinsky (Trier)
40 moves / 40 minutes, ponder = off, AMD K6-2 333/400 MHz, Celeron 450 MHz !!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
for Fritz 5-32 ELO calculation,                  58.750 : 25 = 2350 (  0 ELO)
*****************************************************************************
01. Zarkov              4.5e-4.5g            2516 ELO   284 Games   USA  2525
02. Crafty              15.18-16.10          2480 ELO   422 Games   USA  2575
03. Comet               A95-B03              2441 ELO   416 Games   GER  2450
03. Phalanx             17-21                2441 ELO   392 Games   TCH  2450
05. Voyager             2.29-5.03            2430 ELO   326 Games   SUI  2425
06. Nimzo               2000                 2425 ELO   190 Games   AUT  2425
07. Bionic Impakt       4.01                 2424 ELO   190 Games   BEL  2425
08. Patzer              2.99zp-3.0           2409 ELO   424 Games   GER  2400
09. Gromit              2.11x-2.16           2400 ELO   293 Games   GER  2400
10. AnMon               4.09-4.22            2392 ELO   232 Games   FRA  2400
11. ZChess              1.2                  2374 ELO   180 Games   FRA  2375
12. Francesca           0.63-0.68c           2373 ELO   242 Games   ENG  2375
13. The Crazy Bishop    37-43                2358 ELO   358 Games   FRA  2350
14. Little Goliath      1.05-1.41a           2350 ELO   316 Games   GER  2350
15. Bringer             1.2-1.4      !PLAY!  2340 ELO    81 Games   GER  2325
16. Arasan              5.1-5.1a     !NEW!   2324 ELO   170 Games   USA  2325
17. Ant                 3.42-3.61            2281 ELO   186 Games   NDL  2275
18. LambChop            6.9-7.1              2270 ELO   180 Games   ZEA  2275
19. Stobor              B.32-B.56            2256 ELO   218 Games   USA  2250
20. Dragon              3.11         !NEW!   2199 ELO   172 Games   FRA  2200
21. ExChess             2.46-2.51            2187 ELO   250 Games   USA  2175
22. La Dame Blanche     2.0-2.0c     !NEW!   2122 ELO   120 Games   FRA  2125
*****************************************************************************
--. Bionic Impakt       4.11                 2361 ELO   110 Games   BEL  2375
--. Gromit              2.0-2.1              2288 ELO   106 Games   GER  2300
--. ZChess              0.92-1.0             2288 ELO   224 Games   FRA  2300
*****************************************************************************
Back then the engines were evenly matched with strong human players, so I imagine that the level of the rating list was pretty accurate in human terms, assuming some effort was made to make the level of the list realistic. Crafty 15/16 is at 2480 on what we would now consider pathetic hardware. But on the current CCRL 40/15 list, the lowest Crafty, 20.11, is only at 2502 on vastly superior hardware. If the old rating was accurate in human terms, a very much improved Crafty on maybe ten times as fast hardware should easily be over 2700 in human terms. But I think you believe, as do I, that engines with ratings like 2500 or so on the current CCRL list are actually overrated vs. humans. So what is the explanation? Could it really be that many modern engines which would beat the old ones easily are actually weaker vs. humans? If so this really shakes confidence in engine vs engine rating lists. Or have ratings of humans actually deflated by hundreds of elo in two decades? Or something else?
We shouldn't just consider the results of humans who have the right expertise to excel against engines. There are also other players, who may otherwise be of similar strength to the anti-computer experts in OTB play, who do not fare so well when facing an engine opponent. I know because I'm more an example of the latter. Playing only humans, I have an online rapid rating of over 2100 in a couple of places, but I usually struggle against engines rated 1900-2100 CCRL. One main reason for that is that I don't really like to adapt my play; I enjoy playing as if paired with humans and getting into messy tactical skirmishes. Where humans would fall for various tricks, or make unprovoked mistakes, the engines are usually strong tactically, and all it takes is a couple of bad moves to lose the game. Another reason is a lack of motivation to change the way I play in order to score well, not to mention distractions and poor concentration with so many other things going on. Seeing Dr. Deeb's success got me a little better motivated, and I was able to defeat Monarch 1.7, rated 2050 on CCRL 40/20, for the first time. I made sure I just stuck to positional play and it was fairly easy to win. OK, I'm not in the same league as Dr. Deeb, but you get the point.

So, the bottom line is that a lot depends not just on the rating/strength of the players, but also their individual style of play and the ability to adapt, and how well they match up with machine opponents. Mileage can really vary here.
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Vintage .... Rating List Winboard from June 1999 (16:42)

Post by Frank Quisinsky »

Hi Larry,

we had the same situation in the Winboard days.
None of the Winboard engines were very strong in endgames on the AMD K6-2 or AMD K6-3.

The first program that was really strong in endgames was Shredder!

Stefan Meyer-Kahlen also developed Shredder 3.0 for Winboard (a secret mission).
I tested WB Shredder 3 vs. the older Crafty, Nimzo and Zarkov versions.
None of the strong WB engines had a chance in endgames.

You are clearly the stronger player, Larry, but in my humble opinion a single Elo rating from the human point of view is not possible for chess computers or for the first PC programs.

Suppose you have a program that plays at ...
1900 Elo after the opening book moves
2450-2550 Elo in the early middlegame
2350-2450 Elo in the late middlegame
2100 Elo in the transition to the endgame
1700 Elo in the endgame

That is a very big problem.

We have the same problem today with the strongest chess software.

Stockfish:
2700 Elo after the opening book moves
2900 Elo in the early middlegame (the only chance for the strongest players to draw)
3300 Elo in the late middlegame
3700 Elo in the transition to the endgame
3200 Elo in the endgame

So what rating should we give such an engine?

Best
Frank
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Vintage .... Rating List Winboard from June 1999 (16:42)

Post by Frank Quisinsky »

disappointing?!
disappointing??!

It was a waste of time to test it, and I was the biggest idiot for doing so.

Today I try to find out more about engines with statistics.
I have developed my own method, and it helps in 85% of cases.

Often I am not sure whether it's a clone or not.
Engines that produce the same style are not necessarily clones.

Yes, I think I have a persecution complex / mania here.
Bad experiences with three _?programmers?_ in the past.

But I have a long list of programs I like a lot, wonderful engines that give me a lot of fun.

Wasp is my main topic!
Just a fantastic program ... today I played three games on the DGT Pi vs. Wasp set to 2200 Elo. I lost one after 38 moves and lost another from a drawn position in the endgame, but in the second game ... I was +2 (I checked later) and the game ended in a draw. Wasp is very interesting for me because I can learn when I should play more aggressively with my pawns. Wasp does that to perfection in the middlegame.

:-)
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Vintage .... Rating List Winboard from June 1999 (16:42)

Post by Frank Quisinsky »

Hi Larry,

nice posting, nice to read.
I agree with you 100%.

My problem is that I only look at fast games (comp-comp games, human games).
Endgames are not interesting to me.

It has been like this for many years.
So I think: the opening is clear, the way I should go is clear.
And I start my aggressive play, like the attacking moves of Kasparov / Shirov / Ivanchuk in the early middlegame.

And I lose 80% of the games very quickly (in 3-5% of games I get lucky) ... because one small mistake is enough.
So I think: Frank, you will never be a Kasparov, a Shirov or an Ivanchuk.
But I found a solution to the problem.

Do you know the chess computer "TheKing Performance"?
300 MHz, with the program by Johan de Koning.
In my opinion it is around 2325 Elo, and you can play against 10% of the hardware power, 20% of the hardware power.
That is really nice!

If I lose more than two games in a row vs. 20% hardware power, the solution is very easy.
Wasp on the DGT Pi plays the next three games, and the world is fully OK for me again if TheKing loses every game vs. Wasp at 2500 Elo (2700 Elo is possible on the DGT Pi). My wife often comments ... Frank, I think now it's time for Wasp. I don't want to hear that.

A great chess computer, believe me!!
Have fun with Wasp on the DGT Pi and TheKing Performance!

In reality I have better chances if I don't go looking for a king attack.
But I don't have the calm for such games.

Please try out AnMon 5.12!
The style AnMon produces is very interesting for humans.
With a bottle of French wine ... quite clear!

Best and thanks
Frank

Yes, the style of humans is a topic.
Anti-Computerchess is a topic.
Opening systems and so many other things.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Vintage .... Rating List Winboard from June 1999 (16:42)

Post by lkaufman »

Frank Quisinsky wrote: Tue Aug 04, 2020 1:02 am
Hi Larry,

we had the same situation in the Winboard days.
None of the Winboard engines were very strong in endgames on the AMD K6-2 or AMD K6-3.

The first program that was really strong in endgames was Shredder!

Stefan Meyer-Kahlen also developed Shredder 3.0 for Winboard (a secret mission).
I tested WB Shredder 3 vs. the older Crafty, Nimzo and Zarkov versions.
None of the strong WB engines had a chance in endgames.

You are clearly the stronger player, Larry, but in my humble opinion a single Elo rating from the human point of view is not possible for chess computers or for the first PC programs.

Suppose you have a program that plays at ...
1900 Elo after the opening book moves
2450-2550 Elo in the early middlegame
2350-2450 Elo in the late middlegame
2100 Elo in the transition to the endgame
1700 Elo in the endgame

That is a very big problem.

We have the same problem today with the strongest chess software.

Stockfish:
2700 Elo after the opening book moves
2900 Elo in the early middlegame (the only chance for the strongest players to draw)
3300 Elo in the late middlegame
3700 Elo in the transition to the endgame
3200 Elo in the endgame

So what rating should we give such an engine?

Best
Frank
Since all engines can use the same opening book, which should be presumed to be a top quality one, and since no human has the memory of a computer, we can rate all engines' opening play from book as well above any human, at least 3000. So the human in turn should try to exit book as early as possible without playing a markedly inferior move, assuming that he is playing an equal engine and isn't just aiming for a draw. I know it is pretty much meaningless to speak of a human rating for an engine that is expected to score something like 98% vs. Carlsen; ratings really shouldn't be based on pairings of more than about 200 elo difference. I'm interested in the question of what engine would score 50% against a randomly selected human of 2200, 2300, 2400, 2500, 2600, 2700, and 2800 FIDE, at various time controls. This assumes that there is significant prize money on the line and that the human is well informed about the general strengths and weaknesses of the engines. I don't concern myself with the rating in various parts of the game, just with what rating human is a 50-50 match for a given engine under these conditions. Of course some 2400 humans are much better than others at playing against engines; I'm talking about the average tournament player, who will have some experience with engines but may not be an expert on them. Presumably the rating will be some sort of average of the strengths of the engines in various phases of the games, but we don't need to know this; we only need to measure the elo of the engines in this way. I realize that there aren't many events these days where IM/GM level players are able to play for prize money against (relatively) weak engines, so we have to go by limited information, but at least this clearly defines the question we are trying to answer.
Given this definition, are the CCRL 40/15 ratings of engines within the human range too low, too high, or just right on the specified hardware with a time limit of say 45' + 15" increment for engine and human? I don't know the answer, but I hope I have clarified the question!
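
To make the "50-50 match" definition concrete, here is a small sketch of the usual logistic Elo expectation. The logistic curve and the helper names expected_score and gap_for_score are assumptions for illustration only; FIDE's published conversion table is not identical to this curve.

```python
import math


def expected_score(engine_elo: float, human_elo: float) -> float:
    """Expected score of the engine against the human (logistic Elo curve assumed)."""
    return 1.0 / (1.0 + 10.0 ** ((human_elo - engine_elo) / 400.0))


def gap_for_score(p: float) -> float:
    """Rating gap at which the stronger side is expected to score p (0 < p < 1)."""
    return 400.0 * math.log10(p / (1.0 - p))


# A 50-50 match means equal ratings on the same scale.
print(expected_score(2500, 2500))
# A 200-point favourite is expected to score roughly 76%.
print(round(expected_score(2600, 2400), 2))
# The gap implied by an expected score of 98%.
print(round(gap_for_score(0.98)))
```

On this curve a 98% expected score corresponds to a gap of well over 600 Elo, far outside the roughly 200-point range in which individual pairings still say much about the rating.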
Komodo rules!
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Vintage .... Rating List Winboard from June 1999 (16:42)

Post by Laskos »

lkaufman wrote: Tue Aug 04, 2020 6:08 am
Given this definition, are the CCRL 40/15 ratings of engines within the human range too low, too high, or just right on the specified hardware with a time limit of say 45' + 15" increment for engine and human? I don't know the answer, but I hope I have clarified the question!
I would say that in the 2500-2700 range CCRL 40/15 engine ratings are very comparable to human ratings for 45' + 15'' or, even better, 90' + 30'' tc. For much lower ratings, I went to the extreme of HGM's Micro-Max, rated 1900 on CCRL. I actually played this minimalistic engine. I usually beat it, often because it's deterministic, and I just repeat the game up to a point. So, its human rating should be no more than 1500. I think many engines in this rating range are either mildly buggy or "exploitable" by humans, so their CCRL ratings compared to humans are somewhat inflated. The CEGT list is probably better in the 1900-2300 Elo range. But CEGT is deflated by some 200 Elo points in the 2600-2700 FIDE Elo range.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Vintage .... Rating List Winboard from June 1999 (16:42)

Post by lkaufman »

Laskos wrote: Tue Aug 04, 2020 8:39 am
I would say that in the 2500-2700 range CCRL 40/15 engine ratings are very comparable to human ratings for 45' + 15'' or, even better, 90' + 30'' tc. For much lower ratings, I went to the extreme of HGM's Micro-Max, rated 1900 on CCRL. I actually played this minimalistic engine. I usually beat it, often because it's deterministic, and I just repeat the game up to a point. So, its human rating should be no more than 1500. I think many engines in this rating range are either mildly buggy or "exploitable" by humans, so their CCRL ratings compared to humans are somewhat inflated. The CEGT list is probably better in the 1900-2300 Elo range. But CEGT is deflated by some 200 Elo points in the 2600-2700 FIDE Elo range.
OK, so you are saying that CCRL 2600 (roughly CEGT 2400) = FIDE 2600, and that CCRL 2300 (roughly CEGT 2100) = FIDE 2100. So this means that the engine rating lists substantially UNDERESTIMATE rating differences in human terms!! Both you and I have said the exact opposite many times, I think you estimate something like 300 elo gap on engine list = 200 gap on FIDE; here we have 300 elo gap on engine list = 500 gap on FIDE !! So, am I missing something here? Is this correct, and if so how can we explain such an incredible gap between theory (200 gap) and reality (500 gap)?
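
The arithmetic behind the "300 gap = 500 gap" remark can be sketched as a straight line through the two pairs of numbers quoted in this post (CCRL 2300 = FIDE 2100, CCRL 2600 = FIDE 2600). Treating the relation as linear is purely an assumption for illustration, and ccrl_to_fide is a hypothetical helper, not something any rating list publishes.

```python
def ccrl_to_fide(ccrl: float,
                 anchors: tuple = ((2300.0, 2100.0), (2600.0, 2600.0))) -> float:
    """Linear map from CCRL to FIDE through two anchor points (assumed, not measured)."""
    (x0, y0), (x1, y1) = anchors
    # 500 FIDE points per 300 CCRL points with the default anchors.
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (ccrl - x0)


# Stretch factor implied by the two anchor points.
print(round((2600 - 2100) / (2600 - 2300), 2))
# An engine halfway between the anchors on the CCRL scale.
print(round(ccrl_to_fide(2450)))
```

A slope above 1 means the engine list would compress differences relative to human play, which is the opposite of the usual "300 on the engine list = 200 on FIDE" estimate (a slope of about 0.67).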
Komodo rules!