CCRL live lists with 100 Elo reduction

Mincho Georgiev
Posts: 454
Joined: Sat Apr 04, 2009 6:44 pm
Location: Bulgaria

Re: CCRL live lists with 100 Elo reduction

Post by Mincho Georgiev »

Adam Hair wrote:
Mincho Georgiev wrote:
Adam Hair wrote:To anyone who reads this:

What would be your reaction if we purposely disconnected the CCRL from any comparison to human ratings?

What if we make the rating for the top engine equal 0 Elo, so that the ratings are such that the rating of each engine directly indicates how many Elo it is behind the leading program?
My idea is probably a funny one, but it's what would bring me some comfort with any rating list. There are enough strong FIDE players here who can estimate the strength of one single program. If anyone agrees to do that, use that program as a base.
Your idea would at least include input from the top chess practitioners.
Why is that? Nobody said that the evaluated engine should be a 3000+ one; rather, it should be something closer to the category of the player. Miguel, for example, is an FM (correct me if I'm wrong), so it would be pointless for him to play against a 3100 Elo engine. Of course I'm far from thinking that this could end up with an exact estimate, which is just not possible, but it would give a number (±100 Elo) that can help people get a better idea of the strength of the engines they are interested in. Plus, it's one more thing that would make the rating list unique, not that CCRL necessarily needs it, but ...
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: CCRL live lists with 100 Elo reduction

Post by Adam Hair »

Mincho Georgiev wrote:
Adam Hair wrote:
Mincho Georgiev wrote:
Adam Hair wrote:To anyone who reads this:

What would be your reaction if we purposely disconnected the CCRL from any comparison to human ratings?

What if we make the rating for the top engine equal 0 Elo, so that the ratings are such that the rating of each engine directly indicates how many Elo it is behind the leading program?
My idea is probably a funny one, but it's what would bring me some comfort with any rating list. There are enough strong FIDE players here who can estimate the strength of one single program. If anyone agrees to do that, use that program as a base.
Your idea would at least include input from the top chess practitioners.
Why is that? Nobody said that the evaluated engine should be a 3000+ one; rather, it should be something closer to the category of the player. Miguel, for example, is an FM (correct me if I'm wrong), so it would be pointless for him to play against a 3100 Elo engine. Of course I'm far from thinking that this could end up with an exact estimate, which is just not possible, but it would give a number (±100 Elo) that can help people get a better idea of the strength of the engines they are interested in. Plus, it's one more thing that would make the rating list unique, not that CCRL necessarily needs it, but ...
I did not have any engines in mind when I responded. When you wrote "strong FIDE players", I assumed you meant GMs.
mloftus955
Posts: 92
Joined: Tue Jun 08, 2010 5:36 pm
Location: Westfield, IN
Full name: mark loftus

Re: CCRL live lists with 100 Elo reduction

Post by mloftus955 »

IWB wrote:
CRoberson wrote:That is the fundamental problem with conformity in these situations. Do you do what is correct or what the majority can deal with?
That depends on the importance of the issue. In this case it is neither of what you implied as it makes no difference to have one or the other solution - except the amount of work one has with it ... !

Bye
Ingo


To me, as someone who has played the game, I want to see the ratings as closely in tune as possible with the rating lists for tournament play, like FIDE or USCF. This is what the computer rating lists were attempting to do from the beginning by including masters, etc.

I think it's a good idea, not mere conforming...

Mark
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: CCRL live lists with 100 Elo reduction

Post by IWB »

mloftus955 wrote:
To me, as someone who has played the game, I want to see the ratings as closely in tune as possible with the rating lists for tournament play, like FIDE or USCF. This is what the computer rating lists were attempting to do from the beginning by including masters, etc.

I think it's a good idea, not mere conforming...

Mark
I fully agree, and I would love to add some masters, grandmasters or world champions. The only problem is that they don't want to play. So all, really ALL, lists nowadays are guessing. There is no valid data about how an engine would perform vs humans (and especially not in a blind test - THAT would be nice. Me with a little earplug getting the answers from a comp - where will I be within 1, 2, ... 5 years 8-) )

So whether the CCRL, the CEGT, IPON or someone else is right is a matter of personal taste. All I am trying to say, and you basically confirmed it, is that people are more confused by a 0 and negative values than by the list makers making their best guess ...

Bye
Ingo
pichy
Posts: 2564
Joined: Thu Mar 09, 2006 3:04 am

Re: CCRL live lists with 100 Elo reduction

Post by pichy »

Graham Banks wrote:The latest CCRL Rating Lists and Statistics are available for viewing from the following links:
http://computerchess.org.uk/ccrl/4040.live/ (40/40)
http://www.computerchess.org.uk/ccrl/404.live/ (40/4)
http://www.computerchess.org.uk/ccrl/404FRC/ (FRC 40/4)

Please note that the three lists are often updated separately to each other.
The links given in each update report will give you the currently up to date lists.

The 100 Elo reduction in the main website lists doesn't show up yet, but will eventually.
I believe that a 150 Elo reduction is more reasonable than 100. It is only my feeling, but it would bring the list closer to the human FIDE rating system, since certain programs rated between 2325 and 2350 are not even close to a human FIDE Master rated 2200 :wink:
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: CCRL live lists with 100 Elo reduction

Post by Sven »

pichy wrote:
Graham Banks wrote:The latest CCRL Rating Lists and Statistics are available for viewing from the following links:
http://computerchess.org.uk/ccrl/4040.live/ (40/40)
http://www.computerchess.org.uk/ccrl/404.live/ (40/4)
http://www.computerchess.org.uk/ccrl/404FRC/ (FRC 40/4)

Please note that the three lists are often updated separately to each other.
The links given in each update report will give you the currently up to date lists.

The 100 Elo reduction in the main website lists doesn't show up yet, but will eventually.
I believe that a 150 Elo reduction is more reasonable than 100. It is only my feeling, but it would bring the list closer to the human FIDE rating system, since certain programs rated between 2325 and 2350 are not even close to a human FIDE Master rated 2200 :wink:
I see no point at all in even attempting to compare engine ratings to human player ratings, so I propose to refrain from any rescaling and keep the original scaling: it is as arbitrary as any other scaling. Trying to achieve some common rating level serves no useful purpose IMO. Engine and human ratings are two different animals for various reasons. Only a huge pool of games including a sufficient number of engine-human, engine-engine and human-human games could yield something like remotely comparable ratings, but even then some hardly solvable problems would remain:

1) One of those problems is the usually small number of games of human players compared to that of engines, leading to less significant ratings of humans within such a combined rating pool.

2) Another one, perhaps the most important one, is the fact that the strength of human players evolves over time, in contrast to engines, which have to be assumed to have "constant" strength given one specific program version and one specific environment of computer system + settings. Two completely different rating approaches are definitely needed for that (and are applied in reality): one for humans, where the most recent games get higher weights than older games, and one for engines, where all games are put together and rated at once. Now which of these two incompatible rating systems should be applied to "engine vs. human" games? Applying the "human" rating system to engines as well would require a huge number of "engine vs. human" games to minimize the possible bias introduced by overweighting the most recent (engine) game results. Applying the "engine" rating system to humans would require either accepting some permanent inaccuracies for players with unstable playing strength, or creating some artificial "human player version" concept, like "Anand (2012)", which I doubt anyone would accept as useful, and which again would suffer heavily from an insufficient number of games to obtain statistical significance.

Chess ratings, whether for humans or for engines, are never "absolute". Their meaning is always relative to the set of players included in the overall rating pool, the set of games between those players, and of course the whole set of playing conditions. Two separate rating pools can never be compared. We should always refrain from trying to give more meaning to a FIDE Elo rating of, say, 2750, than it actually has, and will ever have. An Elo rating is not a "physical property" of a chess player; it is a relative statement within a specified and limited context.

The only statements we can, and should, derive from a given rating list are statements like "A is best in rating pool X" or "B is better than C in rating pool Y", or maybe also "D has a winning probability of P % [+/- error margin M] against E in rating pool Z".

Sven
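
Editor's note: the last kind of statement above ("D has a winning probability of P % against E") can be illustrated with the standard logistic Elo curve. This is only a minimal sketch: the two ratings are made-up numbers, the result is an expected score (draws count as half a point) rather than a pure win probability, and an individual rating list may use a different model. It also shows why a uniform 100 Elo reduction is arbitrary in the sense described above: predictions inside the pool depend only on rating differences.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B (win = 1, draw = 0.5) under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings of two players D and E in the same pool (illustration only).
d, e = 2850.0, 2750.0

before = expected_score(d, e)
# Shift every rating in the pool by -100 Elo: the rating difference is unchanged,
# so the predicted score is unchanged as well.
after = expected_score(d - 100.0, e - 100.0)

print(f"{before:.3f} {after:.3f}")
```

Only the 100-point gap between D and E enters the formula, so any constant added to or subtracted from the whole list leaves the predictions untouched.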
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: CCRL live lists with 100 Elo reduction

Post by Sedat Canbaz »

I see the Elo engine vs human discussions are still active :)

Actually, there is quite enough chess data to properly measure the Elo strength of chess engines against humans

So once more I will mention some results of games which have already been played:



1) IV Magistral República Argentina 2001

http://www.rebel.nl/resu.htm

                                        1 2 3 4 5 6 7 8 9 0 1 2
1   COMP Chess Tiger        2632  +156  * ½ ½ 1 1 ½ 1 1 1 1 1 1   9.5/11
2   Slipak,Sergio      IM   2448  +159  ½ * ½ ½ 1 ½ ½ 1 ½ 1 1 ½   7.5/11
3   Valerga,Diego      IM   2468  +138  ½ ½ * ½ ½ 1 1 ½ ½ 1 ½ 1   7.5/11
4   Ricardi,Pablo      GM   2554   -27  0 ½ ½ * ½ ½ 1 1 ½ ½ ½ 1   6.5/11
5   Hoffman,Alejandro  GM   2453   +82  0 0 ½ ½ * 1 ½ ½ 1 1 ½ 1   6.5/11
6   Limp,Eduardo       IM   2465   -30  ½ ½ 0 ½ 0 * 0 ½ 1 0 1 1   5.0/11
7   Scarella,Enrique   FM   2361   +82  0 ½ 0 0 ½ 1 * ½ ½ 0 1 1   5.0/11
8   Panno,Oscar        GM   2471   -37  0 0 ½ 0 ½ ½ ½ * ½ 1 1 ½   5.0/11
9   Andres,Miguel      IM   2382    -8  0 ½ ½ ½ 0 0 ½ ½ * ½ ½ ½   4.0/11
10  Rodriguez,Andrés   GM   2500  -173  0 0 0 ½ 0 1 1 0 ½ * 0 ½   3.5/11
11  Dorin,Mauricio     IM   2410  -115  0 0 ½ ½ ½ 0 0 0 ½ 1 * 0   3.0/11
12  Matsuura,Everaldo  IM   2467  -177  0 ½ 0 0 0 0 0 ½ ½ ½ 1 *   3.0/11

2) Mercosur Cup 2009 - even Pocket Hiarcs performed at 2938 Elo:
http://www.hiarcs.com/Games/Mercosur2009/mercosur09.htm

3) Another example: even 10-13 years ago, the engines had started to play at the same level as top grandmasters

Image

Note: in those years most of the games were played at slow time controls; in blitz, for example, the engines would have performed much stronger than the grandmasters


4) Rybka, even playing a full pawn down (on slow hardware, without a strong book), performed stronger than grandmasters
Image


In other words:
- In the 1990s, GMs were stronger than chess engines (in those years the hardware was very slow and the engines were much weaker)
- In the early 2000s, GMs were approx. equal to chess engines
- In 2003-2006, chess engines performed approx. 190 Elo better than GMs
- Now, in 2012, we have much stronger MP engines (Houdini, Rybka, Critter, Ivanhoe, Fire, Stockfish...) and much faster hardware than in the past

And last: I strongly believe the top MP engines (using a strong book and the latest fast 6-core i7 machine) should be rated around 3300-3400 Elo


Hope this helps,


Best,
Sedat
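
Editor's note: as a rough cross-check of the round-robin crosstable in item 1) of the post above, a performance rating can be estimated from the opponents' ratings and the score. The sketch below assumes that the first numeric column of the crosstable holds each player's rating, and it uses the plain logistic Elo formula. Official calculations may use a lookup table or a different method, so published figures can differ from this estimate by some points.

```python
import math


def performance_rating(opponent_ratings, score):
    """Average opponent rating plus the Elo gap implied by the score fraction."""
    n = len(opponent_ratings)
    p = score / n
    if not 0.0 < p < 1.0:
        raise ValueError("performance rating is undefined for 0% or 100% scores")
    average = sum(opponent_ratings) / n
    # Invert the logistic Elo curve: p = 1 / (1 + 10 ** (-gap / 400)).
    return average + 400.0 * math.log10(p / (1.0 - p))


# Ratings of the eleven human opponents, copied from the crosstable.
opponents = [2448, 2468, 2554, 2453, 2465, 2361, 2471, 2382, 2500, 2410, 2467]

# Chess Tiger scored 9.5 points out of 11 games.
tiger = performance_rating(opponents, 9.5)
print(round(tiger))
```

With these inputs the estimate comes out in the 2770s, so the engine performed roughly 320 Elo above the average rating of its human opponents in this one event.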
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: CCRL live lists with 100 Elo reduction

Post by Sedat Canbaz »

Btw, my 2 cents more on this issue:

There was also another duel in 2003: Garry Kasparov vs X3D Fritz (the match ended 2-2)
http://en.wikipedia.org/wiki/X3D_Fritz
Note that the hardware of X3D Fritz (Deep Fritz 8) was four Intel Pentium 4 Xeon CPUs at 2.8 GHz

Deep Fritz8 is the new multi-processor version of Fritz, the same chess engine which drew
a match with Garry Kasparov last month ....

http://www.chessbase.com/newsdetail.asp?newsid=1375


According to CEGT, Deep Fritz 12 is 240 Elo stronger than Deep Fritz 8
http://www.husvankempen.de/nunn/40_40%2 ... liste.html

no   Program              Elo   +   -   Games  Score  Av.Op.  Draws
117  Deep Fritz 12 2CPU   2823  14  14  1199   45.3%  2856    48.6%
489  Deep Fritz 8 2CPU    2583  11  11  2804   53.7%  2557    31.2%

------------------------------------------------------------------------------------------------------------


Garry Kasparov vs Deep Junior - 2003 (ended in a 3-3 draw)

Note that Deep Junior 2003 (the version that played against Kasparov) is expected to have the same strength as Deep Junior 8

According to SSDF, Deep Junior 12 is 214 Elo stronger than Deep Junior 8
http://ssdf.bosjo.net/list.htm

                                          Rating   +    -   Games  Won  Av.opp
10  Deep Junior 12 x64 2GB Q6600 2,4 GHz  3078    27  -25   778    68%  2948
7   Deep Junior 8 2GB Q6600 2,4 GHz       2864    26  -27   745    33%  2984

-------------------------------------------------------------------------------------------------------------


And now let's concentrate on the SCCT Rating List:
http://www.sedatcanbaz.com/chess/scct-rating/

Rank Name                       Elo    +    -   games score oppo. draws 
  14 Deep Fritz 12 w32 6c       3138   23   23   554   41%  3194   41% 
  17 Deep Junior 13 x64 6c      3104   24   24   543   36%  3200   34% 

And I have a question for all who don't agree with the current SCCT Elo calculations:

- Wouldn't it be a big mistake to rate the above engines (Deep Junior 13 and Deep Fritz 12 on an i7 980X @4.0GHz 6 core) below 3000 Elo?

One more little note: the i7 980X @4.0GHz 6 core is at least 150 Elo stronger than the Intel Pentium 4 Xeon CPUs at 2.8 GHz


Greetings,
Sedat
Piotr Cichy
Posts: 75
Joined: Sun Jul 30, 2006 11:13 pm
Location: Kalisz, Poland

Re: CCRL live lists with 100 Elo reduction

Post by Piotr Cichy »

I don't understand those Elo reductions.

CEGT ratings were about 50 Elo lower than CCRL ratings. This difference was not too big.

Now CCRL ratings are 100 Elo lower and CEGT ratings 200 Elo lower, so the difference between CCRL and CEGT has increased to 150 Elo.

IMO it would be better to select ONE well-known engine with a large number of games, give it a constant Elo and keep it the same on all rating lists.
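
Editor's note: the suggestion above, fixing one well-known engine at a constant Elo on every list, amounts to adding a single offset to each list. The sketch below is only an illustration: the engine names, the ratings and the anchor value of 2800 are all made up, and `anchor_list` is a hypothetical helper, not something any rating list is known to use.

```python
def anchor_list(ratings: dict[str, float], anchor: str, anchor_elo: float) -> dict[str, float]:
    """Shift every rating by the same offset so that `anchor` lands on `anchor_elo`."""
    offset = anchor_elo - ratings[anchor]
    return {name: elo + offset for name, elo in ratings.items()}


# Made-up numbers for two lists; "Reference" is the commonly agreed anchor engine.
list_a = {"Engine X": 3100.0, "Engine Y": 3000.0, "Reference": 2800.0}
list_b = {"Engine X": 2950.0, "Engine Y": 2860.0, "Reference": 2650.0}

a = anchor_list(list_a, "Reference", 2800.0)
b = anchor_list(list_b, "Reference", 2800.0)

print(a["Engine X"], b["Engine X"])
print(a["Engine Y"], b["Engine Y"])
```

The gaps inside each list are preserved and the anchor engine agrees everywhere, but the other engines can still differ between lists (Engine Y here), because hardware, time controls and opponent pools differ from list to list.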
pichy
Posts: 2564
Joined: Thu Mar 09, 2006 3:04 am

Re: CCRL live lists with 100 Elo reduction

Post by pichy »

Piotr Cichy wrote:I don't understand those Elo reductions.

CEGT ratings were about 50 Elo lower than CCRL ratings. This difference was not too big.

Now CCRL ratings are 100 Elo lower and CEGT ratings 200 Elo lower, so the difference between CCRL and CEGT has increased to 150 Elo.

IMO it would be better to select ONE well-known engine with a large number of games, give it a constant Elo and keep it the same on all rating lists.
My suggestion would be for all the different rating agencies to get together and agree on constant ratings with one starting point; otherwise, how do we know which agency's rating system reflects the most accurate ratings? :wink: