CCRL live lists with 100 Elo reduction
-
- Posts: 454
- Joined: Sat Apr 04, 2009 6:44 pm
- Location: Bulgaria
Re: CCRL live lists with 100 Elo reduction
Adam Hair wrote:
To anyone who reads this:
What would be your reaction if we purposely disconnected the CCRL from any comparison to human ratings?
What if we make the rating for the top engine equal 0 Elo, so that the rating of each engine directly indicates how many Elo it is behind the leading program?

Mincho Georgiev wrote:
My idea is probably a funny one, but it's what would bring me some comfort with any rating list. There are enough strong FIDE players here who can estimate one single program. If anyone agrees to do that, use that program as a base.

Adam Hair wrote:
Your idea would at least include input from the top chess practitioners.

Why is that? Nobody said that the evaluated engine should be a 3000+ one; rather, something closer to the category of the player. Miguel, for example, is an FM /correct me if I'm wrong/, so it would be irrelevant for him to play against a 3100 Elo engine. Of course I'm far from the thought that this could end up with an exact estimate, that's just not possible, but it would give a number /+-100 Elo/ which could help people get a better idea of the strength of the engines they are interested in. Plus, it's one more thing that would make the rating list unique, not that CCRL necessarily needs it, but ...
-
- Posts: 3226
- Joined: Wed May 06, 2009 10:31 pm
- Location: Fuquay-Varina, North Carolina
Re: CCRL live lists with 100 Elo reduction
Adam Hair wrote:
To anyone who reads this:
What would be your reaction if we purposely disconnected the CCRL from any comparison to human ratings?
What if we make the rating for the top engine equal 0 Elo, so that the rating of each engine directly indicates how many Elo it is behind the leading program?

Mincho Georgiev wrote:
My idea is probably a funny one, but it's what would bring me some comfort with any rating list. There are enough strong FIDE players here who can estimate one single program. If anyone agrees to do that, use that program as a base.

Adam Hair wrote:
Your idea would at least include input from the top chess practitioners.

Mincho Georgiev wrote:
Why is that? Nobody said that the evaluated engine should be a 3000+ one; rather, something closer to the category of the player. Miguel, for example, is an FM /correct me if I'm wrong/, so it would be irrelevant for him to play against a 3100 Elo engine. Of course I'm far from the thought that this could end up with an exact estimate, that's just not possible, but it would give a number /+-100 Elo/ which could help people get a better idea of the strength of the engines they are interested in. Plus, it's one more thing that would make the rating list unique, not that CCRL necessarily needs it, but ...

I did not have any engines in mind when I responded. When you wrote "strong FIDE players", I assumed you meant GMs.
-
- Posts: 92
- Joined: Tue Jun 08, 2010 5:36 pm
- Location: Westfield, IN
- Full name: mark loftus
Re: CCRL live lists with 100 Elo reduction
CRoberson wrote:
That is the fundamental problem with conformity in these situations. Do you do what is correct or what the majority can deal with?

IWB wrote:
That depends on the importance of the issue. In this case it is neither of what you implied, as it makes no difference whether we have one solution or the other, except for the amount of work one has with it ... !
Bye
Ingo

To me, as someone who has played the game, I want to see the ratings as in tune as possible with the rating lists for tournament play, like FIDE or USCF. This is what the computer rating lists were attempting to do from the beginning by including masters, etc.
I think it's a good idea, not mere conforming...
Mark
-
- Posts: 1539
- Joined: Thu Mar 09, 2006 2:02 pm
Re: CCRL live lists with 100 Elo reduction
mloftus955 wrote:
To me, as someone who has played the game, I want to see the ratings as in tune as possible with the rating lists for tournament play, like FIDE or USCF. This is what the computer rating lists were attempting to do from the beginning by including masters, etc.
I think it's a good idea, not mere conforming...
Mark

I fully agree, and I would love to add some masters, grandmasters or world champions. But they don't want to play. So all, really ALL, lists nowadays are guessing. There is no valid data about how an engine would perform vs. humans (and especially not in a blind test - THAT would be nice. Me with a little earplug getting the answers from a comp - where will I be within 1, 2, ... 5 years?)
So whether the CCRL, the CEGT, IPON or someone else is right is a matter of personal taste. All I am trying to say, and you basically confirmed it, is that people are more confused by a 0 and negative values than if the list makers make their best guess ...
Bye
Ingo
-
- Posts: 2564
- Joined: Thu Mar 09, 2006 3:04 am
Re: CCRL live lists with 100 Elo reduction
Graham Banks wrote:
The latest CCRL Rating Lists and Statistics are available for viewing from the following links:
http://computerchess.org.uk/ccrl/4040.live/ (40/40)
http://www.computerchess.org.uk/ccrl/404.live/ (40/4)
http://www.computerchess.org.uk/ccrl/404FRC/ (FRC 40/4)
Please note that the three lists are often updated separately from each other.
The links given in each update report will give you the currently up-to-date lists.
The 100 Elo reduction in the main website lists doesn't show up yet, but will eventually.

I believe that a 150 Elo reduction is more reasonable than 100. It is only my feeling, to bring the list closer to the human FIDE rating system, since certain programs rated between 2325 and 2350 are not even close to a human FIDE Master rated 2200.
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: CCRL live lists with 100 Elo reduction
Graham Banks wrote:
The latest CCRL Rating Lists and Statistics are available for viewing from the following links:
http://computerchess.org.uk/ccrl/4040.live/ (40/40)
http://www.computerchess.org.uk/ccrl/404.live/ (40/4)
http://www.computerchess.org.uk/ccrl/404FRC/ (FRC 40/4)
Please note that the three lists are often updated separately from each other.
The links given in each update report will give you the currently up-to-date lists.
The 100 Elo reduction in the main website lists doesn't show up yet, but will eventually.

pichy wrote:
I believe that a 150 Elo reduction is more reasonable than 100. It is only my feeling, to bring the list closer to the human FIDE rating system, since certain programs rated between 2325 and 2350 are not even close to a human FIDE Master rated 2200.

I see no point at all in even attempting to compare engine ratings to human player ratings, so I propose to refrain from any rescaling and keep the original scaling: it is as arbitrary as any other scaling. Trying to achieve some common rating level serves no useful purpose IMO. Engine and human ratings are two different animals for various reasons. Only a huge pool of games including a sufficient number of engine-human, engine-engine as well as human-human games could yield even remotely comparable ratings, but even then some hardly solvable problems would remain:
1) One of those problems is the usually small number of games of human players compared to that of engines, leading to less significant ratings of humans within such a combined rating pool.
2) Another one, perhaps the most important one, is the fact that the strength of human players evolves over time, in contrast to engines, which have to be assumed to have "constant" strength given one specific program version and one specific environment of computer system + settings. Two completely different rating approaches are definitely needed for that (and are applied in reality): one for humans, where the most recent games get higher weights than older games, and one for engines, where all games are put together and rated at once. Now which of these two incompatible rating systems should be applied to "engine vs. human" games? Applying the "human" rating system to engines as well would require a huge number of "engine vs. human" games to minimize the possible bias introduced by overweighting the most recent (engine) game results. Applying the "engine" rating system to humans would require either accepting some permanent inaccuracies for players with unstable playing strength, or creating some artificial "human player version" concept, like "Anand (2012)", which I doubt anyone would accept as useful, and which again would suffer heavily from an insufficient number of games to obtain statistical significance.
Chess ratings, whether for humans or for engines, are never "absolute". Their meaning is always relative to the set of players included in the overall rating pool, the set of games between those players, and of course the whole set of playing conditions. Two separate rating pools can never be compared. We should always refrain from trying to give more meaning to a FIDE Elo rating of, say, 2750, than it actually has, and will ever have. An Elo rating is not a "physical property" of a chess player; it is a relative statement within a specified and limited context.
The only statements we can, and should, derive from a given rating list are statements like "A is best in rating pool X" or "B is better than C in rating pool Y", or maybe also "D has a winning probability of P % [+/- error margin M] against E in rating pool Z".
Sven
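The "winning probability of P %" statement above, and the point that a rating only has meaning relative to its pool, can be illustrated with the standard logistic Elo expectation formula. This is only a minimal sketch: the function name and the ratings 3100 and 3000 are made up for illustration, and the rating lists discussed in this thread may use a different model or tool to compute their numbers.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the logistic Elo model.

    Only the difference between the two ratings enters the formula.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings, chosen only for illustration.
before = expected_score(3100, 3000)

# Subtract 100 Elo from every rating in the pool (a global "reduction").
after = expected_score(3100 - 100, 3000 - 100)

print(round(before, 3), round(after, 3))  # prints "0.64 0.64"
```

Because only the difference enters the formula, a uniform 100 or 150 Elo reduction leaves every predicted result inside the pool unchanged; it only moves the labels.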
-
- Posts: 3018
- Joined: Thu Mar 09, 2006 11:58 am
- Location: Antalya/Turkey
Re: CCRL live lists with 100 Elo reduction
I see the Elo Engine vs Human discussions are still active.
Actually there is quite enough chess data for properly measuring the Elo strength of chess engines vs humans.
So once more I will try to mention the game results that have already been played:
1) IV Magistral República Argentina 2001
http://www.rebel.nl/resu.htm
Code:
                                      1 2 3 4 5 6 7 8 9 0 1 2
 1  COMP Chess Tiger      2632  +156  * ½ ½ 1 1 ½ 1 1 1 1 1 1  9.5/11
 2  Slipak,Sergio     IM  2448  +159  ½ * ½ ½ 1 ½ ½ 1 ½ 1 1 ½  7.5/11
 3  Valerga,Diego     IM  2468  +138  ½ ½ * ½ ½ 1 1 ½ ½ 1 ½ 1  7.5/11
 4  Ricardi,Pablo     GM  2554   -27  0 ½ ½ * ½ ½ 1 1 ½ ½ ½ 1  6.5/11
 5  Hoffman,Alejandro GM  2453   +82  0 0 ½ ½ * 1 ½ ½ 1 1 ½ 1  6.5/11
 6  Limp,Eduardo      IM  2465   -30  ½ ½ 0 ½ 0 * 0 ½ 1 0 1 1  5.0/11
 7  Scarella,Enrique  FM  2361   +82  0 ½ 0 0 ½ 1 * ½ ½ 0 1 1  5.0/11
 8  Panno,Oscar       GM  2471   -37  0 0 ½ 0 ½ ½ ½ * ½ 1 1 ½  5.0/11
 9  Andres,Miguel     IM  2382    -8  0 ½ ½ ½ 0 0 ½ ½ * ½ ½ ½  4.0/11
10  Rodriguez,Andrés  GM  2500  -173  0 0 0 ½ 0 1 1 0 ½ * 0 ½  3.5/11
11  Dorin,Mauricio    IM  2410  -115  0 0 ½ ½ ½ 0 0 0 ½ 1 * 0  3.0/11
12  Matsuura,Everaldo IM  2467  -177  0 ½ 0 0 0 0 0 ½ ½ ½ 1 *  3.0/11
2) Mercosur Cup 2009 - even Pocket Hiarcs performed at 2938 Elo:
http://www.hiarcs.com/Games/Mercosur2009/mercosur09.htm
3) Another example: even 10-13 years ago, the engines had started to play on the same level as top Grandmasters
Note: in those years, most of the games were played at slow time controls; in blitz the engines would have performed much stronger than Grandmasters
4) Rybka, even without a full pawn (on slow hardware, without a strong book), performed stronger than Grandmasters
In other words:
- In the 1990s = GMs were stronger than chess engines (because in those years the hardware was very slow and the engines were much weaker)
- In the early 2000s = GMs were approx. equal to chess engines
- In 2003/2004/2005/2006 = chess engines performed approx. 190 Elo better than GMs
- Now we are in 2012 = we have much stronger MP engines (Houdini, Rybka, Critter, Ivanhoe, Fire, Stockfish...) and much faster hardware than in the past
And last: I strongly believe the top MP engines (using a strong book + the latest fast i7 6-core machine) should be rated around 3300-3400 Elo
Hope this helps,
Best,
Sedat
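As a rough check on what the Argentina 2001 crosstable implies, one can compute a tournament performance rating for Chess Tiger from the eleven human ratings and its 9.5/11 score. The sketch below assumes the common "average opponent rating plus logistic Elo difference" approximation; the function name is hypothetical, and FIDE's official table-based method or the tournament's own calculation would give somewhat different numbers.

```python
import math

# Ratings of the eleven human opponents and Chess Tiger's score,
# copied from the Argentina 2001 crosstable in this post.
opponents = [2448, 2468, 2554, 2453, 2465, 2361, 2471, 2382, 2500, 2410, 2467]
score = 9.5


def performance_rating(opponent_ratings, points):
    """Approximate performance: average opponent rating plus the Elo
    difference that the logistic model associates with the score fraction."""
    average = sum(opponent_ratings) / len(opponent_ratings)
    fraction = points / len(opponent_ratings)
    if not 0.0 < fraction < 1.0:
        raise ValueError("performance is undefined for 0% or 100% scores")
    return average + 400.0 * math.log10(fraction / (1.0 - fraction))


tiger_performance = performance_rating(opponents, score)
print(round(tiger_performance))  # prints 2773
```

With only 11 games the statistical uncertainty of such an estimate is large, so it should be read as an order of magnitude rather than a measurement.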
-
- Posts: 3018
- Joined: Thu Mar 09, 2006 11:58 am
- Location: Antalya/Turkey
Re: CCRL live lists with 100 Elo reduction
Btw, my 2 cents more on this issue.
There was also another duel in 2003: Garry Kasparov vs X3D Fritz (the match ended 2-2)
http://en.wikipedia.org/wiki/X3D_Fritz
Note that X3D Fritz (Deep Fritz 8)'s hardware was four Intel Pentium 4 Xeon CPUs at 2.8 GHz
"Deep Fritz8 is the new multi-processor version of Fritz, the same chess engine which drew a match with Garry Kasparov last month ...."
http://www.chessbase.com/newsdetail.asp?newsid=1375
According to CEGT, Deep Fritz 12 is 240 Elo stronger than Deep Fritz 8
http://www.husvankempen.de/nunn/40_40%2 ... liste.html
Code:
 no  Program              Elo   +   -  Games  Score  Av.Op.  Draws
117  Deep Fritz 12 2CPU  2823  14  14   1199  45.3%    2856  48.6%
489  Deep Fritz 8 2CPU   2583  11  11   2804  53.7%    2557  31.2%
------------------------------------------------------------------------------------------------------------
Garry Kasparov vs Deep Junior - 2003 (ended in a 3-3 draw)
Note that Deep Junior 2003 (the version that played against Kasparov) is expected to have the same strength as Deep Junior 8
According to SSDF, Deep Junior 12 is 214 Elo stronger than Deep Junior 8
http://ssdf.bosjo.net/list.htm
Code:
                                          Rating   +    -  Games  Won  Av.opp
10  Deep Junior 12 x64 2GB Q6600 2,4 GHz    3078  27  -25    778  68%    2948
 7  Deep Junior 8 2GB Q6600 2,4 GHz         2864  26  -27    745  33%    2984
-------------------------------------------------------------------------------------------------------------
And now let's concentrate on the SCCT Rating List:
http://www.sedatcanbaz.com/chess/scct-rating/
Code:
Rank  Name                    Elo   +   -  games  score  oppo.  draws
  14  Deep Fritz 12 w32 6c   3138  23  23    554    41%   3194    41%
  17  Deep Junior 13 x64 6c  3104  24  24    543    36%   3200    34%
And I have a question for everyone who doesn't agree with the current SCCT Elo calculations:
- Would it not be a big mistake to rate the above engines (Deep Junior 13 + Deep Fritz 12 on an i7 980X @4.0GHz 6 core) below 3000 Elo??
One more little note: an i7 980X @4.0GHz 6 core is at least 150 Elo stronger than Intel Pentium 4 Xeon CPUs at 2.8 GHz
Greetings,
Sedat
-
- Posts: 75
- Joined: Sun Jul 30, 2006 11:13 pm
- Location: Kalisz, Poland
Re: CCRL live lists with 100 Elo reduction
I don't understand those ELO reductions.
CEGT ratings used to be about 50 Elo lower than CCRL ratings. That difference was not too big.
Now CCRL ratings are 100 Elo lower and CEGT ratings 200 Elo lower, so the difference between CCRL and CEGT has increased to 150 Elo.
IMO it would be better to select ONE well-known engine with a large number of games, give it a constant Elo and keep it the same on all rating lists.
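The idea of pinning one common engine to a constant Elo on every list can be sketched as a simple shift of each list. Everything below is hypothetical: the function name, the engine names and all the numbers are made up for illustration and are not taken from CCRL, CEGT or any other real list.

```python
def anchor_list(ratings: dict[str, float], engine: str, value: float) -> dict[str, float]:
    """Shift every rating by the same offset so that `engine` gets `value`.

    Rating differences inside the list are left untouched.
    """
    offset = value - ratings[engine]
    return {name: elo + offset for name, elo in ratings.items()}


# Made-up numbers, purely for illustration.
list_a = {"Engine X": 3100.0, "Engine Y": 3000.0, "Engine Z": 2850.0}
list_b = {"Engine X": 2950.0, "Engine Y": 2860.0, "Engine Z": 2700.0}

# Pin the same reference engine to the same value on both lists.
anchored_a = anchor_list(list_a, "Engine Y", 2800.0)
anchored_b = anchor_list(list_b, "Engine Y", 2800.0)

# The "top engine = 0 Elo" proposal from earlier in the thread is the same operation.
zero_based_a = anchor_list(list_a, "Engine X", 0.0)

print(anchored_a)
print(anchored_b)
print(zero_based_a)
```

Anchoring only lines up the reference engine. The remaining engines can still differ between lists, because each list has its own games, hardware and time controls.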
-
- Posts: 2564
- Joined: Thu Mar 09, 2006 3:04 am
Re: CCRL live lists with 100 Elo reduction
Piotr Cichy wrote:
I don't understand those ELO reductions.
CEGT ratings were about -50 ELO comparing to CCRL. This difference was not too big.
Now CCRL ratings are 100 ELO lower and CEGT ratings 200 ELO lower, so the difference between CCRL and CEGT increased to 150 ELO.
IMO it would be better to select ONE very known engine with large amount of games, give it constant ELO and keep it the same on all ranking lists.

My suggestion would be for all the different rating agencies to get together and agree on constant ratings with one common starting point. Otherwise, how do we know which agency's rating system reflects the most accurate ratings?