nimh wrote:
The hardware that is used in creating CCRL lists is outdated by today's standards, and time controls are ca. 3x shorter.
Graham Banks wrote:
That is not the actual hardware that we use.
We just used that machine as our original benchmark to find the adapted time controls to use on our computers.
For example, Nathanael's overclocked i7 Haswell quad uses 40/15, whereas an older Q6600 uses 40/32.
CCRL 40/40 equates to around 40/18 on modern hardware.
Hope that explains things.

Thanks, I see now that I misunderstood. So the speed difference is not 16x but rather 2.2x?
a direct comparison of FIDE and CCRL rating systems
-
- Posts: 46
- Joined: Sun Nov 30, 2014 12:06 am
Re: a direct comparison of FIDE and CCRL rating systems
-
- Posts: 1346
- Joined: Sat Apr 19, 2014 1:47 pm
Re: a direct comparison of FIDE and CCRL rating systems
Kai,
Let's assume the true strength of Stockfish 7 is 3150.
What is the probability that Magnus Carlsen (2850) wins a game against it?
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: a direct comparison of FIDE and CCRL rating systems
JJJ wrote:
Kai,
Let's assume the true strength of Stockfish 7 is 3150.
What is the probability that Magnus Carlsen (2850) wins a game against it?

Well, if the FIDE rating in this case denotes some strength relationship between a human and an engine, it is predicted that in 10 games Carlsen will get 1.5 points (say one win and one draw, or more probably 3 draws). That's a bit more than I would think. There are also some issues: for example, by this curve, it is predicted that a 2800 GM stands a better chance against Stockfish 7 than against Rybka 2.1 on identical hardware. Somehow I doubt this. There are other issues too, at the low ELO end for example, but I will post later.
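The 1.5 points in 10 games quoted here appears consistent with the standard Elo expected-score formula applied to a 300-point gap. A minimal sketch, assuming that formula is what is being used (the function name `expected_score` is only illustrative):

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo logistic expectation (win = 1, draw = 0.5) for `rating` vs `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))

# Ratings taken from the question above: Carlsen 2850, Stockfish 7 assumed at 3150.
per_game = expected_score(2850, 3150)
print(round(per_game, 3))       # 0.151
print(round(10 * per_game, 2))  # 1.51 expected points over 10 games
```

Note that this is an expected score, not a win probability: it says nothing by itself about how the 1.5 points split between wins and draws.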
-
- Posts: 1346
- Joined: Sat Apr 19, 2014 1:47 pm
Re: a direct comparison of FIDE and CCRL rating systems
So, the funny thing to do could be to ask a ~2800 Elo player to play normal chess with draw odds against Stockfish or Komodo. Draw = win for human.
The problem could be getting 10 games as a minimum.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: a direct comparison of FIDE and CCRL rating systems
I find your result interesting and intriguing. Up to now I have used the rule of thumb "CCRL computer ELO point is about 0.70 FIDE human ELO point", with CCRL and FIDE intersecting at the rating of 2800 for both. I plot here your result with my "rule of thumb" linear model:
We see the larger differences appear at CCRL ELO above 3500 and below 1800, and the models are pretty close otherwise. In the PDF you fit the "errors" with some power functions of ELO. Maybe if you do the fit with a linear a*x+b, our results will look more similar? It is not clear from your plots that a linear fit is so badly rejected, at least for FIDE ELO, although you might have observed a curvature there. I am a bit worried about your tails: for example, a 2-ply engine is most likely below CCRL 1500, but most likely above FIDE 1000. Your plot seems to contradict this. Also, all the way from current top engines to a perfect engine is worth less than 150 points in FIDE ELO. Isn't it a bit weird?
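The rule of thumb described in this post (0.70 FIDE points per CCRL point, with both scales meeting at 2800) can be written as a one-line conversion. This is only a sketch of that linear model; the function and constant names are made up for illustration:

```python
ANCHOR = 2800.0  # rating at which CCRL and FIDE are assumed to coincide
SLOPE = 0.70     # FIDE points per CCRL point (the rule of thumb from the post)

def ccrl_to_fide_linear(ccrl: float, slope: float = SLOPE, anchor: float = ANCHOR) -> float:
    """Map a CCRL rating to a FIDE-scale estimate with the linear rule of thumb."""
    return anchor + slope * (ccrl - anchor)

for ccrl in (1500, 1800, 2800, 3500):
    print(ccrl, round(ccrl_to_fide_linear(ccrl)))
```

Under this model CCRL 1500 maps to about FIDE 1890 and CCRL 3500 to about FIDE 3290. Passing a smaller slope such as 0.40 or 0.50 shows how much the top end compresses.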
-
- Posts: 46
- Joined: Sun Nov 30, 2014 12:06 am
Re: a direct comparison of FIDE and CCRL rating systems
Why do you assume that the relationship must be linear? I chose power functions because they had the best fit. It's not like I picked them arbitrarily.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: a direct comparison of FIDE and CCRL rating systems
nimh wrote:
Why do you assume that the relationship must be linear? I chose power functions because they had the best fit. It's not like I picked them arbitrarily.

Generally, a linear model has one parameter fewer. Maybe you need several more data points for the errors-ELO fits? I agree that my rule-of-thumb linear model is probably an over-simplification. And I guess nowadays top engines have a gain factor of less than 0.70, maybe 0.40-0.50, and therefore the linear model breaks down. That's why I find your result interesting.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: a direct comparison of FIDE and CCRL rating systems
JJJ wrote:
So, the funny thing to do could be to ask a ~2800 Elo player to play normal chess with draw odds against Stockfish or Komodo. Draw = win for human.
The problem could be getting 10 games as a minimum.

Maybe 5 games, draw odds, top GM always White, engine with no book, tournament time control. But I doubt a GM would succeed even in this case.
-
- Posts: 803
- Joined: Sat Jan 31, 2015 11:50 pm
- Location: Philadelphia, USA
Re: a direct comparison of FIDE and CCRL rating systems
Frank Quisinsky wrote:
Hi Graham,
I think on an i7 Haswell, 40 in 13/14 is more correct if I compare my 40/10 Haswell results with the CCRL ratings at 40 in 40.
Best
Frank
But one minute more or less ...
Not important.

If I use the following formula for CCRL combined with the benched elapsed time of 17 seconds from Crafty v19.17:
T minutes / 40 moves repeated, where T = 40 * <elapsed seconds> / 48 = <elapsed seconds> / 1.2
CCRL 40/40: 40 * 17 = 680; 680 / 48 = 14.16666666666667; 14.16666666666667 / 1.2 = 11.80555555555556
I would equate that to 40 moves in 11 or 12 minutes.
I use 15 minutes because under load, the clock frequency dips by 100 MHz when all 4 cores are utilized.
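The arithmetic above can be reproduced in a few lines. This is a sketch of the formula as written in the post. Reading 40 as the base minutes of the 40/40 control and 48 as the reference machine's benchmark time in seconds is an assumption, and the names are illustrative only:

```python
def adapted_minutes(bench_seconds: float,
                    base_minutes: float = 40.0,   # assumed: minutes of the CCRL 40/40 control
                    base_seconds: float = 48.0    # assumed: benchmark time on the reference machine
                    ) -> float:
    """T = 40 * <elapsed seconds> / 48, i.e. <elapsed seconds> / 1.2."""
    return base_minutes * bench_seconds / base_seconds

t = adapted_minutes(17)   # benched elapsed time of 17 seconds from the post
print(round(t, 2))        # 14.17
print(round(t / 1.2, 2))  # 11.81
```

The formula on its own gives about 14.2 minutes for a 17-second bench. The 11.8 figure in the post comes from dividing by 1.2 a second time.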
Core_Engine_Tester_CCRL
-
- Posts: 46
- Joined: Sun Nov 30, 2014 12:06 am
Re: a direct comparison of FIDE and CCRL rating systems
nimh wrote:
Why do you assume that the relationship must be linear? I chose power functions because they had the best fit. It's not like I picked them arbitrarily.
Laskos wrote:
Generally, a linear model has one parameter fewer. Maybe you need several more data points for the errors-ELO fits? I agree that my rule-of-thumb linear model is probably an over-simplification. And I guess nowadays top engines have a gain factor of less than 0.70, maybe 0.40-0.50, and therefore the linear model breaks down. That's why I find your result interesting.

I have compared FIDE ratings and the accuracy of play on two more occasions.
http://www.chessanalysis.ee/summary450.pdf
http://www.chessanalysis.ee/Quality%20o ... suring.pdf
Both of them exhibit a kind of logarithmic relationship. Perhaps there are some games that exhibit a linear relationship between accuracy and either rating system, but chess certainly isn't one of those.
Laskos wrote:
Also, all the way from current top engines to a perfect engine is worth less than 150 points in FIDE ELO. Isn't it a bit weird?

It indeed seems weird, but I think it can be explained by the fact that my CPU is not strong enough to provide a reliable analysis of play by entities stronger than it. Unfortunately, technology is not yet developed enough to tell what accuracy is needed to play at the level of 3500 FIDE and stronger.
The rule is that the whole package of CPU, engine and time per move used in the analysis must absolutely surpass those of the entities analyzed. That's why one cannot analyze contemporary correspondence games yet.