a direct comparison of FIDE and CCRL rating systems

nimh · Post by **nimh** » Fri Feb 26, 2016 1:05 am

perhaps as he says due to overfitting or to the insufficient strength of the "judge" as you point out.

Why is using a logarithmic curve a case of overfitting, but a linear curve is not? To me it seems too arbitrary. I did not want to imply that the combination of the engine and hardware is too weak to be used in analysis of top GM games. What I said is that I cannot make trustworthy extrapolations past the 3100 Elo mark.

You misread my quote above. I said blunder 'wildly', not blunder in wild positions.
It also seems we have different definitions of 'blunder'. For me it need not even be a full pawn equivalent. It can also be a permanent weakness,
like spoiling the pawn shield on the kingside or allowing an outpost without need or whatever...

Ok, my bad

But the analogy is still valid; just as there are a lot of low-elevation areas besides mountains, so are blunders accompanied by fairly accurate moves. It is important that all data be taken into account when analyzing something.

The frequency is horrible. I have no idea why you don't believe me.
May I ask if you are a chess player too?

I think I have enough experience to know a thing or two about it. It's not about beliefs, I've seen the data with my own eyes. I am just a hobby player with no official rating. But conducting analysis projects like this is an even bigger hobby, and I take it more seriously.

There are far too few data points in CCRL, at least in 40/40, for programs under 2100 to make any comparisons to human ratings.

I did consider using the 40/4 rating list as a basis, but ended up rejecting the idea, as I feared that it might have too little relevance to modern computer chess.

BTW in that link you practically had it all the other way round for lower
CCRL ratings?

The methods used in that are by now outdated and not relevant any more.

I must say both tables are way off in most parts of it.
There is simply not enough data there for doing such a comparison.

How do you know that both tables are way off, if I may ask?
Theoretically, expanding the range is not impossible. My 2014 database has only 3 games with both players within the 1575-1625 range. Perhaps I could add games from 2013 and 2015 to widen the range a little so as to get enough games, and also include games from 2013-2015 by players who got a TPR of 2900-3000 in tournaments and matches. But that data would be incompatible with the rest of my data. I'm not sure if it would be a good idea.

thekingman · Post by **thekingman** » Fri Feb 26, 2016 9:58 am

Laskos wrote:
JJJ wrote:So, the funny thing to do could be to ask a ~2800 elo to play normal chess with draw odds against Stockfish or Komodo. Draw = win for human.

Problem could be to get 10 games as minimum.
Maybe 5 games, draw odds, top GM always white, engine no book, tournament time control. But I doubt a GM success even in this case.

Throw in during-game access to databases and tablebases, and instantly crown the GM the match winner if they, by some miracle, manage to outright win a single game, and you might be starting to get within sight of a competitive match.

Laskos · Post by **Laskos** » Sat Feb 27, 2016 5:06 am

thekingman wrote:
Laskos wrote:
JJJ wrote:So, the funny thing to do could be to ask a ~2800 elo to play normal chess with draw odds against Stockfish or Komodo. Draw = win for human.

Problem could be to get 10 games as minimum.
Maybe 5 games, draw odds, top GM always white, engine no book, tournament time control. But I doubt a GM success even in this case.
Throw in during-game access to databases and tablebases, and instantly crown the GM the match winner if they, by some miracle, manage to outright win a single game, and you might be starting to get within sight of a competitive match.

Right, but it would seem a very "prepared" match. One of my preferred kinds of odds is takeback odds. I remember Larry writing here that a GM stands a chance only if every move can be taken back once before the next move is made. I don't know if a top GM would agree to such odds; on the other hand, it seems to me to be too large an odds to give (but I might be wrong, Larry probably knows better). I tested a single takeback: only 1 move may be taken back, once, before the next move is made. When the move taken back is a depth=1 move (very probably a smaller or larger blunder), the single takeback is worth about 60 Elo points. So it would seem that fair odds against Carlsen would be 6-7 takebacks, and against a top GM 7-9 takebacks, if the odds are additive. But I don't know whether a blunder that a human feels the need to take back is similar to a depth=1 computer blunder; my test was between Stockfish and Texel. Maybe Larry could hint as to why he considers "make as many takebacks as you wish" a fair game for a top GM against a top engine.
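
For anyone who wants to play with the numbers, here is a rough Python sketch of the arithmetic above. It assumes the takeback odds really are additive at about 60 Elo each (the figure from the Stockfish vs Texel test, which may not carry over to humans) and uses the standard logistic Elo curve. The function names `gap_covered` and `expected_score` are made up for this illustration.

```python
def expected_score(rating_gap: float) -> float:
    """Expected score of the stronger side under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


def gap_covered(takebacks: int, elo_per_takeback: float = 60.0) -> float:
    """Rating gap compensated by a number of takebacks, ASSUMING the odds are additive."""
    return takebacks * elo_per_takeback


if __name__ == "__main__":
    # 6-7 takebacks (Carlsen) and 7-9 takebacks (top GM) are the ranges quoted in the post.
    for n in (6, 7, 9):
        gap = gap_covered(n)
        print(f"{n} takebacks ~ {gap:.0f} Elo, "
              f"engine expected score at that gap without odds: {expected_score(gap):.3f}")
```

Read the other way round, 6-9 takebacks at 60 Elo each corresponds to an assumed engine advantage of roughly 360-540 Elo, which only holds if the additivity assumption does.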

CRoberson · Post by **CRoberson** » Sat Feb 27, 2016 9:53 pm

Recent real data:

Ares (under my ICC Telepath account) played 14 games at 15m+5 against IM Kanan Heydarli. His current FIDE rapid rating is 2256.

Ares won 12, drew 2 and lost none. Ares used 1 CPU of an i7 5820K OC'd to 4.5 GHz and 3 GB for transposition tables.

All this happened this week.

Ares also has 1 game (a win) with GM Smbat Lputian at 5m+1.
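
As a rough illustration of what the 14-game result means in rating terms, the sketch below converts the score into a performance rating with the logistic Elo formula. This is an assumption for illustration: FIDE's own performance tables are built differently, and 14 games is a small sample, so the number carries a wide error margin. The function name `performance_gap` is hypothetical.

```python
import math


def performance_gap(score: float, games: int) -> float:
    """Rating difference implied by a match score under the logistic Elo model."""
    p = score / games
    if not 0.0 < p < 1.0:
        raise ValueError("a 0% or 100% score has no finite performance rating")
    return 400.0 * math.log10(p / (1.0 - p))


if __name__ == "__main__":
    wins, draws, losses = 12, 2, 0   # result reported in the post
    games = wins + draws + losses
    score = wins + 0.5 * draws       # 13 points out of 14
    opponent_rating = 2256           # FIDE rapid rating quoted in the post
    gap = performance_gap(score, games)
    print(f"score {score}/{games}, gap {gap:+.0f} Elo, "
          f"performance about {opponent_rating + gap:.0f}")
```

The single 5m+1 win against a GM is left out, since a 100% score has no finite performance rating under this formula.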
