CCRL, FIDE and Ratings

Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: CCRL, FIDE and Ratings

Post by Ovyron »

carldaman wrote: Sun Oct 06, 2019 10:45 am
Dietrich Kappe may have attempted something like that with his BadGyal and EvilGyal nets, but they eventually became too strong for most people.

https://github.com/dkappe/leela-chess-w ... d-Networks
Something must have gone wrong then. The AI is supposed to hang a queen if it expects the human to hang a queen, so it wouldn't even try to win games.
Laskos wrote: Sun Oct 06, 2019 10:50 am
Ovyron wrote: Sun Oct 06, 2019 9:17 am
At least, this worked for Super Mario Kart, where the AI trying to imitate the human outperformed the NN trying to win races with a fraction of the effort, even when the human wasn't an expert in the game.
The goals are different, but is the first one SL and the second one purely zero-approach RL, or is the latter SL + RL?
The unsuccessful one was like this: zero-approach machine learning with a genetic algorithm that would try random inputs and reward them by how far they got in the level, so new generations generally improve, but it's too slow.

The successful approach was this one: a recurrent neural network that doesn't even play the game at first. It is shown footage of the human's play and tries to guess what the next input will be (even if that input is suboptimal). The supervised learning had the problem that if the NN saw a situation it hadn't seen before, it would break, so the human had to record more footage where he intentionally got into those situations, so the network could learn how to get out of them. Once it was done, just by emulating the human it could drive and win the gold cup by itself (passing the Turing test, as I couldn't really tell the difference between the human driving and the AI driving, which would fulfill BrendanJNorman's wish if it worked for chess).
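To make the imitation idea concrete, here is a minimal sketch in Python. It is not the recurrent network described above: it swaps in a plain lookup table that copies the human's most frequent input per state, and the state names, input names and footage are all hypothetical. It only illustrates why unseen situations break the imitator and why recording extra footage helps.

```python
from collections import Counter, defaultdict


def fit_imitation_policy(demonstrations):
    """Count which input the human chose in each observed state."""
    counts = defaultdict(Counter)
    for state, human_input in demonstrations:
        counts[state][human_input] += 1
    # Copy the human's most frequent input per state, optimal or not.
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


def act(policy, state):
    """Return the imitated input, or None for a state never seen in the footage."""
    return policy.get(state)


# Hypothetical recorded footage as (state, input) pairs.
footage = [
    ("straight", "accelerate"),
    ("straight", "accelerate"),
    ("straight", "brake"),
    ("left_turn", "steer_left"),
]

policy = fit_imitation_policy(footage)
print(act(policy, "straight"))   # the human's usual input in this state
print(act(policy, "off_track"))  # None: no footage covers this state

# Recording footage of the recovery fills the gap.
policy = fit_imitation_policy(footage + [("off_track", "steer_right")])
print(act(policy, "off_track"))
```

A real network generalizes between similar states instead of looking them up exactly, but the dependence on what the footage covers is the same.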
Your beliefs create your reality, so be careful what you wish for.
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Post by Vinvin »

About SSDF and FIDE:
SSDF lowered the rating values for its list several times:
https://www.chessprogramming.org/SSDF#R ... alibration
http://privat.bahnhof.se/wb432434/level.htm

This was done because they saw that the ratings for the strongest computers were too high compared to human FIDE ratings.
But the best way is not to lower all computer ratings; the better way is to compress the computer ratings.

A formula like : FideRating = (ComputerRating - 1600) * 0.85 + 1600

for a 1600 ComputerRating : (1600 - 1600) * 0.85 + 1600 = 1600 FideRating
for a 3000 ComputerRating : (3000 - 1600) * 0.85 + 1600 = 2790 FideRating
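
The formula above can be sketched as a small Python function. The function and parameter names are made up for illustration; the 0.85 factor and the 1600 fixed point are the ones proposed in this post, not an established conversion.

```python
def compress_rating(computer_rating, factor=0.85, anchor=1600):
    """Shrink the distance from the anchor rating by `factor`.

    A rating equal to the anchor is left unchanged; ratings above it
    are pulled down and ratings below it are pulled up.
    """
    return (computer_rating - anchor) * factor + anchor


for rating in (1600, 3000):
    print(rating, "->", round(compress_rating(rating)))
```

Unlike subtracting a constant from every rating, this keeps the 1600 level fixed and shrinks the gaps between engines.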
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Post by Laskos »

Vinvin wrote: Mon Oct 07, 2019 12:13 pm
About SSDF and FIDE:
SSDF lowered the rating values for its list several times:
https://www.chessprogramming.org/SSDF#R ... alibration
http://privat.bahnhof.se/wb432434/level.htm

This was done because they saw that the ratings for the strongest computers were too high compared to human FIDE ratings.
But the best way is not to lower all computer ratings; the better way is to compress the computer ratings.

A formula like : FideRating = (ComputerRating - 1600) * 0.85 + 1600

for a 1600 ComputerRating : (1600 - 1600) * 0.85 + 1600 = 1600 FideRating
for a 3000 ComputerRating : (3000 - 1600) * 0.85 + 1600 = 2790 FideRating
There are differences between the CCRL, CEGT, FGRL and SSDF rating calculations. CCRL uses Bayeselo, which compresses the ratings by some 10% compared to CEGT and FGRL, which use Ordo (the calculator more faithful to the Elo curve). I never quite understood the SSDF global ratings, but one can download the main database and use Ordo on it.

When Ordo is used in computer ratings, the rating should be compressed by a factor of about 0.7 (might even be 0.65), not 0.85.
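
To see how much the choice of factor matters, here is a rough Python sketch comparing 0.85, 0.7 and 0.65 on the same rating. It assumes the 1600 fixed point from the quoted formula, which I did not specify myself, so treat the anchor and the helper name as illustrative assumptions.

```python
ANCHOR = 1600  # assumed fixed point, borrowed from the quoted formula


def compressed(ordo_rating, factor, anchor=ANCHOR):
    """Scale the distance from the anchor by `factor`."""
    return (ordo_rating - anchor) * factor + anchor


def compare_factors(ordo_rating, factors=(0.85, 0.7, 0.65)):
    """Map each candidate factor to the resulting compressed rating."""
    return {f: round(compressed(ordo_rating, f)) for f in factors}


for factor, value in compare_factors(3000).items():
    print(f"factor {factor}: 3000 -> {value}")
```

With a different anchor the absolute numbers shift, but the spread between the factors at the top of the list stays large.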