Dragon 2.6.1 Elo levels

lkaufman
Posts: 6223
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Dragon 2.6.1 Elo levels

Post by lkaufman »

Cornfed wrote: Sun Jan 30, 2022 6:09 pm
lkaufman wrote: Sun Jan 30, 2022 6:06 am
Cornfed wrote: Sun Jan 30, 2022 3:54 am
Chessqueen wrote: Sat Jan 29, 2022 3:54 pm
Odd Gunnar Malin wrote: Sat Jan 29, 2022 2:21 pm Hi.
I see this discussion drifted away already on the first reply, but that's OK with me for now. I have started some training sessions to prepare for this summer's national championship; I play in the 50+ group and have little free time to do much more.
Anyhow, my current rating with Hiarcs as GUI and Dragon 2.6.1 in 15+10 games is 1634 (21 games). I let it select the rating automatically, i.e. it set it to my current rating. I play those games as I would against a real opponent, without cheating with a paper book in hand or any other information. The book I use is based on games by players I will meet in the next tournament, and I don't study the book before the game. Of course I study the line played after each game; that's why I create those books, to learn the openings I will meet.
I will keep playing these games against Dragon. If you want, I can put them up on Lichess (study) or chess.com (library) and share them.

Edit: Forgot to mention that I have created a little utility to slow down Dragon in a somewhat intelligent way. I spoke about this in an earlier thread.
Good luck with your training, and once you have figured out how to beat Komodo Dragon 2.6.1 at 1650 UCI_Elo, raise it to 1700.

I am no programmer, but as ratings on different playing sites vary so much, and so many play online these days...and OTB is more of a 'gold standard' because we try harder...the cat or wife doesn't interrupt us, etc...would it ever be possible to let Dragon analyze, say, 50 games of an individual and determine for itself a suitable level of play that would theoretically get me close to a 50/50 result? It would seem to work either for OTB at long time controls or online at 15min/10sec...or whatever.

Let's say Dragon - based on a good set of my games (I'm just saying 50 as an example) - deemed me a 2050 player. I could use that and be more assured of a 50/50 result, with no 'adjusting on the fly', game to game, as I think Fritz tries to do...if I wanted to 'play up', I could set it to 2200 and still have my chances on a given day.
It is certainly possible to develop a version of Dragon that would review a file of games and (for the time limit at which the games were played) give an estimated rating. We don't currently have good data on the average error rate at multiple time limits for players of various ratings (which of course in turn depends on whether we are talking about OTB FIDE ratings or online game scores with online ratings), but there is no problem in principle, it just takes a lot of work. But I think we'll soon be able to say with some precision that if your FIDE rating is X or your chess.com Rapid rating is Y or your lichess rapid rating is Z (given enough games to be valid), then do this simple calculation to determine a fair setting for Dragon Elo. If you only have an online rating and it is unrealistic due to interruptions or playing drunk or whatever, you just need to play 50 or so games under proper conditions to see what your real level is. Most likely this would be more accurate than a rating based on reviewing games, although I would very much like to be able to estimate ratings at various time limits from game scores. Then we could answer questions such as "Does Magnus Carlsen play better Rapid chess than Botvinnik or Tal played Classical chess?", or "What Classical Time limit Elo rating today would be of equal quality to Hikaru playing blitz?" or "Who would win a match between Ben Finegold and Paul Morphy?".
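As a rough illustration of the "play 50 or so games" idea: the standard Elo logistic formula gives the expected score for a rating gap, and inverting it turns a match result against a fixed Dragon setting into a performance rating. This is only a sketch. The function names are made up for this example, and it assumes the Dragon Elo setting sits on the same scale as the player's rating, which is exactly the calibration the post says is still being worked out.

```python
import math


def expected_score(player_elo, opponent_elo):
    """Expected score (0..1) for the player under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((opponent_elo - player_elo) / 400.0))


def performance_rating(opponent_elo, points, games):
    """Rating at which `points` out of `games` against one fixed opponent
    rating would be exactly the expected result (inverse of expected_score)."""
    if games <= 0:
        raise ValueError("games must be positive")
    score = points / games
    if score <= 0.0 or score >= 1.0:
        # The logistic model has no finite answer for a 0% or 100% score.
        raise ValueError("performance rating is undefined for 0% or 100% scores")
    return opponent_elo + 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    # Hypothetical numbers: a 2050 player against a 2200 setting,
    # and 30 points from 50 games against a 1650 setting.
    print(round(expected_score(2050, 2200), 3))
    print(round(performance_rating(1650, 30, 50)))
```

At equal ratings the expected score is 50%; a 2050 player facing a 2200 setting would be expected to score roughly 30%, so 'playing up' by 150 points still leaves real chances.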
Yes, you get the idea! I like it because it is tailor-made to the individual, and you, the individual, do not have to try to arrive at different ratings at different time controls one game at a time.

Off the top of my head I see possible issues (perhaps just phantoms...) with your approach. Perhaps they have occurred to you as well.

Online: almost all my online games (most everyone's, really) are blitz. While I've beaten players rated as high as 2700 on chess.com, I can certainly lose to people rated lower than myself. Truthfully, my settings only allow games against people no more than 50 points below me, with no upper limit, as that is more testing, so I lose more than I normally would if I played a wider range of players.

My local club has their monthly G30+5 today on lichess. I quit playing in those over a year ago because, even with all that time, people still tend to blitz out their moves and blunder, and it all became a pointless waste of an evening for me. Of course, it is hard for everyone to maintain their focus in longer games 'online'...especially when it's 'for fun'/no money involved. My 'Classical' (25 min +) rating there was 2258 and rising before I quit. Rapid? Similar, but only after 19 games. On chess.com my rating (I forget what it is - 2190, I think) benefited from various opponents having been caught cheating; because they gift you rating points as if you had won (as I recall), my rating there is probably well north of what it should be. It certainly is in "daily" (Correspondence), where I've been gifted hundreds of rating points by their cheat detection.

I know comparatively few people have ever played enough FIDE-rated games. I've played USCF tournament chess for right at 50 years now and, despite hundreds of tournaments, have hardly ever been able to play in a FIDE-rated event...just a World Open, and that wasn't many games. I wonder if the same holds true in other countries around the world, including those which have their own rating systems? Perhaps there is a good formula for each to match those individuals to a FIDE rating...but not likely across various time controls (?). The process just seems so convoluted, with a lot of issues in comparing ratings which are iffy from site to site anyway.

Yes, being able to have Dragon point everyone to some 'truth' as to how players of different generations might fare against each other would be unique, and great fodder for message boards indeed. Going hand in hand with that, being able to feed it 50 or so of one's own games and let it set an internal level equal to one's own would certainly be something Team Stockfish would never offer. ;) Their only reason for being is to pursue Elo.
There is a fairly accurate formula for converting between USCF and FIDE ratings. I forget right now where the formula can be seen, but I recall that around 2200 level the gap is roughly 60 elo, so subtracting 60 elo from your USCF rating should be close enough as an estimated FIDE rating for you. In the 2100 to 2200 range it seems that chess.com Rapid ratings (if you play enough games) are a close match to FIDE ratings on average, so that's another way for you to get an estimate.
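The rule of thumb above is easy to write down. A minimal sketch, assuming a flat 60-point gap: that figure is only recalled in the post for the 2200 region, the actual conversion formula is not reproduced here, and the function and constant names are hypothetical.

```python
# Gap recalled in the post for players around USCF 2200. This is an
# approximation, not the actual conversion formula, and the gap is
# likely different at other rating levels.
USCF_TO_FIDE_GAP_NEAR_2200 = 60


def estimate_fide_from_uscf(uscf_rating, gap=USCF_TO_FIDE_GAP_NEAR_2200):
    """Estimate a FIDE rating by subtracting a flat gap from a USCF rating."""
    return uscf_rating - gap


if __name__ == "__main__":
    print(estimate_fide_from_uscf(2200))
```

The gap is a parameter so that a different value can be plugged in for other rating ranges once the real formula is at hand.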

The biggest problem for estimating elo from game scores is dealing with the fact that some games are just easy boring ones where no big errors are likely, whereas others are extremely complicated where big errors are likely. This is not an impossible problem to deal with I think, but far from trivial.
Komodo rules!
Cornfed
Posts: 511
Joined: Sun Apr 26, 2020 11:40 pm
Full name: Brian D. Smith

Re: Dragon 2.6.1 Elo levels

Post by Cornfed »

lkaufman wrote: Sun Jan 30, 2022 7:47 pm
There is a fairly accurate formula for converting between USCF and FIDE ratings. I forget right now where the formula can be seen, but I recall that around 2200 level the gap is roughly 60 elo, so subtracting 60 elo from your USCF rating should be close enough as an estimated FIDE rating for you. In the 2100 to 2200 range it seems that chess.com Rapid ratings (if you play enough games) are a close match to FIDE ratings on average, so that's another way for you to get an estimate.

The biggest problem for estimating elo from game scores is dealing with the fact that some games are just easy boring ones where no big errors are likely, whereas others are extremely complicated where big errors are likely. This is not an impossible problem to deal with I think, but far from trivial.
Of course there is a correlation between 'big errors' and rating, but for someone like myself, who spent 20 years largely backing into specific English set-ups after 1.g3, 'our game' is far less exposed to those 'big errors' (at least tactical ones), and the resulting positional play is pretty simple and often recurring. So I understand what you are saying about these types of games being a 'problem' for an engine trying to derive someone's rating, when we opt for a more 'do no harm'/keep pushing approach compared to those who play more loosely and enter into more tactical play.

Good luck with what you are trying to do!