Increase in Elo ..Question For The Experts

Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Increase in Elo ..Question For The Experts

Post by Don »

Laskos wrote:
Don wrote: I'm almost positive that modern programs get a lot more Elo increase from doubling than the old programs got.
That would mean that the difference in strength between modern programs and old ones would increase with the time control. I have to check the rating lists (no time now), but I doubt this statement.

Kai
Just so you understand what I'm saying, I assume that both programs are starting from the same point on the Elo scale. Then you double the time for both and see which benefits the most. Is that what you understood?
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Increase in Elo ..Question For The Experts

Post by Laskos »

Don wrote:
Laskos wrote:
Don wrote: I'm almost positive that modern programs get a lot more Elo increase from doubling than the old programs got.
That would mean that the difference in strength between modern programs and old ones would increase with the time control. I have to check the rating lists (no time now), but I doubt this statement.

Kai
Just so you understand what I'm saying, I assume that both programs are starting from the same point on the Elo scale. Then you double the time for both and see which benefits the most. Is that what you understood?
Almost so, by comparing the difference X between 2 engines at, say, 40/4 with the difference Y between the same engines at, say, 40/40. It's easy to do this with rating lists and a bit of time for a representative sample (the errors are not so small in all these lists).

Kai
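
The comparison Kai describes can be sketched in a few lines of Python. This is only an illustration: the ratings and error bars below are made-up placeholders, not values from any rating list, the helper name is hypothetical, and combining the errors in quadrature assumes they are independent, which real list errors may not be.

```python
import math

def diff_with_error(rating_a, err_a, rating_b, err_b):
    """Rating difference A - B, with the two error bars combined in
    quadrature (assumes the errors are independent)."""
    return rating_a - rating_b, math.hypot(err_a, err_b)

# Hypothetical placeholder ratings as (Elo, +/- error); NOT from a real list.
fast = {"new": (3000, 15), "old": (2700, 15)}   # e.g. a 40/4 list
slow = {"new": (3020, 25), "old": (2700, 25)}   # e.g. a 40/40 list

x, x_err = diff_with_error(*fast["new"], *fast["old"])   # difference X
y, y_err = diff_with_error(*slow["new"], *slow["old"])   # difference Y

change = y - x
change_err = math.hypot(x_err, y_err)
doublings = math.log2(40 / 4)   # 40/4 -> 40/40 is a 10x increase in time

print(f"X = {x} +/- {x_err:.0f}, Y = {y} +/- {y_err:.0f}")
print(f"Y - X = {change} +/- {change_err:.0f} over {doublings:.2f} doublings")
```

With placeholder error bars of this size, the change in the difference is smaller than its own uncertainty, which is why a representative sample of engines would be needed.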
Steve B
Posts: 3697
Joined: Tue Jul 31, 2007 4:26 pm

Re: Increase in Elo ..Question For The Experts

Post by Steve B »

kasinp wrote:Steve, here are my thoughts:

1) SSDF shows ~ 51 Elo between G2 and G5.

It's easy to arrive at this once you notice that G3 and G4 are the same strength on 486 and P90 (their average ratings are off by 2 Elo). What remains is a direct comparison of G2 to G3-G4 on 486, and a direct comparison of G3-G4 to G5 on P90.

Note that SSDF lists G2 rating on 486 66MHz at 2234 (vs. 2350).
I am quite sure that I once read that SSDF ratings were increased by Wiki by 100 Elo to align them better with USCF. This could explain the bulk of this discrepancy.

2) Starting with 2234 Elo, and assuming 55 :wink: Elo per doubling, I would estimate G2 rating on PIII 866 MHz at:

2234 + 55*(LOG(850/33)/LOG(2)) = 2492 Elo (SSDF) or 2592 Elo (USCF).

G5 rating would be 51 Elo higher, as per point 1.
This formula uses a PIII 866MHz speed index for non-Linux systems, and a 486 50-66MHz speed index - both from the link I gave in the earlier post.

Hope this makes sense,
Peter
Excellent Peter
Thank you
I entirely forgot about the SSDF downgrading the old computer ratings by 100 Pts in the year 2000

I tested the WM (PIII 866) vs modern-day engines (Res I and Res II) and their current SSDF ratings
I see the WM stronger than the Res I by about 240 Pts and slightly weaker than the Res II
basically about 2615
so if I subtract your 51 Pts it puts the WM Chess Genius 2 at 2564 SSDF
using your formula above we get 2492 starting from 2234
if we start with the 2350 (pre-devaluation SSDF rating) we get about 2550

using the formulas provided by Bob and Matthias we should get an increase of at least 200 Pts over the 2350 486-66
this brings us to 2550+

all close enough for my purposes

I'll go with an estimated rating range for the WM (PIII-866) G2 of 2550-2575

Best Regards
Steve
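
For anyone who wants to rerun the arithmetic, Peter's extrapolation formula quoted above can be written as a small Python function. This is a sketch, not anyone's official method: the function name is made up, and the 55 Elo per doubling is the assumption Peter himself flags in his post. The inputs are the ones in his formula (2234 Elo on the 486, speed indices 33 and 850).

```python
import math

def extrapolate_rating(base_rating, base_speed, target_speed, elo_per_doubling):
    """Base rating plus a fixed Elo gain for every doubling of speed."""
    doublings = math.log2(target_speed / base_speed)
    return base_rating + elo_per_doubling * doublings

# Inputs taken from Peter's formula; 55 Elo per doubling is his assumption.
g2_ssdf = extrapolate_rating(2234, 33, 850, 55)

print(round(g2_ssdf))        # 2492, the G2 estimate on the SSDF scale
print(round(g2_ssdf) + 51)   # 2543, adding the 51 Elo between G2 and G5
```

Changing the starting rating or the Elo-per-doubling figure shows how sensitive the estimate is to those two assumptions.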
Uri Blass
Posts: 10281
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Increase in Elo ..Question For The Experts

Post by Uri Blass »

Laskos wrote:
Don wrote: I'm almost positive that modern programs get a lot more Elo increase from doubling than the old programs got.
That would mean that the difference in strength between modern programs and old ones would increase with the time control. I have to check the rating lists (no time now), but I doubt this statement.

Kai
That is not what Don means.
Don means that if you start from the same Elo, modern programs earn more from doubling (I think that you can replace "modern" with "significantly stronger").

I think that if you take 2 programs whose rating difference is more than 300 Elo and give the weaker program significantly superior hardware to get a result close to 50% at 5 minutes per game, then the stronger program is going to perform better at a longer time control.

It is not about modern programs but about stronger programs, and
I expect it to happen even if you test Crafty 23.4 against Houdini 1.5 (a difference of more than 300 Elo rating points based on the CCRL list).

The first step is to find the hardware difference at which Crafty gets 50% against Houdini at 5 minutes per game, and the second step is to test them at 1 hour per game with the same hardware difference.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Increase in Elo ..Question For The Experts

Post by Don »

Laskos wrote:
Don wrote:
Laskos wrote:
Don wrote: I'm almost positive that modern programs get a lot more Elo increase from doubling than the old programs got.
That would mean that the difference in strength between modern programs and old ones would increase with the time control. I have to check the rating lists (no time now), but I doubt this statement.

Kai
Just so you understand what I'm saying, I assume that both programs are starting from the same point on the Elo scale. Then you double the time for both and see which benefits the most. Is that what you understood?
Almost so, by comparing the difference X between 2 engines at, say, 40/4 with the difference Y between the same engines at, say, 40/40. It's easy to do this with rating lists and a bit of time for a representative sample (the errors are not so small in all these lists).

Kai
My comment applies to older programs that are several hundred Elo weaker and have large branching factors compared to modern programs. I don't believe there is a substantial difference when comparing programs that are just 3 or 4 years older or that are hundreds of Elo weaker. In such a case I would agree with you if you are not starting with equalized programs.

A way to test such a thing is to start with an ancient version of Crafty, for instance, and test it against Critter or Stockfish or Komodo. Add some time to Crafty and subtract some time from the modern program until they get a roughly even score against each other. Then increase the time for both programs by a constant factor; my hypothesis is that the modern program will be noticeably stronger.

If you compare what the old program does when doubled with what a modern program does when doubled, it won't come out the same, because the modern program may be starting out at 3000 and the old program may be starting out at 1800, and of course it's well known that doubling the time has a much larger impact on a program that is significantly weaker.

Don
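
To read the outcome of such an equalize-then-scale test, the match score at each stage can be converted to an Elo difference. The sketch below uses the standard logistic Elo model; the function name is hypothetical and the 0.60 score is a made-up number used only to show the conversion, not a result from any test.

```python
import math

def elo_diff_from_score(score):
    """Elo difference implied by a match score fraction (0 < score < 1)
    under the standard logistic Elo model."""
    return 400.0 * math.log10(score / (1.0 - score))

# Step 1: time odds are tuned until the match is roughly even.
print(elo_diff_from_score(0.50))          # 0.0

# Step 2: both time controls are multiplied by the same constant factor.
# 0.60 is a hypothetical score for the modern program at the longer control.
print(round(elo_diff_from_score(0.60)))   # 70
```

If the hypothesis holds, the score in step 2 would move away from 50% in favour of the modern program, and the size of that move in Elo is what the test measures.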
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Increase in Elo ..Question For The Experts

Post by Don »

Uri Blass wrote:
Laskos wrote:
Don wrote: I'm almost positive that modern programs get a lot more Elo increase from doubling than the old programs got.
That would mean that the difference in strength between modern programs and old ones would increase with the time control. I have to check the rating lists (no time now), but I doubt this statement.

Kai
That is not what Don means.
Don means that if you start from the same Elo, modern programs earn more from doubling (I think that you can replace "modern" with "significantly stronger").

I think that if you take 2 programs whose rating difference is more than 300 Elo and give the weaker program significantly superior hardware to get a result close to 50% at 5 minutes per game, then the stronger program is going to perform better at a longer time control.

It is not about modern programs but about stronger programs, and
I expect it to happen even if you test Crafty 23.4 against Houdini 1.5 (a difference of more than 300 Elo rating points based on the CCRL list).

The first step is to find the hardware difference at which Crafty gets 50% against Houdini at 5 minutes per game, and the second step is to test them at 1 hour per game with the same hardware difference.
Yes, that is exactly what I'm saying. The branching factors of the really strong programs are something like 2 or even less. The branching factors of the older programs are something like 5, depending on the program. And yet searching an additional ply gives almost the same Elo gain; for example, go from 4 to 5 ply and any program will benefit substantially. So in the same time it takes an old program to go from 4 to 5 ply, a modern program will get to 6 ply or even a bit more.

I don't think modern programs get quite the same benefit from doing an additional ply, but it's not that far off.

Don
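
The "6 ply or even a bit more" figure follows from simple arithmetic if search effort is assumed to grow as branching_factor ** depth. That growth model is a simplification made for this sketch (real engines do not follow it exactly), and the function name is made up; the branching factors 5 and 2 are the rough figures from the post above.

```python
import math

def extra_plies(time_factor, branching_factor):
    """Additional depth bought by multiplying the search time by time_factor,
    assuming effort grows as branching_factor ** depth."""
    return math.log(time_factor) / math.log(branching_factor)

old_bf, modern_bf = 5.0, 2.0   # rough branching factors quoted in the post

# Time factor an old program needs to go from 4 to 5 ply.
factor = old_bf ** 1

print(extra_plies(factor, old_bf))                # 1.0
print(round(extra_plies(factor, modern_bf), 2))   # 2.32, so 4 ply becomes about 6.3

# The same comparison for a single doubling of time.
print(round(extra_plies(2, old_bf), 2))           # 0.43
print(extra_plies(2, modern_bf))                  # 1.0
```

Under this model a doubling buys the modern program a full ply but the old program less than half a ply, which is the core of the argument.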
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Increase in Elo ..Question For The Experts

Post by Don »

I cannot seem to find older versions of Crafty. Does anyone know where I can get a version that is at least 10 years old?

Don

Uri Blass wrote:
Laskos wrote:
Don wrote: I'm almost positive that modern programs get a lot more Elo increase from doubling than the old programs got.
That would mean that the difference in strength between modern programs and old ones would increase with the time control. I have to check the rating lists (no time now), but I doubt this statement.

Kai
That is not what Don means.
Don means that if you start from the same Elo, modern programs earn more from doubling (I think that you can replace "modern" with "significantly stronger").

I think that if you take 2 programs whose rating difference is more than 300 Elo and give the weaker program significantly superior hardware to get a result close to 50% at 5 minutes per game, then the stronger program is going to perform better at a longer time control.

It is not about modern programs but about stronger programs, and
I expect it to happen even if you test Crafty 23.4 against Houdini 1.5 (a difference of more than 300 Elo rating points based on the CCRL list).

The first step is to find the hardware difference at which Crafty gets 50% against Houdini at 5 minutes per game, and the second step is to test them at 1 hour per game with the same hardware difference.
Roger Brown
Posts: 782
Joined: Wed Mar 08, 2006 9:22 pm

Re: Increase in Elo ..Question For The Experts

Post by Roger Brown »

Don wrote: I cannot seem to find older versions of Crafty. Does anyone know where I can get a version that is at least 10 years old?


Hello Don,

Have a look here:

http://wbec-ridderkerk.nl/html/download.htm

Later.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Increase in Elo ..Question For The Experts

Post by Laskos »

Don wrote:
Laskos wrote:
Don wrote:
Laskos wrote:
Don wrote: I'm almost positive that modern programs get a lot more Elo increase from doubling than the old programs got.
That would mean that the difference in strength between modern programs and old ones would increase with the time control. I have to check the rating lists (no time now), but I doubt this statement.

Kai
Just so you understand what I'm saying, I assume that both programs are starting from the same point on the Elo scale. Then you double the time for both and see which benefits the most. Is that what you understood?
Almost so, by comparing the difference X between 2 engines at, say, 40/4 with the difference Y between the same engines at, say, 40/40. It's easy to do this with rating lists and a bit of time for a representative sample (the errors are not so small in all these lists).

Kai
My comment applies to older programs that are several hundred Elo weaker and have large branching factors compared to modern programs. I don't believe there is a substantial difference when comparing programs that are just 3 or 4 years older or that are hundreds of Elo weaker. In such a case I would agree with you if you are not starting with equalized programs.

A way to test such a thing is to start with an ancient version of Crafty, for instance, and test it against Critter or Stockfish or Komodo. Add some time to Crafty and subtract some time from the modern program until they get a roughly even score against each other. Then increase the time for both programs by a constant factor; my hypothesis is that the modern program will be noticeably stronger.

If you compare what the old program does when doubled with what a modern program does when doubled, it won't come out the same, because the modern program may be starting out at 3000 and the old program may be starting out at 1800, and of course it's well known that doubling the time has a much larger impact on a program that is significantly weaker.

Don
I got what you meant, but I am not sure that would be a fair comparison. If one pits a strong program at 1 second against a weak program at 2 minutes (to equalize the strength), then at 10 seconds and 20 minutes (10x the time control) to compare the difference, and we see that the stronger program improves more, that could mean many things: hash filling, optimization for a certain tree size, etc.

I could be wrong, but intuitively, doubling the time close to the engine's optimum gives the same Elo gain for weak and strong engines. Each ply was worth more Elo for weaker, older engines with a high branching factor. I would rather say that modern engines gain less from each new ply than old (weak) engines. Besides that, the Elo gain per ply diminishes with depth in the case of modern engines.

One has to test to see what really happens.

Kai
Uri Blass
Posts: 10281
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Increase in Elo ..Question For The Experts

Post by Uri Blass »

Don wrote:
Laskos wrote:
Don wrote:
Laskos wrote:
Don wrote: I'm almost positive that modern programs get a lot more Elo increase from doubling than the old programs got.
That would mean that the difference in strength between modern programs and old ones would increase with the time control. I have to check the rating lists (no time now), but I doubt this statement.

Kai
Just so you understand what I'm saying, I assume that both programs are starting from the same point on the Elo scale. Then you double the time for both and see which benefits the most. Is that what you understood?
Almost so, by comparing the difference X between 2 engines at, say, 40/4 with the difference Y between the same engines at, say, 40/40. It's easy to do this with rating lists and a bit of time for a representative sample (the errors are not so small in all these lists).

Kai
My comment applies to older programs that are several hundred Elo weaker and have large branching factors compared to modern programs. I don't believe there is a substantial difference when comparing programs that are just 3 or 4 years older or that are hundreds of Elo weaker. In such a case I would agree with you if you are not starting with equalized programs.

A way to test such a thing is to start with an ancient version of Crafty, for instance, and test it against Critter or Stockfish or Komodo. Add some time to Crafty and subtract some time from the modern program until they get a roughly even score against each other. Then increase the time for both programs by a constant factor; my hypothesis is that the modern program will be noticeably stronger.

If you compare what the old program does when doubled with what a modern program does when doubled, it won't come out the same, because the modern program may be starting out at 3000 and the old program may be starting out at 1800, and of course it's well known that doubling the time has a much larger impact on a program that is significantly weaker.

Don
I think that it is not only the search; I believe stronger programs, when they start from the same Elo, usually earn more from doubling the time also because of better evaluation.

Of course the difference is going to be bigger with an ancient version of Crafty.

Maybe Bob can try to test Crafty against Stockfish under unequal conditions (so that Crafty gets 50% against Stockfish), then multiply the time control by 10 and test again to see the result.