Elo points gain from doubling time

Jouni
Posts: 3283
Joined: Wed Mar 08, 2006 8:15 pm

Re: Elo points gain from doubling time

Post by Jouni »

So these Houdini tests indicate that depth-based tests are quite useless; only thinking time matters. But it's stunning that you still get the same 70+ points from doubling as 20 years ago, when engines were 1000 points weaker!
Jouni
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Elo points gain from doubling time

Post by hgm »

Don wrote:
hgm wrote:The result is weird anyway: the draw fraction seems to go up enormously at higher depth. I would consider 80% draws a ridiculously high draw fraction, between nearly equal engines.
I don't think it is - the number of draws goes up pretty steadily with the quality of the players. I think in a few years we are going to see the ratings of the top programs get really compressed due to this. You will play a 100 games match and they will be mostly draws. It will be like checkers is now.
The Elo model is based on the implicit assumption that results are only dependent on the rating difference. If the situation gets drastically different at high ratings than at low ratings, it pretty much loses its meaning. In particular, if the draw rate would go up and approach 100%, expressing this as a low Elo difference is very misleading. You could have a situation where the draw rate is 90%, but the win rate 9.9% and the loss rate 0.1%. Which is very different from a situation where you would have 88% draw, 10.9% win and 1.1% loss probability. Both would have a 55% average score, and thus ~ 35 Elo difference, but in the latter situation the weaker player would have approximately 10 times as much chance to beat the strong one in a match over 10 games.
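
The arithmetic in this example can be checked with a short sketch. It assumes the standard logistic Elo formula; the function names are illustrative only and do not come from any rating tool.

```python
import math

def expected_score(win, draw, loss):
    """Average score per game: a win counts 1, a draw 0.5, a loss 0."""
    assert abs(win + draw + loss - 1.0) < 1e-9
    return win + 0.5 * draw

def elo_difference(score):
    """Rating difference implied by an expected score (logistic Elo model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# The two result distributions from the post, seen from the stronger player.
scenario_a = expected_score(win=0.099, draw=0.90, loss=0.001)
scenario_b = expected_score(win=0.109, draw=0.88, loss=0.011)

print(round(scenario_a, 3), round(elo_difference(scenario_a), 1))
print(round(scenario_b, 3), round(elo_difference(scenario_b), 1))
```

Both distributions give the same average score (0.549) and therefore the same Elo difference in the mid-30s, which is the point: the rating difference alone cannot tell them apart.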
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Elo points gain from doubling time

Post by Rebel »

Jouni wrote:So these Houdini tests indicate that depth-based tests are quite useless; only thinking time matters. But it's stunning that you still get the same 70+ points from doubling as 20 years ago, when engines were 1000 points weaker!
20 years ago engines had a branching factor of 3-4. Nowadays it's 1½-2. That explains why a doubling in speed still pays off so much.
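
A rough way to put numbers on this, assuming search time grows like b^depth for an effective branching factor b (a simplification, and the function name is only illustrative): doubling the time then buys log 2 / log b extra plies.

```python
import math

def extra_plies_from_doubling(branching_factor):
    """Extra search depth bought by doubling time, assuming time ~ b**depth."""
    return math.log(2.0) / math.log(branching_factor)

for b in (4.0, 3.0, 2.0, 1.5):
    print(f"b = {b}: about {extra_plies_from_doubling(b):.2f} extra plies per doubling")
```

Under that assumption a doubling is worth half a ply at b = 4 but a full ply at b = 2, so each doubling adds more depth now than it did then.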
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Elo points gain from doubling time

Post by Rebel »

BubbaTough wrote:
Don wrote: Another problem I have identified is the contempt factor. I am of the opinion that the relative strength of the opponent should be communicated to the engines and built into the protocol, because this is becoming an important issue too. It can be worth 100 Elo or more if you are playing way up or down by several hundred Elo (I studied this too). Unless that is handled, a perfect player will draw far too many games against weak opponents. In human play it's rare not to have a rough idea of the strength of your opponent. That can be communicated via some user-defined contempt factor, but I would like to see it built into the protocol.
Don
I think the big problem is not building strength-of-opponent information into the protocol (in fact I think there is support for this using the UCI_Opponent option); it's getting people to use it. If you play on ICC you can get this information automatically, but for things like testing and rating groups it is a long hill to climb to convince the community to support that.

-Sam
I am in full agreement; computer chess is immature in this area. As a chess player you have the right to know the Elo of your opponent. What you can currently do (without the Elo information) is keep track of the score in an engine-engine match and, depending on the match score, modify your contempt factor or playing style. However, I don't know how such tricks will be received by CC users and programmers. Perhaps a poll is in order?

I don't do such tricks and learning is disabled by default because those are the historic unwritten rules. The question is if the CC community is ready for a change.

Personally I like such a new challenge.
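
For illustration, a minimal sketch of how an engine might use the UCI_Opponent string mentioned above. The value format "<title> <elo|none> <computer|human> <name>" follows the UCI specification; the mapping from rating difference to contempt (the scale and limit parameters) is purely hypothetical and not taken from any engine.

```python
def parse_uci_opponent(value):
    """Split a UCI_Opponent value: '<title> <elo|none> <computer|human> <name>'."""
    title, elo, kind, name = value.split(maxsplit=3)
    return {
        "title": None if title == "none" else title,
        "elo": None if elo == "none" else int(elo),
        "is_computer": kind == "computer",
        "name": name,
    }

def contempt_from_ratings(own_elo, opponent_elo, scale=0.1, limit=50):
    """Hypothetical mapping from rating difference to contempt in centipawns."""
    if opponent_elo is None:
        return 0  # no rating information: stay neutral
    raw = (own_elo - opponent_elo) * scale
    return int(max(-limit, min(limit, raw)))

# "SomeEngine" and both ratings are made-up example values.
opponent = parse_uci_opponent("none 2600 computer SomeEngine")
print(opponent, contempt_from_ratings(3000, opponent["elo"]))
```

Whether a linear mapping like this is sensible would have to be tested; the sketch only shows that the information needed is already available once the GUI sends it.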
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Elo points gain from doubling time

Post by Laskos »

lkaufman wrote:
Laskos wrote:
lkaufman wrote:
No data at 40/2, but quite good data at the CCRL time limit of 40/40', exactly ten times slower than their blitz level. My method is simply to compare the ratings for Komodo 64 bit with Komodo 32 bit, since with every version the 64 bit was identical but ran almost exactly twice as fast, perfect for this study! There are seven Komodo versions with 40/40 ratings for both 64 bit and 32 bit (eight if you count the predecessor Doch), and the average elo difference was 71 (72 counting Doch). Moreover there was surprisingly little spread, every value came out in the range 62 to 82! What does your formula predict for this, allowing for the hardware adjustment and also bearing in mind that the 32 bit data corresponds to half the stated time control? If you want to get real fancy, you can try to allow for the progression in elo of the Komodo versions; the 64 bit values (starting with the Doch value) were 2888, 2950, 2981, 3015, 3098, 3142, 3148, and 3158.
So in round numbers, my conclusion is that going from 3" to 6" per move on old hardware (so 1.5" to 3" on your hardware) was worth 90 elo for the average Komodo version, and going from 30" to 60" on old hardware (15" to 30" on your hardware) was worth 70 elo.
So, I have to account for my hardware being 2 times faster and 32 bit being 2 times slower, for a total factor of 4 compared to the 6'' and 60'' CCRL levels respectively, correct? In this case my rule of thumb formula gives 98 and 62 Elo points respectively. Pretty close to your empirical 90 and 70, a bit steeper slope in my case, but well within our error margins.
Assuming that your formula is based on the time before the doubling, that is correct. So yes, we agree reasonably well. If we split the difference and round off, we can say that doubling for Komodo is worth 95 at bullet speeds on good hardware and 65 at 15" per move on good hardware.
My test was based on some 2,000-4,000 game matches; you probably have more data from the lists. I would adjust my rule of thumb formula, based on the time before the doubling, to better fit your numbers:

Points gained from doubling = 100*(time per move in seconds on a modern core)^(-0.15). This would give 94 and 67 points respectively. For 40 moves in 2 hours on a modern core it would give 46 points.
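
The three figures can be reproduced with a few lines; the function name is only for illustration, and 40 moves in 2 hours is taken as 180 seconds per move.

```python
def doubling_gain(seconds_per_move):
    """Rule-of-thumb Elo gain from doubling time: 100 * t**(-0.15)."""
    return 100.0 * seconds_per_move ** -0.15

cases = (("1.5 s/move", 1.5), ("15 s/move", 15.0), ("40 moves in 2 hours", 7200.0 / 40))
for label, t in cases:
    print(f"{label}: {doubling_gain(t):.0f} Elo per doubling")
```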
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Elo points gain from doubling time

Post by Don »

hgm wrote:
Don wrote:
hgm wrote:The result is weird anyway: the draw fraction seems to go up enormously at higher depth. I would consider 80% draws a ridiculously high draw fraction, between nearly equal engines.
I don't think it is - the number of draws goes up pretty steadily with the quality of the players. I think in a few years we are going to see the ratings of the top programs get really compressed due to this. You will play a 100 games match and they will be mostly draws. It will be like checkers is now.
The Elo model is based on the implicit assumption that results are only dependent on the rating difference. If the situation gets drastically different at high ratings than at low ratings, it pretty much loses its meaning. In particular, if the draw rate would go up and approach 100%, expressing this as a low Elo difference is very misleading. You could have a situation where the draw rate is 90%, but the win rate 9.9% and the loss rate 0.1%. Which is very different from a situation where you would have 88% draw, 10.9% win and 1.1% loss probability. Both would have a 55% average score, and thus ~ 35 Elo difference, but in the latter situation the weaker player would have approximately 10 times as much chance to beat the strong one in a match over 10 games.
Yes, but I don't know how to interpret that. If you can only score 55%, the rating would still reflect that, even if in some way you are highly superior and your opponent must get his 45% with draws.

The issue is that ELO doesn't express chess skill at the lowest levels but it does measure your performance.

I can imagine a super-komodo that has improved significantly and is running on a computer 100x faster - and rarely loses a game even to a perfect player - and yet the perfect player in some abstract sense is vastly superior as a player.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Elo points gain from doubling time

Post by hgm »

One could also say that the challenge simply is not big enough to resolve the difference. When you want to compare two nearly-perfect engines, starting them from positions in the center of a broad draw zone makes an insensitive test. Most games will be draws, and the time spent to play them wasted.

OTOH, if you would start from a set of positions all very close to the win-draw and loss-draw boundary, you might see a huge difference.

(In Chess the possibility of draws complicates this a bit; in a game that has no draws, there is only one game-theoretical boundary, and it would be intuitively obvious that you have to start from a position near that boundary. No one would attempt to measure Elo by starting from a position with Queen Odds. Because that would be so far from the win-loss boundary that even at the level of TSCP, the outcome would be a 100% certainty, telling you absolutely nothing about the quality of the players.)
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Elo points gain from doubling time

Post by Laskos »

hgm wrote:
Don wrote:
hgm wrote:The result is weird anyway: the draw fraction seems to go up enormously at higher depth. I would consider 80% draws a ridiculously high draw fraction, between nearly equal engines.
I don't think it is - the number of draws goes up pretty steadily with the quality of the players. I think in a few years we are going to see the ratings of the top programs get really compressed due to this. You will play a 100 games match and they will be mostly draws. It will be like checkers is now.
The Elo model is based on the implicit assumption that results are only dependent on the rating difference. If the situation gets drastically different at high ratings than at low ratings, it pretty much loses its meaning. In particular, if the draw rate would go up and approach 100%, expressing this as a low Elo difference is very misleading. You could have a situation where the draw rate is 90%, but the win rate 9.9% and the loss rate 0.1%. Which is very different from a situation where you would have 88% draw, 10.9% win and 1.1% loss probability. Both would have a 55% average score, and thus ~ 35 Elo difference, but in the latter situation the weaker player would have approximately 10 times as much chance to beat the strong one in a match over 10 games.
That's an interesting remark. Maybe in these matches with some 95% draws, LOS (likelihood of superiority) over 100 or 1000 games is more meaningful, along with an LOS matrix for several engines.
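
As a sketch of why LOS suits draw-heavy matches: in the commonly used normal approximation it depends only on wins and losses, so the draws drop out entirely. The function name and the example counts below are illustrative, not measured data.

```python
import math

def likelihood_of_superiority(wins, losses):
    """Normal-approximation LOS; draws do not enter the formula."""
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # no decisive games: no evidence either way
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))

# Hypothetical 1000-game match with 95% draws: 30 wins, 20 losses, 950 draws.
print(f"LOS = {likelihood_of_superiority(30, 20):.3f}")
```

An LOS matrix for several engines would simply apply this to every pair.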
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Elo points gain from doubling time

Post by Don »

hgm wrote:One could also say that the challenge simply is not big enough to resolve the difference. When you want to compare two nearly-perfect engines, starting them from positions in the center of a broad draw zone makes an insensitive test. Most games will be draws, and the time spent to play them wasted.

OTOH, if you would start from a set of positions all very close to the win-draw and loss-draw boundary, you might see a huge difference.

(In Chess the possibility of draws complicates this a bit; in a game that has no draws, there is only one game-theoretical boundary, and it would be intuitively obvious that you have to start from a position near that boundary. No one would attempt to measure Elo by starting from a position with Queen Odds. Because that would be so far from the win-loss boundary that even at the level of TSCP, the outcome would be a 100% certainty, telling you absolutely nothing about the quality of the players.)
You could collect these positions statistically. If I tracked the results of each opening I could cull the drawish positions. I have never analyzed this opening by opening, so I don't know whether I could expect to see a big difference. We have 7 sets of openings and 1 endgame set, some variable depth and others shallow. We usually use our big shallow set, which goes exactly 10 ply and has 35,533 positions, so it would take a very long time to get a lot of statistics on each one, but it might be interesting.
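
A minimal sketch of such culling, assuming game results are available as (opening id, result) pairs. The thresholds and names are hypothetical, and openings with too few games are kept rather than judged on thin data.

```python
from collections import defaultdict

def cull_drawish_openings(results, max_draw_rate=0.8, min_games=20):
    """Return opening ids whose observed draw rate is at most max_draw_rate.

    results: iterable of (opening_id, outcome), outcome in {"1-0", "0-1", "1/2-1/2"}.
    Openings with fewer than min_games games are kept: too little data to judge.
    """
    games = defaultdict(int)
    draws = defaultdict(int)
    for opening_id, outcome in results:
        games[opening_id] += 1
        if outcome == "1/2-1/2":
            draws[opening_id] += 1
    return sorted(
        opening_id
        for opening_id, n in games.items()
        if n < min_games or draws[opening_id] / n <= max_draw_rate
    )

# Made-up data: opening "A" is all draws, "B" is half draws, "C" has too few games.
sample = ([("A", "1/2-1/2")] * 20
          + [("B", "1-0")] * 10 + [("B", "1/2-1/2")] * 10
          + [("C", "1/2-1/2")] * 5)
print(cull_drawish_openings(sample))
```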

I would not want to do this with positions that are actually wins or losses - that is too artificial - and of course it's virtually impossible to determine that. But I would not mind having positions that are difficult to hold to a draw.

We have just a few gambits built into our set too, but I think gambits tend to be more drawish because most of them turn a real advantage into an equal game.

Another possibility is some sort of odds which is really the same as starting from an unbalanced position. You could take away castling rights or do other things to force some imbalance. Unfortunately, to play tens of thousands of games you need some serious variety and it would be a challenge producing an opening book for this usage.

You can make the programs produce their own variety but this degrades the quality of the moves pretty significantly and doesn't actually provide the right type of variety unless the randomness is pretty severe.

The problem with computer checkers of course is a lot more severe. I found this on the web from 2004:

Cake won a 624-game computer checkers match against Kingsrow by a score of +3 -1 = 620.

I think Cake is one of the top checkers programs and there were only 4 decisive games in a 624 game match if I am reading this correctly.

Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Elo points gain from doubling time

Post by Sven »

hgm wrote:
Don wrote:
hgm wrote:The result is weird anyway: the draw fraction seems to go up enormously at higher depth. I would consider 80% draws a ridiculously high draw fraction, between nearly equal engines.
I don't think it is - the number of draws goes up pretty steadily with the quality of the players. I think in a few years we are going to see the ratings of the top programs get really compressed due to this. You will play a 100 games match and they will be mostly draws. It will be like checkers is now.
The Elo model is based on the implicit assumption that results are only dependent on the rating difference. If the situation gets drastically different at high ratings than at low ratings, it pretty much loses its meaning. In particular, if the draw rate would go up and approach 100%, expressing this as a low Elo difference is very misleading.
Why is this misleading? Close to 100% draws means both players are about equal, i.e. have a low rating difference. What's wrong with that?
hgm wrote:You could have a situation where the draw rate is 90%, but the win rate 9.9% and the loss rate 0.1%. Which is very different from a situation where you would have 88% draw, 10.9% win and 1.1% loss probability. Both would have a 55% average score, and thus ~ 35 Elo difference,
Both situations are very similar, and you should not construct a relation between the number of wins and the number of losses when, as in the example you gave, the majority of games ends with a draw. That's a very artificial relation because of the small numbers involved. In fact 9.9% wins and 10.9% wins are very close to each other. That's 20 out of 1000 games which either all end with a draw (in the 90% case) or with 10 more wins and 10 more losses (in the 88% case). 980 other games remain untouched.
hgm wrote:but in the latter situation the weaker player would have approximately 10 times as much chance to beat the strong one in a match over 10 games.
I don't understand that part. Do you mean "beat" in the sense of winning one of the games, or "beat" as in winning the match? The latter I won't believe until you show it. Perhaps you are talking about some 0.0x% vs. 0.00x%?

Sven
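
The disputed claim can be checked directly under one reading of "beat": the weaker player finishes a 10-game match with more wins than losses. The sketch below assumes independent games with fixed per-game probabilities; the function name is illustrative.

```python
def match_win_probability(p_win, p_draw, p_loss, games=10):
    """Probability of finishing a match with more wins than losses.

    Exact dynamic programming over the (wins - losses) margin, assuming
    independent games with fixed per-game probabilities.
    """
    dist = {0: 1.0}  # margin -> probability
    for _ in range(games):
        nxt = {}
        for margin, prob in dist.items():
            for step, p in ((1, p_win), (0, p_draw), (-1, p_loss)):
                nxt[margin + step] = nxt.get(margin + step, 0.0) + prob * p
        dist = nxt
    return sum(prob for margin, prob in dist.items() if margin > 0)

# Seen from the weaker player: its wins are the stronger player's losses.
weak_a = match_win_probability(p_win=0.001, p_draw=0.90, p_loss=0.099)
weak_b = match_win_probability(p_win=0.011, p_draw=0.88, p_loss=0.109)
print(f"90% draws: {weak_a:.4f}  88% draws: {weak_b:.4f}  ratio: {weak_b / weak_a:.1f}")
```

Under these assumptions the weaker side wins the match roughly ten times as often in the 88% draw case, although both probabilities are small (a few percent versus a few tenths of a percent).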