CCRL 40/4 lists updated (11th August 2012)

Modern Times
Posts: 3751
Joined: Thu Jun 07, 2012 11:02 pm

Re: CCRL 40/4 lists updated (11th August 2012)

Post by Modern Times »

lkaufman wrote:
It's easy to measure the relative speedup from using SSE for different engines. I don't have exact figures now, and it varied a bit over different versions, but maybe we get 7% or so more than others from it. Based on the CEGT ratings for Komodo 64 bit and 32 bit (six different versions), we average 1.3 elo for each percent speedup at the 40/4 level, so I guess SSE is close to ten elo. I still have to account for the remainder.
I'd give you 5 Elo max :) So small you can't really measure it. But who knows, you would have to run tens of thousands of games to find out. And CEGT certainly didn't do that for the six different versions.
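
As a rough check on the "tens of thousands of games" figure, here is a minimal sketch of how many games a single head-to-head match needs before its 95% error margin shrinks to a given Elo value. It assumes two near-equal engines and a draw ratio of 35%; both the draw ratio and the function name `games_needed` are illustrative assumptions, not figures from CCRL or CEGT.

```python
import math

# Slope of the logistic Elo curve at a 50% score: d(Elo)/d(score) = 1600 / ln(10).
ELO_PER_SCORE = 1600.0 / math.log(10)


def games_needed(elo_margin, draw_ratio=0.35, z=1.96):
    """Games required so that the z-sigma error margin is about elo_margin.

    Assumes near-equal engines, so the per-game score variance is
    (1 - draw_ratio) / 4. The default draw_ratio is a made-up value.
    """
    sigma_game = math.sqrt((1.0 - draw_ratio) / 4.0)
    score_margin = elo_margin / ELO_PER_SCORE  # linear approximation near 50%
    return math.ceil((z * sigma_game / score_margin) ** 2)


for margin in (20, 10, 5):
    print(f"+/-{margin} Elo needs about {games_needed(margin)} games")
```

Comparing two versions that were each rated separately requires roughly twice as many games per version, because the errors of the two ratings add in quadrature.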
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: CCRL 40/4 lists updated (11th August 2012)

Post by Adam Hair »

lkaufman wrote:
Modern Times wrote:
lkaufman wrote: 1. I believe your cpu is pre-sse4. Since Komodo really suffers on non-sse4 machines (compared to other engines), that probably accounts for the bulk of the 20 elo. Do your other testers have sse4 machines or not?
You keep saying this, but where is the proof? We found no difference between Komodo 4 SSE and non-SSE at 40/40, and we ran hundreds of games with each. It is incredibly wishful thinking to suppose SSE is worth nearly 20 Elo. What evidence do you have of that?
It's easy to measure the relative speedup from using SSE for different engines. I don't have exact figures now, and it varied a bit over different versions, but maybe we get 7% or so more than others from it. Based on the CEGT ratings for Komodo 64 bit and 32 bit (six different versions), we average 1.3 elo for each percent speedup at the 40/4 level, so I guess SSE is close to ten elo. I still have to account for the remainder.
1.3 Elo per percent speedup seems a bit high at 40/4'. 0.7 to 1.0 seems to be more realistic based on tests I have done and data that Don has shared. I will not say you are wrong (though ratings list data may not be accurate enough to make assumptions, be it CCRL, CEGT, IPON, etc...), for I have not directly tested Komodo in this manner since v2.03. But, based on the fixed nodes, fixed depth, and time odds tests as well as studying Komodo's move selection as depth is increased, I would be surprised if an accurate measurement would find 1.3 Elo per percent speedup.
Uri Blass
Posts: 10896
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: CCRL 40/4 lists updated (11th August 2012)

Post by Uri Blass »

Adam Hair wrote:
1.3 Elo per percent speedup seems a bit high at 40/4'. 0.7 to 1.0 seems to be more realistic based on tests I have done and data that Don has shared. I will not say you are wrong (though ratings list data may not be accurate enough to make assumptions, be it CCRL, CEGT, IPON, etc...), for I have not directly tested Komodo in this manner since v2.03. But, based on the fixed nodes, fixed depth, and time odds tests as well as studying Komodo's move selection as depth is increased, I would be surprised if an accurate measurement would find 1.3 Elo per percent speedup.
I do not know how much Komodo gains from a 1% speed improvement,
and I am only going to say that
1.3 Elo per percent of speedup translates to something near 90 Elo for being twice as fast.
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: CCRL 40/4 lists updated (11th August 2012)

Post by lkaufman »

Uri Blass wrote:
I do not know how much Komodo gains from a 1% speed improvement,
and I am only going to say that
1.3 Elo per percent of speedup translates to something near 90 Elo for being twice as fast.
That's exactly how I got this. There are six Komodo versions with both 64 bit and 32 bit versions on CEGT blitz list. The average rating difference was exactly 90 elo, with a fairly modest variance and no particular trend. To the best of my recollection in all versions the 64 bit was extremely close to twice the speed of the 32 bit. So double speed = 90 elo, from which I got 1.3 elo per percent.
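
The conversion between "Elo per doubling" and "Elo per percent" can be sketched as follows, assuming that Elo grows linearly with the base-2 logarithm of speed. That linearity is a modelling assumption, and the function names are illustrative only.

```python
import math


def elo_from_speedup(elo_per_doubling, percent=1.0):
    """Elo gained from a speedup of `percent` percent.

    Assumes Elo is linear in log2(speed), so a 100% speedup (doubling)
    returns exactly elo_per_doubling.
    """
    return elo_per_doubling * math.log2(1.0 + percent / 100.0)


def doubling_from_percent(elo_per_one_percent):
    """Inverse view: Elo per doubling implied by a given Elo per 1% speedup."""
    return elo_per_one_percent / math.log2(1.01)


print(round(elo_from_speedup(90.0), 2))       # Elo per 1% speedup at 90 Elo per doubling
print(round(elo_from_speedup(90.0, 7.0), 1))  # Elo from a 7% speedup
print(round(doubling_from_percent(1.3), 1))   # Elo per doubling implied by 1.3 Elo per 1%
```

Under this model, 90 Elo per doubling works out to roughly 1.3 Elo per 1%, and a 7% speedup to a little under 9 Elo, in line with the figures quoted in the thread.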
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: CCRL 40/4 lists updated (11th August 2012)

Post by Adam Hair »

Uri Blass wrote:
I do not know how much Komodo gains from a 1% speed improvement,
and I am only going to say that
1.3 Elo per percent of speedup translates to something near 90 Elo for being twice as fast.
Of course you and Larry are correct. If my mathematical skills keep eroding at this pace, I will be lucky if I can count the candles on my next birthday cake :oops:.

To correct myself, 90 Elo per doubling at 40/4' is not at odds with what I have seen. Anything from ~1 to 1.4 Elo per 1% increase in speed seems credible to me. I measured ~1.2 Elo for Gaviota.
Modern Times
Posts: 3751
Joined: Thu Jun 07, 2012 11:02 pm

Re: CCRL 40/4 lists updated (11th August 2012)

Post by Modern Times »

Adam Hair wrote: To correct myself, 90 Elo per doubling at 40/4' is not at odds with what I have seen. Anything from ~1 to 1.4 Elo per 1% increase in speed seems credible to me. I measured ~1.2 Elo for Gaviota.
And how much faster is the SSE version? 7% x 1.2 = about 8 Elo.
But personally I don't believe it; in actual experience with Komodo 4, I don't see much of anything at all.

Of course Intel vs AMD may be a factor. I am now of the view that Komodo *does* under-perform a little on AMD, and that may be why I am not seeing any increase. My testing is almost exclusively on AMD (mainly Phenom II).
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: CCRL 40/4 lists updated (11th August 2012)

Post by lkaufman »

Modern Times wrote:
Adam Hair wrote: To correct myself, 90 Elo per doubling at 40/4' is not at odds with what I have seen. Anything from ~1 to 1.4 Elo per 1% increase in speed seems credible to me. I measured ~1.2 Elo for Gaviota.
And how much faster is the SSE version? 7% x 1.2 = about 8 Elo.
But personally I don't believe it; in actual experience with Komodo 4, I don't see much of anything at all.

Of course Intel vs AMD may be a factor. I am now of the view that Komodo *does* under-perform a little on AMD, and that may be why I am not seeing any increase. My testing is almost exclusively on AMD (mainly Phenom II).
It might be interesting to see what the blitz lists look like if you could separate AMD from Intel. I don't know how hard that would be for you. We haven't yet confirmed an AMD/Intel difference in our own testing, strictly on SSE4 machines. It wouldn't help explain our discrepancy, because our own distributed testing is now a mixture of AMD and Intel, just like yours. Maybe it's SSE (7% x 1.3 = 9 elo) + a few elo from differing books + a couple elo from tablebases + a few elo from sample error.
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: CCRL 40/4 lists updated (11th August 2012)

Post by geots »

Adam Hair wrote:
lkaufman wrote:
Adam Hair wrote:Here are all of the relevant details that I can think of:

OS: Windows XP 64-bit
CPU: Intel QX6700 at 3.05 GHz
Time Control: 40/3'
GUI: cutechess-cli
Hash: 128 MB
EGTB: None
Starting Positions: PGN of ~17,900 positions 4 moves deep
Resign: off
Draws: game adjudicated as a draw if both engines' score is within 50 centipawns after 250 moves. I do not remember if cutechess uses the 50 moves rule (I think it does).



lkaufman wrote: Two comments:


1. I believe your cpu is pre-sse4. Since Komodo really suffers on non-sse4 machines (compared to other engines), that probably accounts for the bulk of the 20 elo. Do your other testers have sse4 machines or not?
Yes, two 40/4 testers have SSE4 CPUs. And our results for Komodo 4 showed no measurable difference between non-SSE4 and SSE4. Though, if we played 20,000 games, it is possible that a statistically significant difference would be found.
lkaufman wrote: 2. We learned that it is very important for testers to use the 50 move rule. If they do not, engines may make ridiculous moves when they think the 50 move rule is about to apply. You should verify that it does use the 50 move rule and switch if it does not.

Thanks for your answers and your testing!
I have confirmed that cutechess does use the 50 move limit. I was 99% certain before; now I am 100% certain since at least 1 game was adjudicated as a draw because of the 50 move limit.

With Ilari's post, I am 110% certain :)



Anyone who thinks SSE would add 20 elo had better take a long look in the mirror. It would be next to impossible for it to ever add a double-digit elo gain. I'm thinking 3 or 4 elo tops, maybe 6 elo in an extreme case, but 10 to 20? Either pure bullshit, or someone is chasing rainbows; you pick.


george


PS: One other thing everyone should keep in mind. If you are beta testing an engine for a future release, at least 50% of your testing should be at time controls that the most prominent testing groups use. Beta testing with no "repeating" time controls, then seeing it rated with nothing BUT repeating controls will make more of an elo difference than SSE could ever think of making. (This whole set of threads is a long journey to nowhere!)
Uri Blass
Posts: 10896
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: CCRL 40/4 lists updated (11th August 2012)

Post by Uri Blass »

geots wrote:
Anyone who thinks SSE would add 20 elo had better take a long look in the mirror. It would be next to impossible for it to ever add a double-digit elo gain. I'm thinking 3 or 4 elo tops, maybe 6 elo in an extreme case, but 10 to 20? Either pure bullshit, or someone is chasing rainbows; you pick.


george


PS: One other thing everyone should keep in mind. If you are beta testing an engine for a future release, at least 50% of your testing should be at time controls that the most prominent testing groups use. Beta testing with no "repeating" time controls, then seeing it rated with nothing BUT repeating controls will make more of an elo difference than SSE could ever think of making. (This whole set of threads is a long journey to nowhere!)
I do not understand your confidence that a 10 Elo difference is impossible.

CCRL did not play enough games with Komodo to get the statistical error below 10 Elo, so the fact that they see no difference between SSE and non-SSE proves nothing.
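
To illustrate the point, here is a minimal sketch that turns a win/draw/loss record into an Elo difference with an approximate 95% interval. The 300-game result used below is hypothetical, not CCRL data, and `elo_with_margin` is an illustrative name; the interval uses a simple normal approximation on the match score.

```python
import math


def to_elo(score):
    """Logistic Elo difference implied by an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match result.

    Uses the observed per-game score variance and a normal approximation,
    so it is only meaningful when score +/- z * se stays inside (0, 1).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * score ** 2) / n
    se = math.sqrt(variance / n)
    return to_elo(score), to_elo(score - z * se), to_elo(score + z * se)


# Hypothetical 300-game match: 105 wins, 100 draws, 95 losses.
elo, low, high = elo_with_margin(105, 100, 95)
print(f"Elo {elo:+.1f}, 95% interval {low:+.1f} to {high:+.1f}")
```

With a few hundred games the interval spans several tens of Elo, so a true difference of 10 Elo and no difference at all are both consistent with such a result.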
Uri Blass
Posts: 10896
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: CCRL 40/4 lists updated (11th August 2012)

Post by Uri Blass »

Modern Times wrote:
Adam Hair wrote: To correct myself, 90 Elo per doubling at 40/4' is not at odds with what I have seen. Anything from ~1 to 1.4 Elo per 1% increase in speed seems credible to me. I measured ~1.2 Elo for Gaviota.
And how much faster is the SSE version? 7% x 1.2 = about 8 Elo.
But personally I don't believe it; in actual experience with Komodo 4, I don't see much of anything at all.

Of course Intel vs AMD may be a factor. I am now of the view that Komodo *does* under-perform a little on AMD, and that may be why I am not seeing any increase. My testing is almost exclusively on AMD (mainly Phenom II).
I think that the statistical error is too high to measure a difference of 9 or 10 Elo between SSE and non-SSE for Komodo, including Komodo 4, even if we ignore Intel vs AMD.