Jumbo 0.6.96 64-bit Gauntlet for CCRL 40/40
-
- Posts: 41455
- Joined: Sun Feb 26, 2006 10:52 am
- Location: Auckland, NZ
Jumbo 0.6.96 64-bit Gauntlet for CCRL 40/40
gbanksnz at gmail.com
-
- Posts: 41455
- Joined: Sun Feb 26, 2006 10:52 am
- Location: Auckland, NZ
Re: Jumbo 0.6.96 64-bit Gauntlet for CCRL 40/40
CCRL 40/40 Rating List - Custom engine selection
1009555 games played by 2389 programs, run by 22 testers
Ponder off, General books (up to 12 moves), 3-4-5 piece EGTB
Time control: Equivalent to 40 moves in 40 minutes on Athlon 64 X2 4600+ (2.4 GHz), about 15 minutes on a modern Intel CPU.
Computed on March 23, 2019 with Bayeselo based on 1'009'555 games
Tested by CCRL team, 2005-2019, http://computerchess.org.uk/ccrl/4040/
Rank Engine               Elo    +    -   Score   AvOp  Games
1    Jumbo 0.6.51 64-bit  2492  +28  -28  50.4%   -2.4    416
     Jumbo 0.6.66 64-bit  2490  +17  -17  51.1%   -6.6   1172
     Jumbo 0.6.10 64-bit  2483  +24  -24  52.5%  -19.4    585
     Jumbo 0.6.35 64-bit  2478  +20  -20  49.2%   +4.9    942
     Jumbo 0.6.96 64-bit  2474  +27  -27  46.6%  +25.8    460
     Jumbo 0.6.31 64-bit  2469  +25  -25  47.3%  +18.9    566
     Jumbo 0.4.34 64-bit  2421  +25  -25  53.0%  -22.3    532
     Jumbo 0.5.3 64-bit   2369  +28  -28  45.4%  +33.7    457
     Jumbo 0.4.17 64-bit  2342  +28  -27  51.9%  -13.5    462
     Jumbo 0.4.0 64-bit   2276  +35  -35  49.7%   +2.4    302
gbanksnz at gmail.com
-
- Posts: 919
- Joined: Tue Nov 24, 2015 9:11 pm
- Location: upstate
Re: Jumbo 0.6.96 64-bit Gauntlet for CCRL 40/40
Tirsa Poppins
CCRL
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Jumbo 0.6.96 64-bit Gauntlet for CCRL 40/40
Thanks for the tests! When comparing current CCRL Jumbo ratings to my own test results for versions 0.6.* I see some significant differences for 0.6.66 and 0.6.96 which I do not understand. For older Jumbo versions, the CCRL ratings and my own test ratings match pretty well. Here are my relative ratings (generated with BayesElo from many games between many different Jumbo versions, TC 40/0:03):
Engine         Elo   +   -  Games  Score  Oppo.  Draws
Jumbo 0.6.96   166   8   7   6387    51%    156    33%
Jumbo 0.6.66   122  11  11   2592    50%    119    32%
Jumbo 0.6.51    83  10  10   3248    48%     95    34%
Jumbo 0.6.10    81   7   7   9004    52%     69    33%
Jumbo 0.6.31    80  13  13   2055    50%     78    30%
Jumbo 0.6.35    74  11  11   2800    51%     71    35%
Is it possible that the Jumbo gauntlets were run on very old hardware? That would be a plausible explanation, at least for not seeing the jump from 0.6.51 to 0.6.66 on CCRL, since one of the main changes from 0.6.51 to 0.6.66 was a compiler flag optimization that will only help with SSE 4.2 or newer.
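To make the mismatch concrete, here is a minimal Python sketch that checks whether the quoted error bars of two versions overlap in each list. The Elo, + and - values are copied from the two lists in this thread; the helper names (`interval`, `overlaps`, `compare`) are made up for illustration. Treating the bars as independent intervals is only a rough heuristic and not how BayesElo itself judges a difference.

```python
# Rough heuristic only: BayesElo error bars are not independent intervals,
# so an overlap check is a sanity test, not a significance test.

def interval(elo, plus, minus):
    """Return the (low, high) range implied by a rating and its error bars."""
    return (elo - minus, elo + plus)

def overlaps(a, b):
    """True if two (low, high) ranges share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# (Elo, +, -) copied from the lists quoted in this thread.
own = {"0.6.96": (166, 8, 7), "0.6.66": (122, 11, 11), "0.6.51": (83, 10, 10)}
ccrl = {"0.6.96": (2474, 27, 27), "0.6.66": (2490, 17, 17), "0.6.51": (2492, 28, 28)}

def compare(ratings, newer, older):
    """Return (Elo difference newer - older, whether the error bars overlap)."""
    diff = ratings[newer][0] - ratings[older][0]
    return diff, overlaps(interval(*ratings[newer]), interval(*ratings[older]))

for name, ratings in (("own", own), ("CCRL 40/40", ccrl)):
    for newer, older in (("0.6.96", "0.6.66"), ("0.6.66", "0.6.51")):
        diff, overlap = compare(ratings, newer, older)
        print(f"{name}: {newer} vs {older}: {diff:+d} Elo, bars overlap: {overlap}")
```

In the self-play list the bars of consecutive versions do not touch, while in the CCRL list they overlap, so the CCRL numbers alone cannot rank these versions against each other.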
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)
-
- Posts: 919
- Joined: Tue Nov 24, 2015 9:11 pm
- Location: upstate
Re: Jumbo 0.6.96 64-bit Gauntlet for CCRL 40/40
I don't think you can expect LTC results to mirror, or even closely follow, those obtained in hyper-bullet tests, Sven. Sometimes they do, other times they don't.
If you're looking for a confirmation of your results, our blitz list would be a better place to look. Currently v0.6.96 is +20 Elo over v0.6.66 there which, given that self-play performance is typically about twice the performance vs. other engines, looks about right, albeit with error margins as wide as the difference. The margins should contract appreciably after the next update, as the game counts will nearly double for each version; then we should see the difference more clearly.
All of my hardware supports POPCNT. AFAIK, one of Graham's boxes doesn't. Taken by itself this factor is largely irrelevant: the speedup offered by POPCNT (~5%) translates to about 5 Elo, a difference that will never show on the 40/40 list due to the number of games required to confirm it.
Most of the v0.6.66 games are mine; at the moment all of v0.6.96 games are Graham's but I'm running a fillup test to bring the number closer to that of v0.6.66 for a higher-LOS comparison. The results should be in by Saturday night, then we'll see. Frankly, I don't expect v0.6.96 to do better than get level with the previous version but seeing the difference on the blitz list sparked some hope.
IMO, bullet TCs are fine for no-regression testing but to find tangible gains at LTC you'd need to slow down your TC by a factor of 20 at least.
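For a feel of the numbers behind "the number of games required", here is a back-of-the-envelope Python sketch. It assumes the standard logistic Elo curve, a 33% draw ratio (borrowed from Sven's list, not a CCRL figure), and a simple 1.96-sigma criterion on the match score; the function names are invented for illustration, and BayesElo's own error model differs.

```python
import math

def expected_score(elo_diff):
    """Expected score under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def games_needed(elo_diff, draw_ratio=0.33, z=1.96):
    """Approximate games before the score edge exceeds z standard errors.

    Assumption: results are independent with a fixed draw ratio.
    """
    s = expected_score(elo_diff)
    win = s - draw_ratio / 2.0                 # win probability implied by s and draws
    var = win + 0.25 * draw_ratio - s * s      # per-game variance of the score (1, 0.5, 0)
    edge = s - 0.5                             # distance from an even score
    return math.ceil((z * math.sqrt(var) / edge) ** 2)

def margin_after(margin, games_factor):
    """Error margin after multiplying the game count, assuming 1/sqrt(n) scaling."""
    return margin / math.sqrt(games_factor)

print(games_needed(5))      # games to resolve a ~5 Elo edge
print(games_needed(20))     # games to resolve a ~20 Elo edge
print(round(margin_after(27, 2), 1))  # a +/-27 margin after doubling the games
```

Under these assumptions a 5 Elo edge needs on the order of ten thousand games, far more than any version has on the 40/40 list, while doubling the game count only shrinks a margin by a factor of about 0.71.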
Tirsa Poppins
CCRL
-
- Posts: 3196
- Joined: Fri May 26, 2006 3:00 am
- Location: WY, USA
- Full name: Michael Sherwin
Re: Jumbo 0.6.96 64-bit Gauntlet for CCRL 40/40
Or maybe on the gauntlet machine they don't serve peanuts.
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
-
- Posts: 41455
- Joined: Sun Feb 26, 2006 10:52 am
- Location: Auckland, NZ
Re: Jumbo 0.6.96 64-bit Gauntlet for CCRL 40/40
Michael Sherwin wrote: ↑Fri Apr 05, 2019 5:46 am Or maybe on the gauntlet machine they don't serve peanuts.
gbanksnz at gmail.com
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Jumbo 0.6.96 64-bit Gauntlet for CCRL 40/40
Of course I agree.
"If you're looking for a confirmation of your results our blitz list would be a better place to look. Currently v0.6.96 is +20 Elo over v0.6.66 there [...]"
You are right but even there the noticeable improvement of v0.6.66 over previous versions is invisible (see below).
"All of my hardware supports POPCNT. AFAIK, one of Graham's boxes doesn't. Taken by itself this factor is largely irrelevant: the speedup offered by POPCNT (~5%) translates to about 5 Elo, a difference that will never show on the 40/40 list due to the number of games required to confirm it."
My point was not about POPCNT. In 0.6.66 I had introduced a compile optimization that has no effect on pre-SSE4.2 hardware but should improve playing strength significantly with all time controls on modern hardware.
"IMO, bullet TCs are fine for no-regression testing but to find tangible gains at LTC you'd need to slow down your TC by a factor of 20 at least."
Certainly correct but there are some kinds of changes for which we know that they help for all time controls. Among these are, for instance:
- certain compile improvements
- fixing some severe bugs
- some kinds of evaluation improvements that do not slow down the engine (e.g. parameter tuning)
Anyway, I was just curious about the hardware that was used. You explained it, and it's ok for me, no problem at all - considering the very low strength level of Jumbo and also the current error bars the whole issue is certainly almost irrelevant. Thanks for your comments and your big efforts!
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)
-
- Posts: 3196
- Joined: Fri May 26, 2006 3:00 am
- Location: WY, USA
- Full name: Michael Sherwin
Re: Jumbo 0.6.96 64-bit Gauntlet for CCRL 40/40
I have started 4 matches with Jumbo and RomiX.
First match is Jumbo 66 bb versus RomiXNoRL at 2+6 using the nooman 3 move 500 pgn 1000 games.
Second match is Jumbo 66 bb versus RomiXNoRL at 40/10 using the Sherwin50.pgn 100 games
Third match is Jumbo 96 bb versus RomiXRL at 2+6 using Sherwin50.pgn 1000 games
Fourth match is Jumbo 96 bb versus RomiXRL at 40/10 using nooman 3 move 500 pgn 1000 games, random with no repeated positions
I hope I set everything up correctly because I got a little confused. Test 4 will take a while. I'm doing the learning test for myself, but the test with no learning is for Sven. So Sven, if you want me to add one more test I can, since this is running on a 6 core processor. Just let me know.
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Jumbo 0.6.96 64-bit Gauntlet for CCRL 40/40
Michael Sherwin wrote: ↑Fri Apr 05, 2019 9:47 am So Sven if you want me to add one more test I can since this is running on a 6 core processor. Just let me know.
Sounds interesting, thanks for your tests! If you actually have some spare resources left for another test then I would be interested in something that would allow comparing Jumbo 0.6.96 to 0.6.66 under identical conditions, e.g. "Jumbo 96 bb versus RomiXNoRL at 2+6 using the nooman 3 move 500 pgn 1000 games" (that is your test 1 but with 96).
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)