Testrun of Allie 0.5 Leelenstein 11.1 Net finished
https://www.sp-cc.de
(You may have to clear your browser cache or reload the website.)
SPCC: Testrun of Allie 0.5 LS11.1 finished
-
- Posts: 2434
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: SPCC: Testrun of Allie 0.5 LS11.1 finished
My personal testing of Fat Fritz says it does gain at longer time controls against SF compared to shorter time controls. If you do test it, try both if you can. I would be quite curious to see this.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: SPCC: Testrun of Allie 0.5 LS11.1 finished
Albert Silver wrote: ↑Mon Oct 28, 2019 1:58 am
My personal testing of Fat Fritz says it does gain at longer time controls against SF compared to shorter time controls. If you do test it, try both if you can. I would be quite curious to see this.
BTW, I saw that you run your tests on a laptop. Even the GPU games. You know that after some time it will throttle the speed from overheating, so that any speed calibrations you might do at the start will not be valid after 60 minutes of constant play. This could easily affect your ratio. I would suggest running 30-40 games and then collecting the average nps per engine per game in the last 10 games as a reference for the ratio. Just a suggestion.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
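The calibration suggested above (average nps per engine over the last 10 games of a 30-40 game run) could be sketched as follows. This is only a sketch: the function names and the idea of holding one nps value per game in a list are assumptions for illustration, and the result is the raw speed ratio, not the Leela-Ratio as defined on the SPCC website.

```python
from statistics import mean


def average_tail_nps(nps_per_game, tail=10):
    """Mean nodes per second over the last `tail` games of a run.

    `nps_per_game` is assumed to hold one average-nps value per game,
    in the order the games were played.
    """
    if len(nps_per_game) < tail:
        raise ValueError("not enough games to average over")
    return mean(nps_per_game[-tail:])


def speed_ratio(engine_a_nps, engine_b_nps, tail=10):
    """Raw nps ratio of engine A to engine B, using only the last games.

    Using the tail of the run means any early, pre-throttling speeds
    do not enter the reference value.
    """
    return average_tail_nps(engine_a_nps, tail) / average_tail_nps(engine_b_nps, tail)
```

If the tail average differs clearly from the average over the first few games, that would point to a slowdown during the run; if both agree, the initial calibration still holds.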
-
- Posts: 1080
- Joined: Fri Sep 16, 2016 6:55 pm
- Location: USA/Minnesota
- Full name: Leo Anger
Re: SPCC: Testrun of Allie 0.5 LS11.1 finished
I wonder what the difference between Allie and Allie Stein is. I still can't believe that Stockfish beat a NN engine in the TCEC Championship. (I am glad it did.) Perhaps Lc0 was trained against SF and Allie Stein wasn't.
Advanced Micro Devices fan.
-
- Posts: 3546
- Joined: Thu Jun 07, 2012 11:02 pm
Re: SPCC: Testrun of Allie 0.5 LS11.1 finished
Albert - he says on his site that he uses MSI AfterBurner to reduce the GPU speed as much as possible. So that will reduce the heat generated. It is also possible to cool laptops with special laptop cooling pads as you will know. I'm sure he knows what he is doing.
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: SPCC: Testrun of Allie 0.5 LS11.1 finished
Lc0 was trained against itself, but was tuned against SF, meaning its settings were optimized for best performance against SF. Still, remember that Leela was in that same championship, but failed to reach the final.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
-
- Posts: 2434
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: SPCC: Testrun of Allie 0.5 LS11.1 finished
Albert Silver wrote: ↑Mon Oct 28, 2019 5:21 pm
BTW, I saw that you run your tests on a laptop. Even the GPU games. You know that after some time it will throttle the speed from overheating, so that any speed calibrations you might do at the start will not be valid after 60 minutes of constant play. This could easily affect your ratio.
Total nonsense.
The CPU runs without Intel TurboBoost. And the RTX GPU is slowed down by 30% with MSI Afterburner, because otherwise it is too fast for the hexacore mobile CPU (for a valid Leela-Ratio of 1.3). Both components (CPU and GPU) never get hotter than 65 degrees Celsius. And MSI Afterburner displays the GPU speed on screen. There was never any slowdown. Not even in summer. Of course not, because 65 degrees is no problem for any mobile hardware.
And I regularly clean all notebook fans with a vacuum cleaner.
-
- Posts: 2434
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: SPCC: Testrun of Allie 0.5 LS11.1 finished
Albert Silver wrote: ↑Mon Oct 28, 2019 1:58 am
My personal testing of Fat Fritz says it does gain at longer time controls against SF compared to shorter time controls. If you do test it, try both if you can. I would be quite curious to see this.
I will definitely not test FatFritz under other conditions. Why should I?
My tests show that lc0 does not benefit from more thinking time, so why should FatFritz, which is nearly the same as lc0?
From my website:
I always believed that it is better to play more games with faster time controls than fewer games with longer time controls, because a higher number of games makes the error bar smaller and the results more valid. But Lc0 runs much slower than the classical AB engines (like Stockfish). Will that lead to distorted results when lc0 has to play with a very short thinking time? To answer that question, we can compare the two testruns of lc0 with Net T40.T8.610:
I did one testrun with 50''+500ms (my new test setting, short) and one with 150''+1500ms (my old test setting, long), and did an ORDO calculation of both in one rating list. Here is the result (you can see that Lc0 does not benefit from the 3x longer thinking time):
#  Program                    Elo   +   -  Games  Score Av.Op.  Draws
1  BrainFish-2 190531 bmi2 : 3577   9   9   5000 79.7 %   3328 38.1 %
2  Stockfish 190622 bmi2   : 3531   7   7   5500 73.6 %   3343 43.1 %
3  Stockfish 190504 bmi2   : 3527   8   8   5100 74.7 %   3329 44.1 %
4  Stockfish 10 181129     : 3508   6   6  12000 77.2 %   3284 39.5 %
5  Lc0 0.21.2 T40.T8.610   : 3490   9   9   3000 66.1 %   3365 46.0 % (short)
6  Lc0 0.21.1 T40.T8.610   : 3486  20  20    700 68.6 %   3337 49.9 % (long)
7  Stockfish 9 180201      : 3461   8   8   5000 74.9 %   3258 41.7 %
8  Houdini 6 pext          : 3431   4   4  18600 61.8 %   3337 49.5 %
9  Komodo 13.01 bmi2       : 3402   6   6   9500 52.3 %   3386 52.3 %
10 Komodo 12.3 bmi2        : 3395   6   6   9100 59.4 %   3323 50.3 %
11 Komodo 13.01 MCTS       : 3298   7   7   6000 42.2 %   3358 54.4 %
12 Fire 7.1 popc           : 3281   4   4  18600 42.3 %   3345 50.3 %
13 Xiphos 0.5.3 bmi2       : 3272   5   5  12600 34.1 %   3399 48.3 %
14 Ethereal 11.53 pext     : 3270   7   7   5500 34.6 %   3389 50.0 %
15 Komodo 12.3 MCTS        : 3261   6   6   8000 43.7 %   3312 47.1 %
16 Ethereal 11.25 pext     : 3253   5   5  12100 33.0 %   3391 46.2 %
17 Laser 1.7 bmi2          : 3202   7   7   6100 30.5 %   3357 45.5 %
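The two Lc0 rows can be checked against the "more games, smaller error bar" argument with a few lines of Python. This is a rough sketch under a simplifying assumption, namely that the error bar shrinks like 1/sqrt(games); it is not how ORDO computes its error margins, and the overlap check is only a coarse sanity test, not a formal significance test. The function names are made up for illustration.

```python
import math


def scaled_error(error, games, new_games):
    """Rescale an error bar, assuming it shrinks like 1/sqrt(games)."""
    return error * math.sqrt(games / new_games)


def intervals_overlap(elo_a, err_a, elo_b, err_b):
    """True if the intervals elo_a +/- err_a and elo_b +/- err_b overlap."""
    return abs(elo_a - elo_b) <= err_a + err_b


# Values taken from the rating list above.
short_run = {"elo": 3490, "err": 9, "games": 3000}
long_run = {"elo": 3486, "err": 20, "games": 700}

# Error bar expected for 700 games, extrapolated from the 3000-game run.
predicted_long_err = scaled_error(short_run["err"], short_run["games"], long_run["games"])
print(round(predicted_long_err, 1))

# Do the two Lc0 ratings agree within their error bars?
print(intervals_overlap(short_run["elo"], short_run["err"],
                        long_run["elo"], long_run["err"]))
```

Under that assumption the 700-game run should have an error bar a little under 19 Elo, close to the 20 in the list, and the 4 Elo gap between the short and long run lies well inside the combined error bars.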
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: SPCC: Testrun of Allie 0.5 LS11.1 finished
pohl4711 wrote: ↑Tue Oct 29, 2019 7:46 am
Albert Silver wrote: ↑Mon Oct 28, 2019 1:58 am
My personal testing of Fat Fritz says it does gain at longer time controls against SF compared to shorter time controls. If you do test it, try both if you can. I would be quite curious to see this.
I will definitely not test FatFritz under other conditions. Why should I?
My tests show that lc0 does not benefit from more thinking time, so why should FatFritz, which is nearly the same as lc0?
Obviously you have information I do not, so who am I to argue?
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
-
- Posts: 2434
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: SPCC: Testrun of Allie 0.5 LS11.1 finished
Albert Silver wrote: ↑Tue Oct 29, 2019 10:13 pm
pohl4711 wrote: ↑Tue Oct 29, 2019 7:46 am
Albert Silver wrote: ↑Mon Oct 28, 2019 1:58 am
My personal testing of Fat Fritz says it does gain at longer time controls against SF compared to shorter time controls. If you do test it, try both if you can. I would be quite curious to see this.
I will definitely not test FatFritz under other conditions. Why should I?
My tests show that lc0 does not benefit from more thinking time, so why should FatFritz, which is nearly the same as lc0?
Obviously you have information I do not, so who am I to argue?
I believe that FatFritz is not as strong as the latest Stockfish-Dev (with a valid Leela-Ratio) in a head-to-head battle, because no lc0 with any net is stronger than Stockfish today.
And because of this, I think it is the same old story of the last 35 years: if Engine A is compared only with a stronger Engine B, the result of a head-to-head battle gets closer to 50%-50% with more thinking time, because the number of draws rises. So the Elo difference between A and B shrinks, and it seems that the weaker engine gets stronger with more thinking time. But this is just an illusion, which fades away in a rating-list testrun.
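The draw effect described above can be illustrated with the standard logistic Elo formula. This is a minimal sketch: the game counts are invented for illustration (they are not test results), and the formula is the usual score-to-Elo conversion, not necessarily what ORDO does internally.

```python
import math


def match_score(wins, draws, losses):
    """Score fraction of the stronger engine: a win counts 1, a draw 0.5."""
    total = wins + draws + losses
    return (wins + 0.5 * draws) / total


def elo_diff(score):
    """Elo difference implied by a score fraction (logistic model)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Hypothetical 100-game matches. The stronger engine wins twice as often
# as it loses in both, but the longer time control produces more draws.
short_tc = match_score(wins=40, draws=40, losses=20)  # 60 % score
long_tc = match_score(wins=20, draws=70, losses=10)   # 55 % score

print(round(elo_diff(short_tc)))
print(round(elo_diff(long_tc)))
```

With these made-up numbers the measured gap drops from about 70 to about 35 Elo, although the win-to-loss ratio of the stronger engine is 2:1 in both matches. Only the draw rate changed.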