SPCC: Testrun of Allie 0.5 LS11.1 finished

pohl4711
Posts: 2432
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

SPCC: Testrun of Allie 0.5 LS11.1 finished

Post by pohl4711 »

Testrun of Allie 0.5 Leelenstein 11.1 Net finished

https://www.sp-cc.de

(You may have to clear your browser cache or reload the website.)
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: SPCC: Testrun of Allie 0.5 LS11.1 finished

Post by Albert Silver »

My personal testing of Fat Fritz says it does gain at longer time controls against SF compared to shorter time controls. If you do test it, try both if you can. I would be quite curious to see this.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: SPCC: Testrun of Allie 0.5 LS11.1 finished

Post by Albert Silver »

Albert Silver wrote: Mon Oct 28, 2019 1:58 am My personal testing of Fat Fritz says it does gain at longer time controls against SF compared to shorter time controls. If you do test it, try both if you can. I would be quite curious to see this.
BTW, I saw that you run your tests on a laptop, even the GPU games. You know that after some time it will throttle the speed because of overheating, so any speed calibration you do at the start will no longer be valid after 60 minutes of constant play. This could easily affect your ratio. I would suggest running 30-40 games and then taking the average nps per engine per game over the last 10 games as the reference for the ratio. Just a suggestion.
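The suggestion above can be sketched in a few lines of Python. This is only an illustration: the function names and the per-game nps lists are hypothetical, and the plain quotient computed here is not necessarily the "Leela Ratio" formula used on the SPCC site, which the thread does not spell out.

```python
from statistics import mean

def late_game_nps(nps_per_game, tail=10):
    """Average nodes per second over the last `tail` games of a run.

    `nps_per_game` holds one average-nps value per game, in playing order.
    Using only the tail is meant to capture the speed after the hardware
    has warmed up and any thermal throttling has set in.
    """
    if len(nps_per_game) < tail:
        raise ValueError("not enough games to take the requested tail")
    return mean(nps_per_game[-tail:])

def speed_ratio(nps_engine_a, nps_engine_b, tail=10):
    """Plain quotient of the two late-game averages (an assumption, see above)."""
    return late_game_nps(nps_engine_a, tail) / late_game_nps(nps_engine_b, tail)

if __name__ == "__main__":
    # Invented numbers for a 35-game run in which both engines slow down
    # over the last 10 games. They are NOT measurements from the thread.
    gpu_engine = [30_000] * 25 + [27_000] * 10
    cpu_engine = [20_000_000] * 25 + [19_000_000] * 10
    print("ratio from first 10 games:", speed_ratio(gpu_engine[:10], cpu_engine[:10]))
    print("ratio from last 10 games: ", speed_ratio(gpu_engine, cpu_engine))
```

If the two printed ratios differ noticeably, the calibration made at the start of the run no longer describes the conditions under which most games were played.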
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
Leo
Posts: 1078
Joined: Fri Sep 16, 2016 6:55 pm
Location: USA/Minnesota
Full name: Leo Anger

Re: SPCC: Testrun of Allie 0.5 LS11.1 finished

Post by Leo »

I wonder what the difference between Allie and Allie Stein is. I still can't believe that Stockfish beat a NN engine in the TCEC Championship. (I am glad it did.) Perhaps Lc0 was trained against SF and Allie Stein wasn't.
Advanced Micro Devices fan.
Modern Times
Posts: 3546
Joined: Thu Jun 07, 2012 11:02 pm

Re: SPCC: Testrun of Allie 0.5 LS11.1 finished

Post by Modern Times »

Albert - he says on his site that he uses MSI Afterburner to reduce the GPU speed as much as possible, which will reduce the heat generated. It is also possible to cool laptops with special cooling pads, as you will know. I'm sure he knows what he is doing.
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: SPCC: Testrun of Allie 0.5 LS11.1 finished

Post by Albert Silver »

Leo wrote: Mon Oct 28, 2019 6:03 pm I wonder what the difference between Allie and Allie Stein is. I still can't believe that Stockfish beat a NN engine in the TCEC Championship. (I am glad it did.) Perhaps Lc0 was trained against SF and Allie Stein wasn't.
Lc0 was trained against itself, but was tuned against SF, meaning its settings were optimized for best performance against SF. Still, remember that Leela was in that same championship, but failed to reach the final.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
pohl4711
Posts: 2432
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: SPCC: Testrun of Allie 0.5 LS11.1 finished

Post by pohl4711 »

Albert Silver wrote: Mon Oct 28, 2019 5:21 pm
BTW, I saw that you run your tests on a laptop, even the GPU games. You know that after some time it will throttle the speed because of overheating, so any speed calibration you do at the start will no longer be valid after 60 minutes of constant play. This could easily affect your ratio.
Total nonsense.
The CPU runs without Intel Turbo Boost, and the RTX GPU is slowed down by 30% with MSI Afterburner, because otherwise it is too fast for the hexa-core mobile CPU (for a valid Leela Ratio of 1.3). Both components (CPU and GPU) never get hotter than 65 degrees Celsius. MSI Afterburner also displays the GPU speed on screen, and there was never any slowdown, not even in summer. Of course not, because 65 degrees is no problem for any mobile hardware.
And I regularly clean all the notebook fans with a vacuum cleaner.
pohl4711
Posts: 2432
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: SPCC: Testrun of Allie 0.5 LS11.1 finished

Post by pohl4711 »

Albert Silver wrote: Mon Oct 28, 2019 1:58 am My personal testing of Fat Fritz says it does gain at longer time controls against SF compared to shorter time controls. If you do test it, try both if you can. I would be quite curious to see this.
I will definitely not test FatFritz under other conditions. Why should I?
My tests show that Lc0 does not benefit from more thinking time, so why should FatFritz, which is nearly the same as Lc0?
From my website:

I always believed that it is better to play more games at faster time controls than fewer games at longer time controls, because a higher number of games makes the error bar smaller and the results more valid. But Lc0 runs much slower than the classical AB engines (like Stockfish). Will that lead to distorted results when Lc0 has to play with a very short thinking time? To answer that question, we can compare the two test runs of Lc0 with net T40.T8.610:
I did one test run with 50''+500ms (my new test setting, short) and one with 150''+1500ms (my old test setting, long), then did an ORDO calculation of both in one rating list. Here is the result (you can see that Lc0 does not benefit from the 3x longer thinking time):

   Program                      Elo    +    -   Games   Score   Av.Op.  Draws

   1 BrainFish-2 190531 bmi2    : 3577    9    9  5000    79.7 %   3328   38.1 %
   2 Stockfish 190622 bmi2      : 3531    7    7  5500    73.6 %   3343   43.1 %
   3 Stockfish 190504 bmi2      : 3527    8    8  5100    74.7 %   3329   44.1 %
   4 Stockfish 10 181129        : 3508    6    6 12000    77.2 %   3284   39.5 %
   5 Lc0 0.21.2 T40.T8.610      : 3490    9    9  3000    66.1 %   3365   46.0 % (short)
   6 Lc0 0.21.1 T40.T8.610      : 3486   20   20   700    68.6 %   3337   49.9 % (long)
   7 Stockfish 9 180201         : 3461    8    8  5000    74.9 %   3258   41.7 %
   8 Houdini 6 pext             : 3431    4    4 18600    61.8 %   3337   49.5 %
   9 Komodo 13.01 bmi2          : 3402    6    6  9500    52.3 %   3386   52.3 %
  10 Komodo 12.3 bmi2           : 3395    6    6  9100    59.4 %   3323   50.3 %
  11 Komodo 13.01 MCTS          : 3298    7    7  6000    42.2 %   3358   54.4 %
  12 Fire 7.1 popc              : 3281    4    4 18600    42.3 %   3345   50.3 %
  13 Xiphos 0.5.3 bmi2          : 3272    5    5 12600    34.1 %   3399   48.3 %
  14 Ethereal 11.53 pext        : 3270    7    7  5500    34.6 %   3389   50.0 %
  15 Komodo 12.3 MCTS           : 3261    6    6  8000    43.7 %   3312   47.1 %
  16 Ethereal 11.25 pext        : 3253    5    5 12100    33.0 %   3391   46.2 %
  17 Laser 1.7 bmi2             : 3202    7    7  6100    30.5 %   3357   45.5 %
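As a rough check on the short versus long comparison in the list above, the gap between the two Lc0 entries can be set against their error bars. The sketch below takes the ratings and error margins from rows 5 and 6 and combines the margins in quadrature. That step assumes the two error bars can be treated as independent, which is a simplification for two entries rated in the same list.

```python
import math

def elo_gap(elo_a, err_a, elo_b, err_b):
    """Return (difference, combined error margin) for two rated entries.

    The margins are combined in quadrature, which assumes the two
    error bars are independent (a simplifying assumption).
    """
    return elo_a - elo_b, math.hypot(err_a, err_b)

# Row 5: Lc0 0.21.2 T40.T8.610 (short), 3490 +/- 9,  3000 games
# Row 6: Lc0 0.21.1 T40.T8.610 (long),  3486 +/- 20,  700 games
diff, margin = elo_gap(3490, 9, 3486, 20)
print(f"difference: {diff:+d} Elo, combined error: +/-{margin:.1f}")
# prints: difference: +4 Elo, combined error: +/-21.9
```

The 4 Elo gap is well inside the combined margin, which matches the reading that the longer time control shows no measurable gain here. Most of the margin comes from the long run, which has only 700 games.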
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: SPCC: Testrun of Allie 0.5 LS11.1 finished

Post by Albert Silver »

pohl4711 wrote: Tue Oct 29, 2019 7:46 am
Albert Silver wrote: Mon Oct 28, 2019 1:58 am My personal testing of Fat Fritz says it does gain at longer time controls against SF compared to shorter time controls. If you do test it, try both if you can. I would be quite curious to see this.
I will definitely not test FatFritz under other conditions. Why should I?
My tests show that Lc0 does not benefit from more thinking time, so why should FatFritz, which is nearly the same as Lc0?
Obviously you have information I do not, so who am I to argue?
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
pohl4711
Posts: 2432
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: SPCC: Testrun of Allie 0.5 LS11.1 finished

Post by pohl4711 »

Albert Silver wrote: Tue Oct 29, 2019 10:13 pm
pohl4711 wrote: Tue Oct 29, 2019 7:46 am
Albert Silver wrote: Mon Oct 28, 2019 1:58 am My personal testing of Fat Fritz says it does gain at longer time controls against SF compared to shorter time controls. If you do test it, try both if you can. I would be quite curious to see this.
I will definitely not test FatFritz under other conditions. Why should I?
My tests show that Lc0 does not benefit from more thinking time, so why should FatFritz, which is nearly the same as Lc0?
Obviously you have information I do not, so who am I to argue?
I believe that FatFritz is not as strong as the latest Stockfish dev version (with a valid Leela Ratio) in a head-to-head match, because no Lc0 with any net is stronger than Stockfish today.
And because of this, I think it is the same old story of the last 35 years: if engine A is compared only with a stronger engine B, the result of a head-to-head match gets closer to 50%-50% with more thinking time, because the number of draws rises. So the Elo difference between A and B shrinks, and it seems that the weaker engine gets stronger with more thinking time. But this is just an illusion, which fades away in a rating-list test run.
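The draw-rate effect described above can be illustrated with the standard logistic Elo formula, where a score fraction s corresponds to a rating gap of 400 * log10(s / (1 - s)). The sketch below is a toy model, not data from any test run: it assumes the stronger engine always wins a fixed share of the decisive games (the hypothetical `win_share` parameter) and only the draw rate changes.

```python
import math

def elo_from_score(score):
    """Rating gap implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

def score_with_draws(draw_rate, win_share=0.75):
    """Score of the stronger engine for a given draw rate.

    `win_share` is the fraction of decisive games the stronger engine
    wins. Holding it fixed is an illustrative assumption.
    """
    decisive = 1.0 - draw_rate
    return win_share * decisive + 0.5 * draw_rate

if __name__ == "__main__":
    for draw_rate in (0.2, 0.4, 0.6, 0.8):
        s = score_with_draws(draw_rate)
        print(f"draws {draw_rate:.0%}: score {s:.1%}, Elo gap {elo_from_score(s):.0f}")
```

In this model the stronger engine stays just as dominant in the decisive games, yet the measured head-to-head gap shrinks steadily as the draw rate rises.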