Laskos wrote:This morning I tested the latest network, ID69. Compared to the two-day-older network ID56, it performs significantly better in my opening positional suite.
[Search parameters: MaxDepth=99   MaxTime=20.0   DepthDelta=2   MinDepth=7   MinTime=0.1] 
Engine                         : Correct  TotalPos  Corr%  AveT(s)  MaxT(s)  TestFile 
      
Komodo 10.2 64-bit             :     145       200   72.5      2.0     20.0  openings200beta07.epd 
Houdini 5.01 Pro x64           :     144       200   72.0      2.4     20.0  openings200beta07.epd    
Stockfish 8 64 BMI2            :     141       200   70.5      2.0     20.0  openings200beta07.epd 
Houdini 5.01 Pro x64 Tactical  :     139       200   69.5      2.3     20.0  openings200beta07.epd      
Deep Shredder 13 x64           :     128       200   64.0      2.7     20.0  openings200beta07.epd    
Houdini 4 Pro x64              :     126       200   63.0      1.8     20.0  openings200beta07.epd    
Andscacs 0.88n                 :     123       200   61.5      2.4     20.0  openings200beta07.epd 
Houdini 4 Pro x64 Tactical     :     120       200   60.0      1.6     20.0  openings200beta07.epd 
Nirvanachess 2.3               :     119       200   59.5      1.8     20.0  openings200beta07.epd 
Fire 5 x64                     :     110       200   55.0      3.0     20.0  openings200beta07.epd    
Texel 1.06 64-bit              :     110       200   55.0      1.6     20.0  openings200beta07.epd    
Fritz 15       (3227)          :     102       200   51.0      1.9     20.0  openings200beta07.epd  
LCZero  *************  ID69    :      98       200   49.0      2.7     20.0  openings200beta07.epd 
  
Fruit 2.1      (2685)          :      91       200   45.5      1.5     20.0  openings200beta07.epd  
Sjaak II 1.3.1 (2194)          :      75       200   37.5      4.0     20.0  openings200beta07.epd    
BikJump v2.01  (2098)          :      74       200   37.0      1.6     20.0  openings200beta07.epd
Maximum time was 20s/position.
LC0 already seems close to very strong engines on this opening suite. At this pace of advancement in positional understanding, I am very curious to see how it develops.
 
I have found something interesting about the STS suite (1500 positions) and confirmed the scaling behavior. The STS suite has always seemed to me over-analyzed by the Rybka (1.0b?) engine and maybe some other engines. In my opening positional test suite (openings200beta07.epd, 200 positions), I used engines only to eliminate tactical positions and to check that engines vary in move selection. The positions and solutions themselves were selected mostly according to huge databases of mostly human games (often restricted to FIDE Elo above 2200 or so) and their outcomes.
On my opening suite at 20s/position, LC0 (ID69) came in significantly above Fruit 2.1 and close to Fritz 15:
[Search parameters: MaxDepth=99   MaxTime=20.0   DepthDelta=2   MinDepth=7   MinTime=0.1] 
Engine                         : Correct  TotalPos  Corr%  AveT(s)  MaxT(s)  TestFile 
      
Komodo 10.2 64-bit             :     145       200   72.5      2.0     20.0  openings200beta07.epd 
Houdini 5.01 Pro x64           :     144       200   72.0      2.4     20.0  openings200beta07.epd    
Stockfish 8 64 BMI2            :     141       200   70.5      2.0     20.0  openings200beta07.epd 
Houdini 5.01 Pro x64 Tactical  :     139       200   69.5      2.3     20.0  openings200beta07.epd      
Deep Shredder 13 x64           :     128       200   64.0      2.7     20.0  openings200beta07.epd    
Houdini 4 Pro x64              :     126       200   63.0      1.8     20.0  openings200beta07.epd    
Andscacs 0.88n                 :     123       200   61.5      2.4     20.0  openings200beta07.epd 
Houdini 4 Pro x64 Tactical     :     120       200   60.0      1.6     20.0  openings200beta07.epd 
Nirvanachess 2.3               :     119       200   59.5      1.8     20.0  openings200beta07.epd 
Fire 5 x64                     :     110       200   55.0      3.0     20.0  openings200beta07.epd    
Texel 1.06 64-bit              :     110       200   55.0      1.6     20.0  openings200beta07.epd    
Fritz 15       (3227 CCRL)     :     102       200   51.0      1.9     20.0  openings200beta07.epd  
LCZero  *************  ID69    :      98       200   49.0      2.7     20.0  openings200beta07.epd 
  
Fruit 2.1      (2685 CCRL)     :      91       200   45.5      1.5     20.0  openings200beta07.epd  
Sjaak II 1.3.1 (2194 CCRL)     :      75       200   37.5      4.0     20.0  openings200beta07.epd    
BikJump v2.01  (2098 CCRL)     :      74       200   37.0      1.6     20.0  openings200beta07.epd
I filtered the 1500 STS positions for those containing 28-32 men, i.e. opening and early middlegame positions. There are 209 of them. On this suite of 209 positions I expected results similar to those on my suite, but that is not the case. LC0 comes in at the level of BikJump v2.01 (about 2100 CCRL Elo), significantly below Fruit 2.1 (about 2700 CCRL Elo):
STS (209 positions)
5s/position
[Search parameters: MaxDepth=99   MaxTime=5.0   DepthDelta=2   MinDepth=7   MinTime=0.1] 
Fruit 2.1    (2685) :    score=145/209 [averages on correct positions: depth=5.2 time=0.33 nodes=704167]
BikJump 2.01 (2098) :    score=113/209 [averages on correct positions: depth=4.9 time=0.71 nodes=1710508]
LC0 ID 69 *******   :    score=107/209 [averages on correct positions: depth=14.7 time=0.70 nodes=867]
20s/position
[Search parameters: MaxDepth=99   MaxTime=20.0   DepthDelta=2   MinDepth=7   MinTime=0.1] 
Fruit 2.1    (2685) :    score=164/209 [averages on correct positions: depth=6.0 time=1.52 nodes=3162284]
LC0 ID 69 *******   :    score=128/209 [averages on correct positions: depth=15.0 time=1.82 nodes=2253]
BikJump 2.01 (2098) :    score=126/209 [averages on correct positions: depth=5.7 time=2.53 nodes=6359373]
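The 28-32 men filter used to build this 209-position subset can be sketched roughly as below. This is only an illustration, not the script actually used: the function names `count_men` and `filter_by_men` are hypothetical, the two sample records are made up (they are not STS positions), and "men" is assumed to mean every piece and pawn on the board, kings included.

```python
def count_men(epd_line: str) -> int:
    """Count all pieces and pawns (kings included) in an EPD/FEN line."""
    # The first whitespace-separated field is the piece placement;
    # letters are men, digits are runs of empty squares, '/' separates ranks.
    placement = epd_line.split()[0]
    return sum(ch.isalpha() for ch in placement)


def filter_by_men(epd_lines, lo=28, hi=32):
    """Keep only non-empty records whose man count lies in [lo, hi]."""
    return [ln for ln in epd_lines if ln.strip() and lo <= count_men(ln) <= hi]


# Two made-up EPD records for illustration only (not taken from STS).
sample = [
    'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "start";',
    '8/8/4k3/8/8/4K3/4P3/8 w - - id "pawn ending";',
]

kept = filter_by_men(sample)
print(f"kept {len(kept)} of {len(sample)} positions")
```

To apply it to a real suite, one would read the EPD file line by line and pass the lines to `filter_by_men`.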
A remark about the scaling: from 5s to 20s, Fruit 2.1 improves by 19 points, LC0 by 21 points and BikJump by 13 points. Since on this suite LC0 should be compared to BikJump (similar performance), the scaling of LC0 is significantly better than that of BikJump.
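For anyone who wants to re-check the scaling arithmetic, here is a small sketch. The scores are copied from the two STS result blocks above; the variable names are my own, and expressing the gain as a share of the 209 positions is just one possible way to look at it.

```python
TOTAL = 209  # positions in the filtered STS subset

# (score at 5s/position, score at 20s/position), copied from the results above
scores = {
    "Fruit 2.1": (145, 164),
    "LC0 ID 69": (107, 128),
    "BikJump 2.01": (113, 126),
}

# Gain in solved positions when going from 5s to 20s per position
gains = {name: s20 - s5 for name, (s5, s20) in scores.items()}

for name, (s5, s20) in scores.items():
    g = gains[name]
    share = 100 * g / TOTAL  # gain as a percentage of the whole suite
    print(f"{name:13s} 5s={s5:3d}  20s={s20:3d}  gain={g:+d}  ({share:.1f}% of suite)")
```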
It is possible that many STS solutions are derived from engine analysis within the same paradigm of PST + material eval and alpha-beta search, so that standard engines converge artificially on the solutions. That would be one explanation of why LC0's performance on my suite (much less engine-analyzed) is so different from its performance on STS.