Laskos wrote: I tested this morning the latest network, ID69; compared to the two-day-older network ID56, the latest performs significantly better in my opening positional suite.
Code:
[Search parameters: MaxDepth=99 MaxTime=20.0 DepthDelta=2 MinDepth=7 MinTime=0.1]
Engine : Correct TotalPos Corr% AveT(s) MaxT(s) TestFile
Komodo 10.2 64-bit : 145 200 72.5 2.0 20.0 openings200beta07.epd
Houdini 5.01 Pro x64 : 144 200 72.0 2.4 20.0 openings200beta07.epd
Stockfish 8 64 BMI2 : 141 200 70.5 2.0 20.0 openings200beta07.epd
Houdini 5.01 Pro x64 Tactical : 139 200 69.5 2.3 20.0 openings200beta07.epd
Deep Shredder 13 x64 : 128 200 64.0 2.7 20.0 openings200beta07.epd
Houdini 4 Pro x64 : 126 200 63.0 1.8 20.0 openings200beta07.epd
Andscacs 0.88n : 123 200 61.5 2.4 20.0 openings200beta07.epd
Houdini 4 Pro x64 Tactical : 120 200 60.0 1.6 20.0 openings200beta07.epd
Nirvanachess 2.3 : 119 200 59.5 1.8 20.0 openings200beta07.epd
Fire 5 x64 : 110 200 55.0 3.0 20.0 openings200beta07.epd
Texel 1.06 64-bit : 110 200 55.0 1.6 20.0 openings200beta07.epd
Fritz 15 (3227 CCRL) : 102 200 51.0 1.9 20.0 openings200beta07.epd
LCZero ************* ID69 : 98 200 49.0 2.7 20.0 openings200beta07.epd
Fruit 2.1 (2685 CCRL) : 91 200 45.5 1.5 20.0 openings200beta07.epd
Sjaak II 1.3.1 (2194 CCRL) : 75 200 37.5 4.0 20.0 openings200beta07.epd
BikJump v2.01 (2098 CCRL) : 74 200 37.0 1.6 20.0 openings200beta07.epd
Maximum time was 20s/position.
LC0 already seems close to very strong engines in this opening suite. At this pace of improvement in positional understanding, I will be very curious to see how it develops.
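For anyone re-deriving the Corr% column in the table above, it is just 100 * Correct / TotalPos; a quick sketch (engine names here are only labels, not part of any test harness):

```python
def corr_pct(correct: int, total: int) -> float:
    """Corr% column: percentage of test positions solved."""
    return 100.0 * correct / total

# A few rows from the table above (Correct out of 200 positions):
for engine, correct in [("Komodo 10.2", 145), ("LCZero ID69", 98), ("BikJump v2.01", 74)]:
    print(f"{engine}: {corr_pct(correct, 200):.1f}%")
# Komodo 10.2: 72.5%
# LCZero ID69: 49.0%
# BikJump v2.01: 37.0%
```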
I have found something interesting about the STS suite (1500 positions), and confirmed the scaling behavior. The STS suite has always seemed to me over-analyzed by the Rybka (1.0b?) engine and maybe some other engines. In my opening positional test suite (openings200beta07.epd, 200 positions), I used engines only to eliminate tactical positions and to check that engines vary in move selection; the positions and solutions were selected mostly according to huge, mostly human, game databases (often restricted to FIDE Elo above 2200 or so) and game outcomes.
In my opening suite, LC0 (ID69) at 20s/position came in significantly above Fruit 2.1 and close to Fritz 15 (see the table above).
I filtered the 1500 STS positions for those containing 28-32 men, i.e. opening and early middlegame positions. There are 209 of them. On this suite of 209 positions I expected results similar to those with my suite, but it is not so. LC0 comes in at the level of BikJump v2.01, about 2100 CCRL Elo, significantly below Fruit 2.1 (about 2700 CCRL Elo):
STS (209 positions)
5s/position
Code:
[Search parameters: MaxDepth=99 MaxTime=5.0 DepthDelta=2 MinDepth=7 MinTime=0.1]
Fruit 2.1 (2685) : score=145/209 [averages on correct positions: depth=5.2 time=0.33 nodes=704167]
BikJump 2.01 (2098) : score=113/209 [averages on correct positions: depth=4.9 time=0.71 nodes=1710508]
LC0 ID 69 ******* : score=107/209 [averages on correct positions: depth=14.7 time=0.70 nodes=867]
20s/position
Code:
[Search parameters: MaxDepth=99 MaxTime=20.0 DepthDelta=2 MinDepth=7 MinTime=0.1]
Fruit 2.1 (2685) : score=164/209 [averages on correct positions: depth=6.0 time=1.52 nodes=3162284]
LC0 ID 69 ******* : score=128/209 [averages on correct positions: depth=15.0 time=1.82 nodes=2253]
BikJump 2.01 (2098) : score=126/209 [averages on correct positions: depth=5.7 time=2.53 nodes=6359373]
A remark about the scaling: from 5s to 20s, Fruit 2.1 improves by 19 points, LC0 by 21 points, and BikJump by 13 points. Since on this suite LC0 should be compared to BikJump (similar performance), the scaling of LC0 is significantly better than that of BikJump.
It is possible that many STS solutions are derived from engine analysis within the same paradigm of PST + material eval and alpha-beta search. The standard engines might then converge artificially on the solutions. That would be one explanation of why the results on my suite (much less engine-analyzed) look so different from LC0's point of view.
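For reference, the 28-32-men filter described above can be sketched in a few lines, assuming each EPD record starts with the board as its first whitespace-separated field (standard EPD; every letter in the board field is one man):

```python
def men_count(epd_line: str) -> int:
    """Count the men in an EPD/FEN record: every letter in the board field is a piece or pawn."""
    board = epd_line.split()[0]  # the board is the first EPD field
    return sum(c.isalpha() for c in board)

def filter_opening_positions(epd_lines):
    """Keep only opening / early-middlegame positions with 28-32 men."""
    return [line for line in epd_lines if 28 <= men_count(line) <= 32]

# Example: the starting position has all 32 men.
start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -"
print(men_count(start))  # -> 32
```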