Dann Corbit wrote: ↑Tue Oct 13, 2020 9:05 pm
Daniel Shawul wrote: ↑Tue Oct 13, 2020 8:45 pm
Second half, 5 mini-match wins
52 - +1.31 + -
54 - +1.06 + = won
56 - +1.19 + -
58 - +0.80 =
60 - +1.14 + = won
62 - +1.17 =
64 - +1.17 =
66 - +1.12 =
68 - +1.20 + -
70 - +0.80 =
72 - -0.93 =
74 - +0.70 + = won
76 - +0.66 + = won
78 - +1.38 + -
80 - +1.05 + = won
82 - +0.83 =
won = SF wins the mini-match
Something interesting here is that the most polar positions were not wins:
+0.66 + = won
+0.70 + = won
+0.80 =
+0.80 =
+0.83 =
+1.05 + = won
+1.06 + = won
+1.12 =
+1.14 + = won
+1.17 =
+1.17 =
+1.19 + -
+1.20 + -
+1.31 + -
+1.38 + -
-0.93 =
None of the positions with an offset of 1.17 or more was a win for SF.
I think Kai also had an interesting argument. LC0 is trained for quiet positions. If they added training for lopsided positions, perhaps LC0 would fare better.
I guess it is kind of an expected outcome, now that I think of it. If we do a thought experiment where one side falls out of book up a queen, we will see +- in all of those matches. Indeed, that is what we saw with every opening scored above +117 centipawns.
So it looks to me like the danger zone for LC0 is capped right around +115 centipawns.
Not enough games to decide this, of course; just a thought experiment and speculation on my part.
In training, the balance between exploring different types of positions and learning the best lines well is controlled by the temperature parameter.
If an 800-node search on the start position does not visit a move often enough, it won't get trained on much.
Because of this, g4 is hardly trained on compared to, say, e4, and I am pretty sure an AB engine will do well in these circumstances.
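A minimal sketch of how temperature shapes which moves get self-play training data (the visit counts, move set, and function name here are made up for illustration; this is not Lc0's actual code):

```python
import random

def sample_move(visit_counts, temperature):
    """Pick a root move from search visit counts, softened by temperature.

    temperature -> 0: always play the most-visited move (greedy)
    temperature = 1:  sample in proportion to visit counts
    higher values:    flatter distribution, more exploration
    """
    moves = list(visit_counts)
    if temperature < 1e-6:
        return max(moves, key=lambda m: visit_counts[m])
    weights = [visit_counts[m] ** (1.0 / temperature) for m in moves]
    r = random.random() * sum(weights)
    acc = 0.0
    for move, w in zip(moves, weights):
        acc += w
        if r <= acc:
            return move
    return moves[-1]

# Hypothetical visit counts after an 800-node search from the start position:
visits = {"e4": 430, "d4": 290, "Nf3": 60, "c4": 15, "g4": 5}
```

With temperature 1, g4 would be played in only about 5 of every 800 self-play games, so the lines behind it barely accumulate training data, which is the point above.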
There is also the network size to consider: you can only learn so much, so you have to choose what you want to learn well.
I think the best solution is to do training with a book, where instead of letting the engine figure out the openings, a book (e.g. TCEC's book) is forced as the opening. This has been tried already, and it was found that it can accelerate learning initially, but I think it ultimately didn't make a difference to overall strength. If TCEC is the goal, a network can be trained specifically for unbalanced openings.
Note that the high scores >= 1.19 are won by both sides, so as you said there is probably a score below that at which SF wins with white and draws with black. I think if openings were chosen by lc0 to be >1.0, both sides would just win their games equally, since lc0 scores often understate the value of the position.
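To make the "understate" point concrete, here is the general tan-shaped mapping Lc0-style engines use to turn an expected-outcome value q into centipawns (the constants are illustrative; the actual ones vary by Lc0 version):

```python
import math

# Illustrative constants; Lc0's real conversion differs by version.
SCALE_CP = 90.0
SCALE_Q = 1.5637

def q_to_cp(q):
    """Map an expected-outcome value q in (-1, 1) to centipawns."""
    return SCALE_CP * math.tan(SCALE_Q * q)

def cp_to_q(cp):
    """Inverse mapping: centipawns back to q."""
    return math.atan(cp / SCALE_CP) / SCALE_Q

# The mapping is strongly nonlinear: near q = 0 a step in q is a small
# centipawn step, but close to q = 1 the same step is worth hundreds of cp.
for q in (0.3, 0.6, 0.9):
    print(q, round(q_to_cp(q)))
```

Because the curve flattens near the extremes, a position that is effectively winning can still show a modest-looking pawn score, which fits the observation that both sides win the games from the >1.0 openings.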