Michel wrote: ↑Sun Dec 16, 2018 3:59 pm
Laskos wrote: ↑Sun Dec 16, 2018 2:20 pm
Javier Ros wrote: ↑Sun Dec 16, 2018 12:13 pm
Laskos wrote: ↑Sun Dec 16, 2018 11:41 am
Yes, I only now realized that it would be a good check. I will do that today.
Thanks
Did it, interesting
1min + 1s TC (Lc0 on RTX 2070, SF10 on 4 i7 threads):
Lc0 No Book vs SF10 No Book (Initial Board position):
Score of lc0_v19_11261 vs SF10: 12 - 6 - 22 [0.575] 40
Elo difference: 52.51 +/- 73.05
Lc0 No Book vs SF10 BookX.bin:
Score of lc0_v191_11261 vs SF10: 5 - 16 - 19 [0.362] 40
Elo difference: -98.07 +/- 79.65
Lc0 BookX.bin vs SF10 No Book:
Score of lc0_v19_11261 vs SF10: 5 - 11 - 24 [0.425] 40
Elo difference: -52.51 +/- 68.93
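For anyone who wants to check the numbers, here is a minimal sketch of how an Elo difference follows from a W - L - D record. It assumes the usual logistic Elo model, with a draw counted as half a point. The function name `elo_diff` and the dictionary labels are illustrative, not taken from any testing tool.

```python
import math

def elo_diff(wins, losses, draws):
    """Elo difference implied by a W-L-D record (logistic model, draw = 0.5).

    Assumes the score is strictly between 0 and 1; a 100% or 0% score
    has no finite Elo difference.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return -400.0 * math.log10(1.0 / score - 1.0)

# The three 40-game matches quoted above, from Lc0's point of view.
results = {
    "Lc0 no book vs SF10 no book": (12, 6, 22),
    "Lc0 no book vs SF10 BookX.bin": (5, 16, 19),
    "Lc0 BookX.bin vs SF10 no book": (5, 11, 24),
}

for label, wld in results.items():
    print(f"{label}: {elo_diff(*wld):+.2f}")
```

Rounded to two decimals, this should agree with the "Elo difference" lines in the match output.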
So, again, the conclusion would be that variety of openings hurts Lc0, irrespective of whether it is forced into it (by a book) or not (playing all by itself).
yanquis1972 wrote: ↑Sat Dec 15, 2018 8:53 pm
i missed the details about the book you're using, but if it's extremely strong, why isn't the conclusion that SF emerges from the opening with a superior position in the majority of games? again, i haven't seen anyone argue NNs surpassed known opening theory.
Now, how do you deal with that? Lc0 "enabled with the unsurpassed opening theory", and also having a clock advantage because of the book, underperforms by about 100 Elo points compared to the utterly cherry-picked "Initial Board position" heavily used in the paper.
Maybe we can do an average of the two cases, between diversity by SF10 + Book and diversity by Lc0 + Book, to conclude that "Initial Board position" favors A0(Lc0) by 120 Elo points, and the "12 human openings" by 100 Elo points.
The sole remaining argument would be that the bias vanishes at very long time controls, but all my results combined seem to refute that hypothesis. The bias might diminish somewhat at LTC, but that's probably all.
I have not completely followed the discussion but I must confess I am puzzled by it.
There is a reliable, time-proven method for measuring the strength of chess engines (as practiced by the rating lists): a match against a pool of engines, using a sufficiently large opening book from which openings are picked at random. There would have been much less room for discussion if this method had also been followed in A0's case.
Yes, if that had been what they reported in the paper, I would not object. But neither the preprint nor the final paper does that. A0 people or cheerleaders seem to object to using an opening book, say to a fixed depth, as "A0 would have never entered these sorts of openings". With, for example, a diversified 8-mover book, Lc0 performs about 100 Elo points weaker than from their picked openings.
They extensively use "Initial Board position" in their paper, even in 1000-game matches. That is silly, but they seem to argue that chess is the starting position, board and pieces, and A0 learnt that. Aside from the TCEC openings, all the other results are skewed by picked openings, and probably all of them favor A0 compared to a normal tester's book.
I actually have a short TC result:
Lc0 No Book vs SF10 No Book (Initial Board position):
Score of lc0_v19_11261 vs SF10: 12 - 6 - 22 [0.575] 40
Elo difference: 52.51 +/- 73.05
And from 8-mover diversified PGN book:
Lc0 Book vs SF10 Book
Score of lc0_v19_11261 vs SF10: 7 - 14 - 19 [0.412] 40
Elo difference: -61.43 +/- 79.47
About 100 Elo points worse performance of Lc0 from a "tester's book".
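To put the "+/-" figures next to that gap, here is a rough sketch of a 95% error margin under a normal approximation on the per-game score variance. The function names and `z = 1.96` are my assumptions. The testing tool may compute its margin slightly differently, so expect small deviations from the printed values.

```python
import math

def elo_from_score(score):
    """Elo difference for an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, margin) using a normal approximation (assumed z = 1.96).

    Only meaningful when the interval mu +/- z*se stays inside (0, 1),
    which holds for the matches in this thread.
    """
    n = wins + losses + draws
    mu = (wins + 0.5 * draws) / n
    # Per-game variance of the outcome (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - mu) ** 2
           + losses * (0.0 - mu) ** 2
           + draws * (0.5 - mu) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    lo, hi = mu - z * se, mu + z * se
    margin = (elo_from_score(hi) - elo_from_score(lo)) / 2.0
    return elo_from_score(mu), margin

no_book = elo_with_margin(12, 6, 22)    # initial board position
with_book = elo_with_margin(7, 14, 19)  # 8-mover diversified PGN book

print(f"no book:   {no_book[0]:+.2f} +/- {no_book[1]:.1f}")
print(f"with book: {with_book[0]:+.2f} +/- {with_book[1]:.1f}")
print(f"gap:       {no_book[0] - with_book[0]:.1f}")
```

The gap printed at the end is simply the difference between the two point estimates. It does not account for the uncertainty in either match.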
Matthew (one of the authors of the paper) seems to contest the second, "normal tester's" result and the claim that A0's strength in the paper is inflated. He suggested that A0 might steer most of the openings its way, so that forcing it to play the general PGN book openings degrades its strength. He also suggests that my results are an artifact of the short time control used. I tried to show that A0 cannot steer most of the openings the way it likes (say, into closed positions with few tactics), and that it's probably more correct to rate its strength in the pool of regular engines the way the usual engines are rated, that is, with varied openings.