Stockfish 14 release round the corner

dkappe · Fri Jul 02, 2021 8:41 pm

Madeleine Birchfield wrote: ↑Fri Jul 02, 2021 8:35 pm The interesting thing about the current argument between the usual suspects is that if either side is correct, then CCRL and other rating lists would have to test Stockfish 14. If dkappe is right and Stockfish did not use Leela data, then there would be no problem with CCRL testing Stockfish 14. If dkappe is wrong and Stockfish did use Leela data, then by previous precedent, such as testing Allie with a Leela net, testing Nemorino 6.00 which used Stockfish data, testing BBC 1.4 which used a Stockfish net, testing Fat Fritz 2 which used Leela data, and so forth, then there would be no problem with CCRL testing Stockfish 14.

The alternative of course would be to remove Fat Fritz 2, Allie, and Nemorino 6.00 from the ratings list, but they are loath to remove any of them.

I have no opinion on whether CCRL should test SF14 or not. The current best net has used Leela training data. That’s acknowledged by the SF devs themselves. I think you are confusing TCEC’s guidelines with CCRL’s guidelines. That’s an uncomfortable place to be.

Madeleine Birchfield · Fri Jul 02, 2021 8:43 pm

dkappe wrote: ↑Fri Jul 02, 2021 8:41 pm
Madeleine Birchfield wrote: ↑Fri Jul 02, 2021 8:35 pm The interesting thing about the current argument between the usual suspects is that if either side is correct, then CCRL and other rating lists would have to test Stockfish 14. If dkappe is right and Stockfish did not use Leela data, then there would be no problem with CCRL testing Stockfish 14. If dkappe is wrong and Stockfish did use Leela data, then by previous precedent, such as testing Allie with a Leela net, testing Nemorino 6.00 which used Stockfish data, testing BBC 1.4 which used a Stockfish net, testing Fat Fritz 2 which used Leela data, and so forth, then there would be no problem with CCRL testing Stockfish 14.

The alternative of course would be to remove Fat Fritz 2, Allie, and Nemorino 6.00 from the ratings list, but they are loath to remove any of them.
I have no opinion on whether CCRL should test SF14 or not. The current best net has used Leela training data. That’s acknowledged by the SF devs themselves. I think you are confusing TCEC’s guidelines with CCRL’s guidelines. That’s an uncomfortable place to be.

Graham Banks has specifically said that CCRL will not be testing Stockfish 14 due to the use of Leela data:

Graham Banks wrote: ↑Fri Jul 02, 2021 10:15 am
bmp1974 wrote: ↑Fri Jul 02, 2021 8:55 am Stockfish 14 may be released in couple of days. It is likely to have an 30-35 elo gain over SF 13.
SF14 with NNUE net trained from Lc0 training games. One gets to see best of both worlds!!
Won't that pose an issue for testing groups?
At present, we only test NNUE engines that have nets trained on their own games.

dkappe · Fri Jul 02, 2021 8:48 pm

Sopel wrote: ↑Fri Jul 02, 2021 8:40 pm
By 2. I mean that you described an easy way to make every engine's net original, by your standards.

Also I have nothing against what you're claiming regarding to what works with NNUE training or doesn't. However I prefer following results rather than claims in my research, especially if it involves large resource requirements.

I don’t understand your logic on 2. At low nodes (800), most of the mcts engines produce the same data with the same net. They are interchangeable while the nets are not. That’s a good number of nodes for generating NNUE training data, BTW.

As for 1, a small experiment with a few 100m positions would be a good way to test.

connor_mcmonigle · Fri Jul 02, 2021 8:52 pm

dkappe wrote: ↑Fri Jul 02, 2021 8:48 pm
Sopel wrote: ↑Fri Jul 02, 2021 8:40 pm
By 2. I mean that you described an easy way to make every engine's net original, by your standards.

Also I have nothing against what you're claiming regarding to what works with NNUE training or doesn't. However I prefer following results rather than claims in my research, especially if it involves large resource requirements.
As for 1, a small experiment with a few 100m positions would be a good way to test.

This discussion has deviated a great deal from the original discussion, haha. Anyway, 100M positions is far from representative in my opinion. What works best with a 100M position dataset is significantly different from what works best with a 100G position dataset. I believe that with increasing scale, noisier data is actually desirable. The dynamics are very complicated and not well understood in any case.

dkappe · Fri Jul 02, 2021 8:54 pm

connor_mcmonigle wrote: ↑Fri Jul 02, 2021 8:52 pm
dkappe wrote: ↑Fri Jul 02, 2021 8:48 pm
Sopel wrote: ↑Fri Jul 02, 2021 8:40 pm
By 2. I mean that you described an easy way to make every engine's net original, by your standards.

Also I have nothing against what you're claiming regarding to what works with NNUE training or doesn't. However I prefer following results rather than claims in my research, especially if it involves large resource requirements.
As for 1, a small experiment with a few 100m positions would be a good way to test.
This discussion has deviated a great deal from the original discussion, haha. Anyways, 100M positions is far from representative in my opinion. What works best with a 100M position dataset is significantly different from what works best with a 100G position dataset. I believe that with increasing scale, noisier data is actually desirable. The dynamics are very complicated and not well understood in any case

A fair point, but then we could never run any experiments.

Modern Times · Fri Jul 02, 2021 8:56 pm

Madeleine Birchfield wrote: ↑Fri Jul 02, 2021 8:35 pm The interesting thing about the current argument between the usual suspects is that if either side is correct, then CCRL and other rating lists would have to test Stockfish 14. If dkappe is right and Stockfish did not use Leela data, then there would be no problem with CCRL testing Stockfish 14. If dkappe is wrong and Stockfish did use Leela data, then by previous precedent, such as testing Allie with a Leela net, testing Nemorino 6.00 which used Stockfish data, testing BBC 1.4 which used a Stockfish net, testing Fat Fritz 2 which used Leela data, and so forth, then there would be no problem with CCRL testing Stockfish 14.

The alternative of course would be to remove Fat Fritz 2, Allie, and Nemorino 6.00 from the ratings list, but they are loath to remove any of them.

Precisely why Stefan Pohl for example tests everything. You avoid all the issues around making rules, trying to establish the facts for a given situation, and interpreting them. It is a pragmatic and sensible approach.

Madeleine Birchfield · Fri Jul 02, 2021 9:02 pm

Modern Times wrote: ↑Fri Jul 02, 2021 8:56 pm Precisely why Stefan Pohl for example tests everything. You avoid all the issues around making rules, trying to establish the facts for a given situation, and interpreting them. It is a pragmatic and sensible approach.

Stefan Pohl is the only person who still regularly tests Leela nets against other engines as well. Other rating lists cater exclusively to CPU-only engines, or have long since stopped testing Leela.

Damir · Fri Jul 02, 2021 9:02 pm

Madeleine Birchfield wrote: ↑Fri Jul 02, 2021 8:43 pm
dkappe wrote: ↑Fri Jul 02, 2021 8:41 pm
Madeleine Birchfield wrote: ↑Fri Jul 02, 2021 8:35 pm The interesting thing about the current argument between the usual suspects is that if either side is correct, then CCRL and other rating lists would have to test Stockfish 14. If dkappe is right and Stockfish did not use Leela data, then there would be no problem with CCRL testing Stockfish 14. If dkappe is wrong and Stockfish did use Leela data, then by previous precedent, such as testing Allie with a Leela net, testing Nemorino 6.00 which used Stockfish data, testing BBC 1.4 which used a Stockfish net, testing Fat Fritz 2 which used Leela data, and so forth, then there would be no problem with CCRL testing Stockfish 14.

The alternative of course would be to remove Fat Fritz 2, Allie, and Nemorino 6.00 from the ratings list, but they are loath to remove any of them.
I have no opinion on whether CCRL should test SF14 or not. The current best net has used Leela training data. That’s acknowledged by the SF devs themselves. I think you are confusing TCEC’s guidelines with CCRL’s guidelines. That’s an uncomfortable place to be.
Graham Banks has specifically said that CCRL will not be testing Stockfish 14 due to the use of Leela data:

Graham Banks wrote: ↑Fri Jul 02, 2021 10:15 am
bmp1974 wrote: ↑Fri Jul 02, 2021 8:55 am Stockfish 14 may be released in couple of days. It is likely to have an 30-35 elo gain over SF 13.
SF14 with NNUE net trained from Lc0 training games. One gets to see best of both worlds!!
Won't that pose an issue for testing groups?
At present, we only test NNUE engines that have nets trained on their own games.

Graham is known for his hypocrisy... When other engines use Stockfish data he has no problem testing them... When Stockfish wants to try Leela data and a network, then it is unacceptable... Smells like a double standard to me....

Modern Times · Fri Jul 02, 2021 9:09 pm

Madeleine Birchfield wrote: ↑Fri Jul 02, 2021 8:43 pm
Graham Banks has specifically said that CCRL will not be testing Stockfish 14 due to the use of Leela data:

I am not aware of any internal discussions in CCRL about Stockfish 14. It doesn't exist yet.

As I said in a previous thread, it is a lose-lose situation. You're damned if you do, and damned if you don't. Personally I like Stefan Pohl's approach to what he tests.

For example, how can you defend testing Stockfish 14 trained on Lc0 games, and not testing Fire 8.1NN trained using Stockfish games?

Madeleine Birchfield · Fri Jul 02, 2021 9:10 pm

Modern Times wrote: ↑Fri Jul 02, 2021 9:09 pm I am not aware of any internal discussions in CCRL about Stockfish 14. It doesn't exist yet.

Except Stockfish 14 already exists:

https://stockfishchess.org/blog/2021/stockfish-14/
