Stockfish 14 release round the corner

dkappe
Posts: 1632
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Stockfish 14 release round the corner

Post by dkappe »

Madeleine Birchfield wrote: Fri Jul 02, 2021 8:35 pm The interesting thing about the current argument between the usual suspects is that if either side is correct, then CCRL and other rating lists would have to test Stockfish 14. If dkappe is right and Stockfish did not use Leela data, then there would be no problem with CCRL testing Stockfish 14. If dkappe is wrong and Stockfish did use Leela data, then by previous precedent, such as testing Allie with a Leela net, testing Nemorino 6.00 which used Stockfish data, testing BBC 1.4 which used a Stockfish net, testing Fat Fritz 2 which used Leela data, and so forth, then there would be no problem with CCRL testing Stockfish 14.

The alternative of course would be to remove Fat Fritz 2, Allie, and Nemorino 6.00 from the ratings list, but they are loath to remove any of them.
I have no opinion on whether CCRL should test SF14 or not. The current best net has used Leela training data. That’s acknowledged by the SF devs themselves. I think you are confusing TCEC’s guidelines with CCRL’s guidelines. That’s an uncomfortable place to be.
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
Madeleine Birchfield
Posts: 512
Joined: Tue Sep 29, 2020 4:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: Stockfish 14 release round the corner

Post by Madeleine Birchfield »

dkappe wrote: Fri Jul 02, 2021 8:41 pm
Madeleine Birchfield wrote: Fri Jul 02, 2021 8:35 pm The interesting thing about the current argument between the usual suspects is that if either side is correct, then CCRL and other rating lists would have to test Stockfish 14. If dkappe is right and Stockfish did not use Leela data, then there would be no problem with CCRL testing Stockfish 14. If dkappe is wrong and Stockfish did use Leela data, then by previous precedent, such as testing Allie with a Leela net, testing Nemorino 6.00 which used Stockfish data, testing BBC 1.4 which used a Stockfish net, testing Fat Fritz 2 which used Leela data, and so forth, then there would be no problem with CCRL testing Stockfish 14.

The alternative of course would be to remove Fat Fritz 2, Allie, and Nemorino 6.00 from the ratings list, but they are loath to remove any of them.
I have no opinion on whether CCRL should test SF14 or not. The current best net has used Leela training data. That’s acknowledged by the SF devs themselves. I think you are confusing TCEC’s guidelines with CCRL’s guidelines. That’s an uncomfortable place to be.
Graham Banks has specifically said that CCRL will not be testing Stockfish 14 due to the use of Leela data:
Graham Banks wrote: Fri Jul 02, 2021 10:15 am
bmp1974 wrote: Fri Jul 02, 2021 8:55 am Stockfish 14 may be released in a couple of days. It is likely to have a 30-35 Elo gain over SF 13.
SF14 with NNUE net trained from Lc0 training games. One gets to see best of both worlds!!
Won't that pose an issue for testing groups?
At present, we only test NNUE engines that have nets trained on their own games.
dkappe
Posts: 1632
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Stockfish 14 release round the corner

Post by dkappe »

Sopel wrote: Fri Jul 02, 2021 8:40 pm
By 2. I mean that you described an easy way to make every engine's net original, by your standards.

Also, I have nothing against what you're claiming regarding what works with NNUE training and what doesn't. However, I prefer following results rather than claims in my research, especially if it involves large resource requirements.
I don’t understand your logic on 2. At low node counts (800), most of the MCTS engines produce the same data with the same net. The engines are interchangeable while the nets are not. That’s a good number of nodes for generating NNUE training data, BTW.

As for 1, a small experiment with a few 100m positions would be a good way to test.
connor_mcmonigle
Posts: 544
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: Stockfish 14 release round the corner

Post by connor_mcmonigle »

dkappe wrote: Fri Jul 02, 2021 8:48 pm
Sopel wrote: Fri Jul 02, 2021 8:40 pm
By 2. I mean that you described an easy way to make every engine's net original, by your standards.

Also, I have nothing against what you're claiming regarding what works with NNUE training and what doesn't. However, I prefer following results rather than claims in my research, especially if it involves large resource requirements.
As for 1, a small experiment with a few 100m positions would be a good way to test.
This discussion has deviated a great deal from the original topic, haha. Anyway, 100M positions is far from representative, in my opinion. What works best with a 100M position dataset is significantly different from what works best with a 100G position dataset. I believe that with increasing scale, noisier data is actually desirable. In any case, the dynamics are very complicated and not well understood.
dkappe
Posts: 1632
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Stockfish 14 release round the corner

Post by dkappe »

connor_mcmonigle wrote: Fri Jul 02, 2021 8:52 pm
dkappe wrote: Fri Jul 02, 2021 8:48 pm
Sopel wrote: Fri Jul 02, 2021 8:40 pm
By 2. I mean that you described an easy way to make every engine's net original, by your standards.

Also, I have nothing against what you're claiming regarding what works with NNUE training and what doesn't. However, I prefer following results rather than claims in my research, especially if it involves large resource requirements.
As for 1, a small experiment with a few 100m positions would be a good way to test.
This discussion has deviated a great deal from the original topic, haha. Anyway, 100M positions is far from representative, in my opinion. What works best with a 100M position dataset is significantly different from what works best with a 100G position dataset. I believe that with increasing scale, noisier data is actually desirable. In any case, the dynamics are very complicated and not well understood.
A fair point, but then we could never run any experiments.
Modern Times
Posts: 3697
Joined: Thu Jun 07, 2012 11:02 pm

Re: Stockfish 14 release round the corner

Post by Modern Times »

Madeleine Birchfield wrote: Fri Jul 02, 2021 8:35 pm The interesting thing about the current argument between the usual suspects is that if either side is correct, then CCRL and other rating lists would have to test Stockfish 14. If dkappe is right and Stockfish did not use Leela data, then there would be no problem with CCRL testing Stockfish 14. If dkappe is wrong and Stockfish did use Leela data, then by previous precedent, such as testing Allie with a Leela net, testing Nemorino 6.00 which used Stockfish data, testing BBC 1.4 which used a Stockfish net, testing Fat Fritz 2 which used Leela data, and so forth, then there would be no problem with CCRL testing Stockfish 14.

The alternative of course would be to remove Fat Fritz 2, Allie, and Nemorino 6.00 from the ratings list, but they are loath to remove any of them.
Precisely why Stefan Pohl for example tests everything. You avoid all the issues around making rules, trying to establish the facts for a given situation, and interpreting them. It is a pragmatic and sensible approach.
Madeleine Birchfield
Posts: 512
Joined: Tue Sep 29, 2020 4:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: Stockfish 14 release round the corner

Post by Madeleine Birchfield »

Modern Times wrote: Fri Jul 02, 2021 8:56 pm Precisely why Stefan Pohl for example tests everything. You avoid all the issues around making rules, trying to establish the facts for a given situation, and interpreting them. It is a pragmatic and sensible approach.
Stefan Pohl is the only person who still regularly tests Leela nets against other engines as well. Other rating lists cater exclusively to CPU-only engines, or have long since stopped testing Leela.
Damir
Posts: 2864
Joined: Mon Feb 11, 2008 3:53 pm
Location: Denmark
Full name: Damir Desevac

Re: Stockfish 14 release round the corner

Post by Damir »

Madeleine Birchfield wrote: Fri Jul 02, 2021 8:43 pm
dkappe wrote: Fri Jul 02, 2021 8:41 pm
Madeleine Birchfield wrote: Fri Jul 02, 2021 8:35 pm The interesting thing about the current argument between the usual suspects is that if either side is correct, then CCRL and other rating lists would have to test Stockfish 14. If dkappe is right and Stockfish did not use Leela data, then there would be no problem with CCRL testing Stockfish 14. If dkappe is wrong and Stockfish did use Leela data, then by previous precedent, such as testing Allie with a Leela net, testing Nemorino 6.00 which used Stockfish data, testing BBC 1.4 which used a Stockfish net, testing Fat Fritz 2 which used Leela data, and so forth, then there would be no problem with CCRL testing Stockfish 14.

The alternative of course would be to remove Fat Fritz 2, Allie, and Nemorino 6.00 from the ratings list, but they are loath to remove any of them.
I have no opinion on whether CCRL should test SF14 or not. The current best net has used Leela training data. That’s acknowledged by the SF devs themselves. I think you are confusing TCEC’s guidelines with CCRL’s guidelines. That’s an uncomfortable place to be.
Graham Banks has specifically said that CCRL will not be testing Stockfish 14 due to the use of Leela data:
Graham Banks wrote: Fri Jul 02, 2021 10:15 am
bmp1974 wrote: Fri Jul 02, 2021 8:55 am Stockfish 14 may be released in a couple of days. It is likely to have a 30-35 Elo gain over SF 13.
SF14 with NNUE net trained from Lc0 training games. One gets to see best of both worlds!!
Won't that pose an issue for testing groups?
At present, we only test NNUE engines that have nets trained on their own games.
Graham is known for his hypocrisy... When other engines use Stockfish data, he has no problem testing them... When Stockfish wants to try Leela data and networks, then it is unacceptable... Smells like a double standard to me...
Modern Times
Posts: 3697
Joined: Thu Jun 07, 2012 11:02 pm

Re: Stockfish 14 release round the corner

Post by Modern Times »

Madeleine Birchfield wrote: Fri Jul 02, 2021 8:43 pm
Graham Banks has specifically said that CCRL will not be testing Stockfish 14 due to the use of Leela data:
I am not aware of any internal discussions in CCRL about Stockfish 14. It doesn't exist yet.

As I said in a previous thread, it is a lose-lose situation. You're damned if you do, and damned if you don't. Personally I like Stefan Pohl's approach to what he tests.

For example, how can you defend testing Stockfish 14 trained on Lc0 games, and not testing Fire 8.1NN trained using Stockfish games?
Madeleine Birchfield
Posts: 512
Joined: Tue Sep 29, 2020 4:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: Stockfish 14 release round the corner

Post by Madeleine Birchfield »

Modern Times wrote: Fri Jul 02, 2021 9:09 pm I am not aware of any internal discussions in CCRL about Stockfish 14. It doesn't exist yet.
Except Stockfish 14 already exists:

https://stockfishchess.org/blog/2021/stockfish-14/