SF+NNUE reach the ceiling?

corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

SF+NNUE reach the ceiling?

Post by corres »

As we might suppose, the possibilities for raising the Elo of SF+NNUE are limited, because the chess knowledge of every AB engine is also restricted. Reinforcement learning cannot give additional information to the NNUE net; it can only sharpen the "picture", which can add only a limited Elo gain.
peter
Posts: 3185
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: SF+NNUE reach the ceiling?

Post by peter »

corres wrote: Wed Aug 26, 2020 8:25 pm As we might suppose, the possibilities for raising the Elo of SF+NNUE are limited, because the chess knowledge of every AB engine is also restricted. Reinforcement learning cannot give additional information to the NNUE net; it can only sharpen the "picture", which can add only a limited Elo gain.
My oh my :-)

No offence meant, but how old is this technology now?
How many engines do we have using it?
How many nets trained in different ways?
How many of them with .pgn training?
How many specialized nets for certain openings?
Midgame nets mainly for tactical search?
Endgame nets?

I see only one problem in all of these possibilities: even if the hardware time for training is much less than for LC0 nets, and even if training can be guided much better, the draw death of computer chess will make it more and more resource-consuming to even distinguish the "overall playing strength" of engines, forks and nets Elo-wise, especially in combination with "database knowledge", which of course explodes together with engine strength too.

And to which purpose?

Who will pay for the hardware, time and manpower of development for chess players only, if engine-engine matches produce fewer and fewer results other than draws? I guess it will become a matter of users developing for their own use and interest.
It's not so difficult to train nets for your own interests, so let's wait and see what comes along in the next month or so, given that the first steps have taken only about that long so far.
:)
Peter.
Post by corres »

The real question is "and to which purpose?"
The others are no more than optimism.
Post by peter »

corres wrote: Wed Aug 26, 2020 10:07 pm The real question is "and to which purpose?"
The others are no more than optimism.
You got me fully wrong: my posting wasn't optimistic at all about the probability of further Elo gains as high as those reported in the last 2 weeks from selfplay in the framework and enhanced selfplay against SF11 only. No further Elo explosions of that size are to be expected in the near future at all, not against SF11, not against dev versions to come and not against LC0. And for sure not if not even you, as a chess player, are willing to see the chances in further development for any sense, purpose or matter other than thinking about further Elo.

Against which opponents? With which hardware and TC, with which openings? For quite some time, Elo in computer chess has been more and more an illusion ("Elosion") as far as transferability between different matches is concerned, when they differ in any of these terms and conditions.

What are you going to buy for your Elo, if you're not even willing to work for it on your own?
By training nets on your own for your own corr. Elo, e.g.?

If you are already convinced that these 4 weeks were the utmost to be expected, you'll probably make a self-fulfilling prophecy for your own use of NNUE, but also for LC0-like nets and engines, for PUCT and MCTS and A-B search and all of these things that have reached such a high Elo performance nowadays.
Let it be developed for game playing, let the engines play, watch them draw and forget about the rest. As Chrilly Donninger used to say: "Like watching the washing machine doing the laundry".

Of course I'm still just kidding, never mind me being too optimistic.
:)
Peter.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: SF+NNUE reach the ceiling?

Post by mwyoung »

Hardware 2950x, RTX 2080 Ti

Stockfish 220820 (NN-2257)
Lc0 26.1 (NN-J92-70)
Stockfish+NNUE PO (NN-2257)
Ethereal 12.25
Xiphos 0.6
Komodo 14

Ponder off.
TC=15m+15s
32 threads.
4 GB hash.
6 man TB, and 10, 7 man TB.
Opening book 6 moves.
Default settings.


Result:
----------------------------------------------------------------------------------------
  #  name                        games    wins   draws  losses   score    los%  elo+/-
  1. Lc0 v0.26.1                    18       3      15       0    10.5    95.8    58.5
  2. SF+NNUE PO 290720 x64 avx2     18       2      16       0    10.0    92.1    38.8
  3. Stockfish 240820               14       2      12       0     8.0    92.1    50.0
  4. Ethereal 12.25 (POPCNT)        15       2      10       3     7.0    32.7   -23.2
  5. Komodo 14 64-bit               15       0      14       1     7.0    15.9   -23.2
  6. Xiphos 0.6 NO-POPCNT           16       0      11       5     5.5     1.3  -112.3

Cross table:
----------------------------------------------------------------------------------------
  #  name                           score   games         1         2         3         4         5         6
  1. Lc0 v0.26.1                     10.5      18         x      ====       ===       1==      ===1      ===1
  2. SF+NNUE PO 290720 x64 avx2      10.0      18      ====         x      ====      1===       ===       1==
  3. Stockfish 240820                 8.0      14       ===      ====         x       =1=         =       1==
  4. Ethereal 12.25 (POPCNT)          7.0      15       0==      0===       =0=         x       ===        11
  5. Komodo 14 64-bit                 7.0      15      ===0       ===         =       ===         x      ====
  6. Xiphos 0.6 NO-POPCNT             5.5      16      ===0       0==       0==        00      ====         x

Tech:
----------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                          nodes/m         NPS  depth/m   time/m    moves     time
  1. Lc0 v0.26.1                      975K       37483     11.9     26.0     69.5   1807.6
  2. SF+NNUE PO 290720 x64 avx2    912613K    32447710     55.5     28.1     54.7   1539.1
  3. Stockfish 240820             1053554K    37474414     51.0     28.1     57.5   1616.6
  4. Ethereal 12.25 (POPCNT)      1190601K    43104669     41.5     27.6     64.8   1789.9
  5. Komodo 14 64-bit             1036218K    37430907     45.7     27.7     66.5   1840.0
  6. Xiphos 0.6 NO-POPCNT          853822K    34469857     44.9     24.8     72.3   1789.6
     all ---                       780117K    29680730     40.3     26.9     64.2   1728.7
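
For readers who want to check the table: the elo+/- and los% columns can be reproduced from the win/draw/loss counts alone. The sketch below assumes the usual logistic Elo formula and the normal-approximation formula for likelihood of superiority (LOS). The thread does not name the tool that produced the table, so both formulas and the function names are assumptions that happen to match the posted numbers.

```python
import math


def elo_diff(wins, draws, losses):
    """Elo difference implied by the score fraction (logistic model, assumed)."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Undefined for a 0% or 100% score; those cases would need special handling.
    return -400.0 * math.log10(1.0 / score - 1.0)


def los(wins, losses):
    """Likelihood of superiority (normal approximation); draws are ignored.

    Needs at least one decisive game, otherwise the denominator is zero.
    """
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


# (name, wins, draws, losses) copied from the Result table above.
rows = [
    ("Lc0 v0.26.1", 3, 15, 0),
    ("SF+NNUE PO 290720 x64 avx2", 2, 16, 0),
    ("Stockfish 240820", 2, 12, 0),
    ("Ethereal 12.25 (POPCNT)", 2, 10, 3),
    ("Komodo 14 64-bit", 0, 14, 1),
    ("Xiphos 0.6 NO-POPCNT", 0, 11, 5),
]

for name, w, d, l in rows:
    print(f"{name:28s} los%={100 * los(w, l):5.1f}  elo={elo_diff(w, d, l):7.1f}")
```

Note that LOS depends only on wins and losses, which is why two engines with 2 wins and 0 losses show the same los% despite different numbers of games.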
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
Post by peter »

Thanks for the match.

BTW here you have another one with a somewhat smaller error bar:

https://forum.computerschach.de/cgi-bin ... #pid133952
Peter.
Post by mwyoung »

This is the current match, still being played. I have many, many matches and games, but this match is typical for NNUE at longer time controls.
Post by peter »

So that goes for Andreas Strangmüller's match too, and it's only one of many running ones with different TCs. Typical for really long time controls is that you don't get significant numbers of games before the versions of engines and nets have changed several times in the meantime.
:)
Maybe you didn't see the link at the end of the posting I linked to.

http://www.fastgm.de/

But what we are talking about now is exactly what I meant by "Elosion".
Of course results depend more on luck and discriminate less as the engines get nearer to each other in playing strength. And you get fewer games in the same time, so additional error and bias. Ponder off is necessary if you want to let LC0 and SF play on the same machine and get any amount of games at long hardware time at all, yet it's a kind of bias too, isn't it? I can't prove that SF is better with ponder on than LC0 is, but you can't prove the opposite either, can you?

Which 6-move book did you use?
I ask because that's another big source of bias, trading draw rate against performance and vice versa nowadays.

You can take openings like the ones Jeroen Noomen provides in great collections for TCEC: thrilling games, a really low draw rate, especially for the hardware and TC there. Yet what you get with such draw killers is just this: 1:1 pairings, resulting from openings won twice by the same side. That lowers the draw rate, but also the performance of each single engine, so what? The error bar doesn't get smaller; on the contrary, fewer draws with the same performance means a larger error bar than more draws with the same performance.
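
The point about draws and error bars can be illustrated with a short calculation. This is only a sketch under stated assumptions: games are treated as independent, the interval is a plain normal approximation on the score fraction, and the two 100-game results are invented for illustration, not taken from any match in this thread.

```python
import math


def elo_from_score(score):
    """Convert a score fraction to an Elo difference (logistic model, assumed)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_interval(wins, draws, losses, z=1.96):
    """Score fraction with an approximate 95% interval (normal approximation)."""
    n = wins + draws + losses
    w, d = wins / n, draws / n
    s = w + 0.5 * d
    # Variance of a single game result that takes the values 1, 0.5 or 0.
    var = w * 1.0 + d * 0.25 - s * s
    half = z * math.sqrt(var / n)
    return s, s - half, s + half


# Two hypothetical 100-game matches with the same 55% score.
examples = [("many draws", (10, 90, 0)), ("few draws", (40, 30, 30))]

for label, wdl in examples:
    s, lo, hi = score_interval(*wdl)
    print(
        f"{label:10s} score={s:.3f}  "
        f"Elo={elo_from_score(s):6.1f}  "
        f"interval=[{elo_from_score(lo):6.1f}, {elo_from_score(hi):6.1f}]"
    )
```

Both matches have the same performance, but the per-game variance is much larger when the points come from wins and losses instead of draws, so the interval for the low-draw match is wider.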

That doesn't matter for TCEC, which is a match for the fun of it; it's not to be compared with a rating list with statistical significance.
Last edited by peter on Thu Aug 27, 2020 1:31 am, edited 1 time in total.
Peter.
Post by mwyoung »

peter wrote: Thu Aug 27, 2020 1:09 am Which 6-move book did you use?
It is a book of elite GM games, played to 6 moves. Standard stuff for testing. If the GMs played it, it is in the book.
Post by peter »

mwyoung wrote: Thu Aug 27, 2020 1:28 am It is a book of elite GM games, played to 6 moves. Standard stuff for testing. If the GMs played it, it is in the book.
There isn't any standard stuff for testing at all nowadays when it comes to openings.
If you take the first 6 moves of everything GMs have played, then with such a small number of games the chance is big that some of the positions (probably repeated with alternate colours) are better for LC0, and before the ones better for SF come along, the match is already over. (Of course that could work to SF's advantage just as well.)
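
How much room chance has in a short match can be estimated with a small exact calculation. The sketch below assumes two engines of exactly equal strength, independent games and a fixed draw rate; the function name, the 80% draw rate and the 18-game length are illustrative assumptions, not measurements from this thread.

```python
def prob_lead_at_least(games, draw_rate, lead):
    """Probability that one fixed side finishes at least `lead` wins ahead.

    Assumes equal strength: win and loss each have probability (1 - draw_rate) / 2.
    """
    p_win = p_loss = (1.0 - draw_rate) / 2.0
    dist = {0: 1.0}  # maps (wins minus losses) to its probability
    for _ in range(games):
        nxt = {}
        for diff, p in dist.items():
            for step, q in ((1, p_win), (0, draw_rate), (-1, p_loss)):
                nxt[diff + step] = nxt.get(diff + step, 0.0) + p * q
        dist = nxt
    return sum(p for diff, p in dist.items() if diff >= lead)


# Chance that a given engine ends an 18-game match at +3 or better by luck alone.
print(prob_lead_at_least(games=18, draw_rate=0.8, lead=3))
```

Running it with a larger number of games shows how slowly the influence of luck shrinks when most games are drawn.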

Who tells you LC0 doesn't like GM moves more (or less) than SF NNUE does?
Why exactly 6 moves?
Why not 7, 8, 9, 10, 5, 4, 3, 2?
Bookless games would be even more interesting, to see how the engines really fare with openings of their own, wouldn't they?
Who cares about doublets, if it's only for counting the points anyhow?
:)
Still, this is just to offer some more or less provocative thoughts about Elo in computer chess in modern times.
Peter.