Progress of Stockfish in 6 days

lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Progress of Stockfish in 6 days

Post by lkaufman »

Laskos wrote: Thu Aug 13, 2020 9:53 pm
lkaufman wrote: Thu Aug 13, 2020 8:33 pm
JJJ wrote: Thu Aug 13, 2020 4:13 pm
lkaufman wrote: Thu Aug 13, 2020 7:22 am
Jouni wrote: Wed Aug 12, 2020 9:36 pm Yes SF NNUE is equal to quadruple your CPU cores for free. Incredible :!: :!: .
I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!
Maybe it's time to test Stockfish NNUE again at knight handicap and see if it is stronger than any other program!
I've already done some tests with SFNNUE giving handicaps to other engines (and to myself!). Unfortunately it can't be tested at knight odds because it switches to normal SF mode when down (or up) about 3 pawns or more in material+piece location, with a knight counting as about 4 pawns. I ran one test with it giving pawn odds (c2,d2,e2 rotating) to SF 11 and it drew four, lost six. Against weaker engines under CCRL blitz conditions I found that around CCRL 3050 blitz is a fair opponent for it at odds of two pawns and move (remove b7 or c7 and f7 or g7, so four options). I've also tried playing it this way and also at "pawn and 3 moves" (remove f7, play e4 and d4, WTM) and so far I have only losses. In general it seems to be able to give handicaps to engines roughly 100 elo or so higher than those that SF11 can give the same handicaps to.
I had results showing that NNUE plays unfamiliar chess, such as chess variants, badly, and I assumed that it plays handicap games badly too. Not terribly badly, but underperforming compared to its strength in regular chess.

It makes sense that an NN won't be as strong starting from unusual positions as from those it is trained on. But as far as I can tell, it is still vastly stronger than normal SF at giving handicaps of a pawn or two, perhaps just a bit weaker than what you would get by adding the elo gain from SF11 to the opposing engine. I did some tests at knight odds on the version just before NN was disabled when down 3 pawns, and it scored better than SF11, but not a lot better. There just aren't good moves to find down a piece.
Komodo rules!
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Progress of Stockfish in 6 days

Post by lkaufman »

mwyoung wrote: Thu Aug 13, 2020 9:54 pm
lkaufman wrote: Thu Aug 13, 2020 7:22 am
Jouni wrote: Wed Aug 12, 2020 9:36 pm Yes SF NNUE is equal to quadruple your CPU cores for free. Incredible :!: :!: .
I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!
I also achieved this kind of result. But I hope to God you don't think it can achieve these results with more threads and time. Because it does not; it is only marginally better than the best standard Stockfish and Lc0. The strongest program, yes; crushing the other engines, no!
Do you have results that show that normal Stockfish with say 64 threads can beat current SFNNUE on 16 threads at some time control (or any four to one ratio of threads at any time limit), or do you just mean that there are a lot more draws with longer time controls and more threads when playing on equal threads?
Komodo rules!
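
For readers who want to put a number on the "90 to 80" result quoted above, here is a rough sketch of the usual logistic Elo conversion. The draw count is not given in the thread, so the error margin below assumes no draws at all, which is the worst case (draws only shrink the per-game variance). The function names are made up for illustration.

```python
import math

def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(points_for: float, points_against: float, z: float = 1.96):
    """Point estimate plus a rough interval for the Elo difference.

    Assumption: the standard error uses s*(1-s)/n, i.e. no draws.
    With draws the true interval would be narrower.
    """
    n = points_for + points_against      # each game hands out one point in total
    s = points_for / n                   # score fraction of the first engine
    se = math.sqrt(s * (1.0 - s) / n)    # worst-case standard error of s
    low, high = s - z * se, s + z * se   # must stay strictly inside (0, 1)
    return elo_diff(s), elo_diff(low), elo_diff(high)

est, low, high = elo_interval(90, 80)
print(f"{est:+.1f} Elo, rough 95% range {low:+.1f} to {high:+.1f}")
```

The point estimate comes out at about +20 Elo, and under the no-draw assumption the interval still includes zero, so a 170-game match on its own is suggestive rather than conclusive.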
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: Progress of Stockfish in 6 days

Post by MikeB »

Final - much closer than I expected, 5.7 Elo spread from #1 to #4

Code:

Games: 30000
Threads: 1
Hash: 256

Current date : time (EDST)
Date: 08/13/20 : 16:54:46

Projected-> Time: 16h:23m:17s
     Run -> Time: 15h:22m:39s

30000 game(s) loaded
Rank Name                 Rating   Δ     +    -     #     Σ    Σ%     W    L    D   W%    =%   OppR
---------------------------------------------------------------------------------------------------------

   1 Stockfish-XIr4-2257   3515   0.0    4    4 12000 6319.5  52.7 2625 1986 7389  21.9  61.6  3496
   2 cur-dev-stockfish     3513   1.9    4    4 12000 6279.5  52.3 2574 2015 7411  21.4  61.8  3497
   3 Bluefish-2257         3513   0.0    4    4 12000 6289.0  52.4 2507 1929 7564  20.9  63.0  3497
   4 Honey-2257            3509   3.8    4    4 12000 6203.5  51.7 2416 2009 7575  20.1  63.1  3498
   5 Black-Diamond-2257    3449  59.9    4    4 12000 4908.5  40.9 1405 3588 7007  11.7  58.4  3513
---------------------------------------------------------------------------------------------------------

  Δ = delta from the next higher rated opponent
  # = number of games played
  Σ = total score, 1 point for win, 1/2 point for draw

LOS:
                      St  cu  Bl  Ho  Bl
Stockfish-XIr4-2257       78  79  99 100
cur-dev-stockfish     21      50  94 100
Bluefish-2257         20  49      94 100
Honey-2257             0   5   5     100
Black-Diamond-2257     0   0   0   0

#########################################################################################################
###                                                End                                                ###
#########################################################################################################
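
As a side note on the LOS (likelihood of superiority) matrix above: the rating tool that produced it may well use a different method, but a common back-of-the-envelope version uses a normal approximation on the decisive games of a head-to-head match. The sketch below assumes that approximation; the function name and the example counts are hypothetical and are not taken from the table.

```python
import math

def los(wins: int, losses: int) -> float:
    """Approximate likelihood that the first engine is the stronger one.

    Normal approximation on decisive games only; draws carry no
    information about which side is stronger and are ignored.
    """
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # no decisive games, no evidence either way
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))

# Hypothetical head-to-head result: 60 wins, 40 losses, any number of draws.
print(f"LOS = {100.0 * los(60, 40):.0f}%")
```

Values near 50 mean the match says almost nothing about which engine is stronger, which is roughly how the 49 and 50 entries in the matrix should be read.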
Cornfed
Posts: 511
Joined: Sun Apr 26, 2020 11:40 pm
Full name: Brian D. Smith

Re: Progress of Stockfish in 6 days

Post by Cornfed »

lkaufman wrote: Thu Aug 13, 2020 10:21 pm I did some tests at knight odds on the version just before NN was disabled when down 3 pawns, and it scored better than SF11, but not a lot better. There just aren't good moves to find down a piece.
The usefulness of an engine when down 'a piece' is to find the moves that give the opponent the most chances to go wrong...a finer line to walk toward realizing that advantage.

This is why Komodo MCTS intrigues me...if it could be refined even more, it would be great for preparing lines to play OTB which contain risk. Just my two cents and I know off topic. :-)
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Progress of Stockfish in 6 days

Post by mwyoung »

lkaufman wrote: Thu Aug 13, 2020 10:51 pm
mwyoung wrote: Thu Aug 13, 2020 9:54 pm
lkaufman wrote: Thu Aug 13, 2020 7:22 am
Jouni wrote: Wed Aug 12, 2020 9:36 pm Yes SF NNUE is equal to quadruple your CPU cores for free. Incredible :!: :!: .
I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!
I also achieved this kind of result. But I hope to God you don't think it can achieve these results with more threads and time. Because it does not; it is only marginally better than the best standard Stockfish and Lc0. The strongest program, yes; crushing the other engines, no!
Do you have results that show that normal Stockfish with say 64 threads can beat current SFNNUE on 16 threads at some time control (or any four to one ratio of threads at any time limit), or do you just mean that there are a lot more draws with longer time controls and more threads when playing on equal threads?
We have a match request for Stockfish NNUE vs Lc0 on the Match and Tournament page.

"How does perform SF-NNUE vs Lc0 ?
Post by Vinvin » Tue Aug 11, 2020 10:09 pm

I saw very few matches between this 2 engines.

Conditions like :
3min+2sec
200 games
book : short lines (where Lc0 was the best over A/B engines)

RTX 2080 Ti for Lc0
16 cores for SF-NNUE (latest exe + latest NN file)

Could someone run a match like this ?"

Why speculate? This is an easy test to run! Let's see how the best 32-thread Stockfish NNUE performs at 3m+2s vs the best Lc0 26.1 on a 2080 Ti.

"I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!"

Let's see how close we get to your Elo rating....

Live Stream:
Last edited by mwyoung on Fri Aug 14, 2020 12:15 am, edited 2 times in total.
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: Progress of Stockfish in 6 days

Post by carldaman »

@cornfed
I agree - being down material is also a (big) part of chess, and playing well under such circumstances is a goal worth pursuing, although highly neglected, except maybe for Komodo's authors' efforts.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Progress of Stockfish in 6 days

Post by lkaufman »

mwyoung wrote: Fri Aug 14, 2020 12:03 am
lkaufman wrote: Thu Aug 13, 2020 10:51 pm
mwyoung wrote: Thu Aug 13, 2020 9:54 pm
lkaufman wrote: Thu Aug 13, 2020 7:22 am
Jouni wrote: Wed Aug 12, 2020 9:36 pm Yes SF NNUE is equal to quadruple your CPU cores for free. Incredible :!: :!: .
I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!
I also achieved this kind of result. But I hope to God you don't think it can achieve these results with more threads and time. Because it does not; it is only marginally better than the best standard Stockfish and Lc0. The strongest program, yes; crushing the other engines, no!
Do you have results that show that normal Stockfish with say 64 threads can beat current SFNNUE on 16 threads at some time control (or any four to one ratio of threads at any time limit), or do you just mean that there are a lot more draws with longer time controls and more threads when playing on equal threads?
We have a match request for Stockfish NNUE vs Lc0 on the Match and Tournament page.

"How does perform SF-NNUE vs Lc0 ?
Post by Vinvin » Tue Aug 11, 2020 10:09 pm

I saw very few matches between this 2 engines.

Conditions like :
3min+2sec
200 games
book : short lines (where Lc0 was the best over A/B engines)

RTX 2080 Ti for Lc0
16 cores for SF-NNUE (latest exe + latest NN file)

Could someone run a match like this ?"

Why speculate? This is an easy test to run! Let's see how the best 32-thread Stockfish NNUE performs at 3m+2s vs the best Lc0 26.1 on a 2080 Ti.

"I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!"

Let's see how close we get to your Elo rating....

Live Stream:
So far all draws. While I do appreciate this match, it is not very relevant to the question of whether SFNNUE can give (for example) a 64-to-16 thread handicap to SF11. If it can give four threads to one successfully (as appears to be the case) but can't give 64 to 16, that may indicate poor scaling. The elo gap with equal threads will of course decline with more threads and time; that has nothing to do with the specific engine.
Komodo rules!
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Progress of Stockfish in 6 days

Post by mwyoung »

lkaufman wrote: Fri Aug 14, 2020 1:20 am
mwyoung wrote: Fri Aug 14, 2020 12:03 am
lkaufman wrote: Thu Aug 13, 2020 10:51 pm
mwyoung wrote: Thu Aug 13, 2020 9:54 pm
lkaufman wrote: Thu Aug 13, 2020 7:22 am
Jouni wrote: Wed Aug 12, 2020 9:36 pm Yes SF NNUE is equal to quadruple your CPU cores for free. Incredible :!: :!: .
I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!
I also achieved this kind of result. But I hope to God you don't think it can achieve these results with more threads and time. Because it does not; it is only marginally better than the best standard Stockfish and Lc0. The strongest program, yes; crushing the other engines, no!
Do you have results that show that normal Stockfish with say 64 threads can beat current SFNNUE on 16 threads at some time control (or any four to one ratio of threads at any time limit), or do you just mean that there are a lot more draws with longer time controls and more threads when playing on equal threads?
We have a match request for Stockfish NNUE vs Lc0 on the Match and Tournament page.

"How does perform SF-NNUE vs Lc0 ?
Post by Vinvin » Tue Aug 11, 2020 10:09 pm

I saw very few matches between this 2 engines.

Conditions like :
3min+2sec
200 games
book : short lines (where Lc0 was the best over A/B engines)

RTX 2080 Ti for Lc0
16 cores for SF-NNUE (latest exe + latest NN file)

Could someone run a match like this ?"

Why speculate? This is an easy test to run! Let's see how the best 32-thread Stockfish NNUE performs at 3m+2s vs the best Lc0 26.1 on a 2080 Ti.

"I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!"

Let's see how close we get to your Elo rating....

Live Stream:
So far all draws. While I do appreciate this match, it is not very relevant to the question of whether SFNNUE can give (for example) a 64-to-16 thread handicap to SF11. If it can give four threads to one successfully (as appears to be the case) but can't give 64 to 16, that may indicate poor scaling. The elo gap with equal threads will of course decline with more threads and time; that has nothing to do with the specific engine.
Let's see what it can do here.

And stop the hype, you should know better without testing.

"Yes SF NNUE is equal to quadruple your CPU cores for free. Incredible :!: :!: ." :lol:

"I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!"

As testers we should give the whole truth, not cherry picked results!
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Progress of Stockfish in 6 days

Post by lkaufman »

mwyoung wrote: Fri Aug 14, 2020 1:31 am
lkaufman wrote: Fri Aug 14, 2020 1:20 am
mwyoung wrote: Fri Aug 14, 2020 12:03 am
lkaufman wrote: Thu Aug 13, 2020 10:51 pm
mwyoung wrote: Thu Aug 13, 2020 9:54 pm
lkaufman wrote: Thu Aug 13, 2020 7:22 am
Jouni wrote: Wed Aug 12, 2020 9:36 pm Yes SF NNUE is equal to quadruple your CPU cores for free. Incredible :!: :!: .
I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!
I also achieved this kind of result. But I hope to God you don't think it can achieve these results with more threads and time. Because it does not; it is only marginally better than the best standard Stockfish and Lc0. The strongest program, yes; crushing the other engines, no!
Do you have results that show that normal Stockfish with say 64 threads can beat current SFNNUE on 16 threads at some time control (or any four to one ratio of threads at any time limit), or do you just mean that there are a lot more draws with longer time controls and more threads when playing on equal threads?
We have a match request for Stockfish NNUE vs Lc0 on the Match and Tournament page.

"How does perform SF-NNUE vs Lc0 ?
Post by Vinvin » Tue Aug 11, 2020 10:09 pm

I saw very few matches between this 2 engines.

Conditions like :
3min+2sec
200 games
book : short lines (where Lc0 was the best over A/B engines)

RTX 2080 Ti for Lc0
16 cores for SF-NNUE (latest exe + latest NN file)

Could someone run a match like this ?"

Why speculate? This is an easy test to run! Let's see how the best 32-thread Stockfish NNUE performs at 3m+2s vs the best Lc0 26.1 on a 2080 Ti.

"I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!"

Let's see how close we get to your Elo rating....

Live Stream:
So far all draws. While I do appreciate this match, it is not very relevant to the question of whether SFNNUE can give (for example) a 64-to-16 thread handicap to SF11. If it can give four threads to one successfully (as appears to be the case) but can't give 64 to 16, that may indicate poor scaling. The elo gap with equal threads will of course decline with more threads and time; that has nothing to do with the specific engine.
Let's see what it can do here.

And stop the hype, you should know better without testing.

"Yes SF NNUE is equal to quadruple your CPU cores for free. Incredible :!: :!: ." :lol:

"I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!"

As testers we should give the whole truth, not cherry picked results!
I agree, but the only way to prove that "NNUE is equal to quadruple your CPU cores for free" is false or misleading is to run a test with many threads that shows a different result. My hunch is that the statement will turn out to be pretty much correct, no matter how many threads or how long the time limit. But of course I could be wrong.
Komodo rules!
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Progress of Stockfish in 6 days

Post by mwyoung »

lkaufman wrote: Fri Aug 14, 2020 2:13 am
mwyoung wrote: Fri Aug 14, 2020 1:31 am
lkaufman wrote: Fri Aug 14, 2020 1:20 am
mwyoung wrote: Fri Aug 14, 2020 12:03 am
lkaufman wrote: Thu Aug 13, 2020 10:51 pm
mwyoung wrote: Thu Aug 13, 2020 9:54 pm
lkaufman wrote: Thu Aug 13, 2020 7:22 am
Jouni wrote: Wed Aug 12, 2020 9:36 pm Yes SF NNUE is equal to quadruple your CPU cores for free. Incredible :!: :!: .
I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!
I also achieved this kind of result. But I hope to God you don't think it can achieve these results with more threads and time. Because it does not; it is only marginally better than the best standard Stockfish and Lc0. The strongest program, yes; crushing the other engines, no!
Do you have results that show that normal Stockfish with say 64 threads can beat current SFNNUE on 16 threads at some time control (or any four to one ratio of threads at any time limit), or do you just mean that there are a lot more draws with longer time controls and more threads when playing on equal threads?
We have a match request for Stockfish NNUE vs Lc0 on the Match and Tournament page.

"How does perform SF-NNUE vs Lc0 ?
Post by Vinvin » Tue Aug 11, 2020 10:09 pm

I saw very few matches between this 2 engines.

Conditions like :
3min+2sec
200 games
book : short lines (where Lc0 was the best over A/B engines)

RTX 2080 Ti for Lc0
16 cores for SF-NNUE (latest exe + latest NN file)

Could someone run a match like this ?"

Why speculate? This is an easy test to run! Let's see how the best 32-thread Stockfish NNUE performs at 3m+2s vs the best Lc0 26.1 on a 2080 Ti.

"I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!"

Let's see how close we get to your Elo rating....

Live Stream:
So far all draws. While I do appreciate this match, it is not very relevant to the question of whether SFNNUE can give (for example) a 64-to-16 thread handicap to SF11. If it can give four threads to one successfully (as appears to be the case) but can't give 64 to 16, that may indicate poor scaling. The elo gap with equal threads will of course decline with more threads and time; that has nothing to do with the specific engine.
Let's see what it can do here.

And stop the hype, you should know better without testing.

"Yes SF NNUE is equal to quadruple your CPU cores for free. Incredible :!: :!: ." :lol:

"I actually got a result that SFNNUE (a couple days ago) on one thread beat Stockfish 11 on seven threads, at 2' + 1", by 90 to 80! So you may be understating it!"

As testers we should give the whole truth, not cherry picked results!
I agree, but the only way to prove that "NNUE is equal to quadruple your CPU cores for free" is false or misleading is to run a test with many threads that shows a different result. My hunch is that the statement will turn out to be pretty much correct, no matter how many threads or how long the time limit. But of course I could be wrong.
Did you read what you just posted? It is called hype. You made the claim; you should prove it!

"Stockfish NNUE = 4x the CPU power for free. So I may be understating the hype!" :lol:
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.