Stockfish 11 at 120k nodes per move

Laskos · Post by **Laskos** » Fri Jan 17, 2020 11:19 pm

Alayan wrote: ↑Fri Jan 17, 2020 10:35 pm Thank you Larry for your interesting input.

To complete my initial post, I ran SF11 @ 120knpm vs SF11 @ 360knpm. It appears that by eyeballing the doubling at 200 elo in that npm range I overestimated it somewhat. My data for SF from May is 180 elo from 130k to 260k and 130 elo from 260k to 520k.
Score of Stockfish 11 at 120knpm vs Stockfish 11 at 360knpm: 37 - 643 - 320  [0.197] 1000
Elo difference: -244.10 +/- 19.05
My main point stands, though.

Raphexon wrote: ↑Fri Jan 17, 2020 5:44 pm So this is basically some empirical evidence that my claim of SF at 100 n/s = GM level was right.
With +-2400 being at the low end of GM level.
No, that's wrong.

The strength loss is dramatic when reducing node count that much. Just see my graph at the beginning, at such low node counts a doubling makes a massive difference. SF at 15K nodes per move is in the ballpark of 800 elo weaker than SF at 300K nodes per move.

Stockfish's development is focused on improving strength in the 200K+ nodes per move range, and advanced search tricks will never be able to help when you have only a few thousand nodes per move in total. So I can predict that SF will never ever reach the GM level at 100n/s.

Don't translate 1:1 computer self-Elos to computer-Human Elos. My claim is that SF at 5 kn/s at tournament time control is at least 2700 FIDE Elo human level. That is if they didn't screw up something in the range of 200k - 1,000k nodes per move, but I doubt they did it, as 10 + 0.1 (one core) is the standard testing, and that is exactly in this range.
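As a check on the quoted match result, the Elo figure can be reproduced from the raw score. This is a minimal sketch that assumes the score line reads wins - losses - draws, uses the standard logistic Elo formula, and estimates the margin from the per-game variance at roughly 95% confidence. The function names are made up for illustration, and the margin method is an assumption about how the tool computes it, so the last digits may differ slightly.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_elo(wins, losses, draws, z=1.96):
    """Return (mean score, Elo difference, approximate +/- margin)."""
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - mean) ** 2
           + losses * (0.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2) / n
    half_width = z * math.sqrt(var / n)
    elo = elo_from_score(mean)
    # Assumed convention: half the Elo span of the score interval.
    margin = (elo_from_score(mean + half_width)
              - elo_from_score(mean - half_width)) / 2.0
    return mean, elo, margin

mean, elo, margin = match_elo(wins=37, losses=643, draws=320)
print(f"score {mean:.3f}, Elo {elo:.2f} +/- {margin:.2f}")
```

The score of 0.197 and the difference of about -244.1 Elo match the quoted output; the margin comes out close to the quoted +/- 19.05.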

lkaufman · Post by **lkaufman** » Fri Jan 17, 2020 11:50 pm

Laskos wrote: ↑Fri Jan 17, 2020 11:19 pm
Don't translate 1:1 computer self-Elos to computer-Human Elos. My claim is that SF at 5 kn/s at tournament time control is at least 2700 FIDE Elo human level. That is if they didn't screw up something in the range of 200k - 1,000k nodes per move, but I doubt they did it, as 10 + 0.1 (one core) is the standard testing, and that is exactly in this range.

That is not too far out of line with my experience and with the Erenburg match, if we substitute Komodo at 10 kn/s for SF at 5 kn/s, which is close enough. At 60 kn/s (effectively) Komodo performed over 2900 FIDE vs Erenburg despite always playing Black and 3 move opening book, so probably about 3000 without these handicaps, which is reasonably consistent with 2700 at 10 kn/s. My experience with Revelation tells me that it is way stronger than I am at 10 kn/s, although I would not guess as high as 2700. But perhaps so with a good opening book.

Laskos · Post by **Laskos** » Sat Jan 18, 2020 12:26 am

lkaufman wrote: ↑Fri Jan 17, 2020 11:50 pm
Laskos wrote: ↑Fri Jan 17, 2020 11:19 pm
Don't translate 1:1 computer self-Elos to computer-Human Elos. My claim is that SF at 5 kn/s at tournament time control is at least 2700 FIDE Elo human level. That is if they didn't screw up something in the range of 200k - 1,000k nodes per move, but I doubt they did it, as 10 + 0.1 (one core) is the standard testing, and that is exactly in this range.
That is not too far out of line with my experience and with the Erenburg match, if we substitute Komodo at 10 kn/s for SF at 5 kn/s, which is close enough. At 60 kn/s (effectively) Komodo performed over 2900 FIDE vs Erenburg despite always playing Black and 3 move opening book, so probably about 3000 without these handicaps, which is reasonably consistent with 2700 at 10 kn/s. My experience with Revelation tells me that it is way stronger than I am at 10 kn/s, although I would not guess as high as 2700. But perhaps so with a good opening book.

Yes, and a factor of 6 in nodes at that 2700-3000 FIDE human Elo level is almost surely less than 300 human Elo points, probably about 200. Again, that is the engine-engine versus engine-human difference, which at top GM level can be almost a factor of 2 Elo-wise.

Our inferences probably came to very similar conclusions.
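To put the numbers in this exchange on a common scale, a gain over a node ratio can be converted to Elo per doubling by dividing by log2 of the ratio. This is only an illustrative sketch: the helper name is hypothetical, it assumes the gain is spread evenly over the doublings (which the thread itself says is not true across ranges), and the self-play figure comes from a different node range than the human-play estimates.

```python
import math

def elo_per_doubling(elo_gain, node_ratio):
    """Average Elo per doubling of nodes, assuming an even spread."""
    return elo_gain / math.log2(node_ratio)

# Self-play: 244.10 Elo between 120k and 360k nodes per move (ratio 3).
print(round(elo_per_doubling(244.10, 3), 1))

# Human-scale estimates for a factor of 6 in nodes:
# 300 Elo (the 2700 vs 3000 figures) and about 200 Elo (the lower guess).
print(round(elo_per_doubling(300, 6), 1))
print(round(elo_per_doubling(200, 6), 1))
```

Under these assumptions the self-play figure is roughly 154 Elo per doubling, against roughly 77 to 116 per doubling on the human scale, which is in line with the "almost a factor of 2" remark.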

Raphexon · Post by **Raphexon** » Sun Jan 19, 2020 8:05 pm

Alayan wrote: ↑Fri Jan 17, 2020 10:35 pm Thank you Larry for your interesting input.

To complete my initial post, I ran SF11 @ 120knpm vs SF11 @ 360knpm. It appears that by eyeballing the doubling at 200 elo in that npm range I overestimated it somewhat. My data for SF from May is 180 elo from 130k to 260k and 130 elo from 260k to 520k.
Score of Stockfish 11 at 120knpm vs Stockfish 11 at 360knpm: 37 - 643 - 320  [0.197] 1000
Elo difference: -244.10 +/- 19.05
My main point stands, though.

Raphexon wrote: ↑Fri Jan 17, 2020 5:44 pm So this is basically some empirical evidence that my claim of SF at 100 n/s = GM level was right.
With +-2400 being at the low end of GM level.
No, that's wrong.

The strength loss is dramatic when reducing node count that much. Just see my graph at the beginning, at such low node counts a doubling makes a massive difference. SF at 15K nodes per move is in the ballpark of 800 elo weaker than SF at 300K nodes per move.

Stockfish's development is focused on improving strength in the 200K+ nodes per move range, and advanced search tricks will never be able to help when you have only a few thousand nodes per move in total. So I can predict that SF will never ever reach the GM level at 100n/s.

Ok, I realized I didn't delete CB-emu (MAME) after all, it has a ton of old chessboards and most of those have seen plenty of human action in the past and I hope this ELO list is fairly accurate.

https://www.schach-computer.info/wiki/i ... -Elo-Liste

I'm going to do an exhaustive test at 18knodes to see which SF is the strongest at 18knodes. (assuming 100 nps and 180 sec per move)
Then I will start doing tests against gradually stronger chess computers.

What would you say is a fair opening suite to do the test with?
I'm planning on using an opening suite that's popular in the Leela discord (Chad's opening suite iirc) but not yet decided on the depth.
But I'm leaning to a 6 ply opening suite.
I also have a 6 ply opening suite of games between GMs. (GrandPQRS-3moves-2358) maybe that one is a better idea.

Also in the case SF isn't as strong as I imagined I do need to move the goalpost, so how much ELO do you think a proper TC would give over fixed nodes?

Anyway, I'm 100% serious.
I'll try to have my first little tournament in the next few days, or otherwise the upcoming weekend.

bob · Post by **bob** » Sun Jan 19, 2020 8:38 pm

Jeroen has a bunch of opening collections that should work just fine for such a test.

However, the thread has devolved into multiple different ideas. 100nps is a different thing from 100 nodes total, just as 2K or 5K total nodes is much different from 2K to 5K nodes per second. All of the different measurements are introducing a lot of "fuzz" into the discussion.

I still don't believe that 1K or 5K total nodes per search is going to beat a GM at 3 minutes per move. I certainly remember that 25 or so years ago, searching 50-200K nodes per second was giving Grandmasters fits, even in long time controls. We played a 4xhuman vs 4xcomputer match years ago. Roman, Larry C, I don't recall the other two although for two of those games Karpov played in lieu of one of the other GMs not mentioned above. All 4 computers finished above all 4 GMs. Match was held on chess.net. I can't find any reference to it that includes games or players unfortunately. It was (to me) a surprising result. Including announcing a deep forced mate against Roman when Crafty started a series of forced trades that left the game in a won for Crafty KNN vs KP ending where we were using all the 3-4-5 piece Nalimov table bases. (match was a round robin, but not double games to swap colors for each opponent.)

Raphexon · Post by **Raphexon** » Sun Jan 19, 2020 8:42 pm

bob wrote: ↑Sun Jan 19, 2020 8:38 pm Jeroen has a bunch of opening collections that should work just fine for such a test.

However, the thread has devolved into multiple different ideas. 100nps is a different thing from 100 nodes total, just as 2K or 5K total nodes is much different from 2K to 5K nodes per second. All of the different measurements are introducing a lot of "fuzz" into the discussion.

I still don't believe that 1K or 5K total nodes per search is going to beat a GM at 3 minutes per move. I certainly remember that 25 or so years ago, searching 50-200K nodes per second was giving Grandmasters fits, even in long time controls. We played a 4xhuman vs 4xcomputer match years ago. Roman, Larry C, I don't recall the other two although for two of those games Karpov played in lieu of one of the other GMs not mentioned above. All 4 computers finished above all 4 GMs. Match was held on chess.net. I can't find any reference to it that includes games or players unfortunately. It was (to me) a surprising result. Including announcing a deep forced mate against Roman when Crafty started a series of forced trades that left the game in a won for Crafty KNN vs KP ending where we were using all the 3-4-5 piece Nalimov table bases. (match was a round robin, but not double games to swap colors for each opponent.)

The 18000 nodes per move I picked is the adjusted 100 nps for 3 minute per move.

Nobody has said that 1k or 5k nodes per move would be able to beat a GM at 3 min/move (or 40/120), that's ridiculous, that'd be between 5 and 28 nps. Nobody thinks SF is that strong. The 1k or 5k numbers were n/s (nodes per second, not move)
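The unit conversion behind these figures is simple enough to spell out. A small sketch, with helper names chosen here for illustration:

```python
def nodes_per_move(nodes_per_second, seconds_per_move):
    """Total nodes searched for one move at a given speed and think time."""
    return nodes_per_second * seconds_per_move

def implied_nps(nodes_per_move_total, seconds_per_move):
    """Speed implied by a per-move node total at a given think time."""
    return nodes_per_move_total / seconds_per_move

# 100 nps at 3 minutes (180 s) per move.
print(nodes_per_move(100, 180))

# 1k and 5k nodes per move at 180 s per move, expressed as nps.
print(round(implied_nps(1_000, 180), 1), round(implied_nps(5_000, 180), 1))
```

This gives 18000 nodes per move for the 100 nps case, and roughly 5.6 and 27.8 nps for the 1k and 5k per-move readings.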

Alayan · Post by **Alayan** » Sun Jan 19, 2020 11:13 pm

Raphexon wrote: ↑Sun Jan 19, 2020 8:05 pm I'm going to do an exhaustive test at 18knodes to see which SF is the strongest at 18knodes. (assuming 100 nps and 180 sec per move)
Then I will start doing tests against gradually stronger chess computers.

What would you say is a fair opening suite to do the test with?
I'm planning on using an opening suite that's popular in the Leela discord (Chad's opening suite iirc) but not yet decided on the depth.
But I'm leaning to a 6 ply opening suite.
I also have a 6 ply opening suite of games between GMs. (GrandPQRS-3moves-2358) maybe that one is a better idea.

Also in the case SF isn't as strong as I imagined I do need to move the goalpost, so how much ELO do you think a proper TC would give over fixed nodes?

Anyway, I'm 100% serious.
I'll try to have my first little tournament in the next few days, or otherwise the upcoming weekend.

Let us know about your results.

For the openings, you want a few thousand book exits, because fixed nodes is deterministic and repeated games hold no informative value; they only pollute the results.

I have not done a reliable measurement of the difference between proper TC/TM and fixed nodes, but I'd guess it to be not far from 100 elo at bullet, for a similar total time.
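The point about book exits can be sketched as a scheduling rule: draw openings without replacement so that no game is played twice. The helper below is hypothetical and not tied to any particular tournament manager; playing each opening once with each colour assignment is an added assumption, not something stated above.

```python
import random

def schedule_pairs(openings, n_openings, seed=0):
    """Pick n distinct openings and play each once per colour assignment."""
    unique = list(dict.fromkeys(openings))  # drop duplicate book exits
    if n_openings > len(unique):
        raise ValueError("not enough distinct openings for a repeat-free match")
    rng = random.Random(seed)
    chosen = rng.sample(unique, n_openings)  # sampling without replacement
    games = []
    for opening in chosen:
        games.append((opening, "engine_A", "engine_B"))  # A plays White
        games.append((opening, "engine_B", "engine_A"))  # colours swapped
    return games

book = [f"line{i}" for i in range(50)] + ["line0"]  # placeholder openings
games = schedule_pairs(book, n_openings=20, seed=1)
print(len(games), len(set(games)))
```

With deterministic engines, every (opening, White, Black) triple in such a schedule yields a game that has not been seen before, so each result carries new information.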

Ovyron · Post by **Ovyron** » Mon Jan 20, 2020 4:49 am

What is the meaning of "fixed nodes per move", though? Because when you force the engine to play like that, you're forcing it to abort the search at some random point of the current iteration, and then I'm not sure what the best strategy is for choosing a move.

Suppose it's in the middle of the main move failing low: does it still play it and ignore the fail low? Or does it instead play a move that it knows is currently better (but that could be worse after resolving the fail low)? What if it's in the middle of an alternative move failing high? Does it still play the main move even though it currently has a score that is worse than the alternative? Or does it switch to the move failing high, even though once the fail high is resolved it might end up worse than the other one?

What about ignoring all this and just playing the move that was known to be best at the end of the previous iteration? But this would waste nodes...

So I don't think engines are designed to abort the search and pick a move before the current iteration is over, and I wonder if we'd get much better results if they were designed with fixed nodes per second in mind, and whether it'd be more fruitful to test them at fixed depth.
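One of the options above, playing the best move from the last completed iteration, can be sketched on a toy tree. This is not how Stockfish or any real engine handles a node limit; the tree, the evaluation and all names are invented for illustration, and there are no aspiration windows here, so fail highs and fail lows do not arise.

```python
class OutOfNodes(Exception):
    """Raised when the node budget is exhausted in mid-iteration."""

BRANCHING = 3  # every toy position has exactly three moves

def evaluate(path):
    """Deterministic toy evaluation from the side to move's point of view."""
    h = 0
    for move in path:
        h = (h * 31 + move + 7) % 1009
    return h - 504

def negamax(path, depth, counter, budget):
    counter[0] += 1
    if counter[0] > budget:
        raise OutOfNodes
    if depth == 0:
        return evaluate(path)
    return max(-negamax(path + (m,), depth - 1, counter, budget)
               for m in range(BRANCHING))

def search_fixed_nodes(budget, max_depth=8):
    """Iterative deepening that discards an unfinished iteration."""
    counter = [0]
    best_completed = None  # (depth, move, score) of the last full iteration
    for depth in range(1, max_depth + 1):
        try:
            scored = [(-negamax((m,), depth - 1, counter, budget), m)
                      for m in range(BRANCHING)]
        except OutOfNodes:
            break  # partial results of this iteration are thrown away
        score, move = max(scored)
        best_completed = (depth, move, score)
    return best_completed, min(counter[0], budget)

print(search_fixed_nodes(15))
print(search_fixed_nodes(53))
```

In this toy, depths 1, 2 and 3 complete after 3, 15 and 54 nodes in total, so budgets of 15 and 53 both return the depth-2 move, and the second one throws away 38 nodes of unfinished work. That is the waste the post refers to.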

Uri Blass · Post by **Uri Blass** » Mon Jan 20, 2020 6:53 am

Ovyron wrote: ↑Mon Jan 20, 2020 4:49 am What is the meaning of "fixed nodes per move", though? Because when you force the engine to play like that, you're forcing it to abort the search at some random point of the current iteration, and then I'm not sure what the best strategy is for choosing a move.

Suppose it's in the middle of the main move failing low: does it still play it and ignore the fail low? Or does it instead play a move that it knows is currently better (but that could be worse after resolving the fail low)? What if it's in the middle of an alternative move failing high? Does it still play the main move even though it currently has a score that is worse than the alternative? Or does it switch to the move failing high, even though once the fail high is resolved it might end up worse than the other one?

What about ignoring all this and just playing the move that was known to be best at the end of the previous iteration? But this would waste nodes...

So I don't think engines are designed to abort the search and pick a move before the current iteration is over, and I wonder if we'd get much better results if they were designed with fixed nodes per second in mind, and whether it'd be more fruitful to test them at fixed depth.

Engines are certainly not designed for fixed nodes per move, but there are cases where the engine does not have enough time to resolve a fail high, and it certainly needs to make some decision in that case, ideally an optimal one (assuming testing has been done on it).

The question is also whether fixed nodes per move has to be exactly 18K nodes per move, or whether the engine is allowed to use at most 18K nodes at move 1 and carry any unused nodes over to the next move. With 18K nodes per move, the engine could then use at most 18k*n nodes for the first n moves, spending less than 18K nodes on some moves in order to spend more than 18K nodes on others.
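The carry-over rule described here amounts to a cumulative cap: after n moves the engine may have spent at most 18k*n nodes in total. A minimal sketch of that accounting, with a made-up class name and no claim that any GUI or engine implements it this way:

```python
class NodeBank:
    """Cumulative node allowance: at most per_move * n nodes over n moves."""

    def __init__(self, per_move=18_000):
        self.per_move = per_move
        self.moves_started = 0
        self.nodes_spent = 0

    def start_move(self):
        """Begin a new move and return the most nodes it may use."""
        self.moves_started += 1
        return self.per_move * self.moves_started - self.nodes_spent

    def finish_move(self, nodes_used):
        """Record what the move used; anything unused stays in the bank."""
        self.nodes_spent += nodes_used

bank = NodeBank()
limit_1 = bank.start_move()   # 18000 available on move 1
bank.finish_move(10_000)      # an easy move, 8000 nodes saved
limit_2 = bank.start_move()   # 18000 plus the 8000 carried over
print(limit_1, limit_2)
```

How the engine decides when to save or spend is left open here; that decision is essentially time management expressed in nodes.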

Ovyron · Post by **Ovyron** » Mon Jan 20, 2020 7:25 am

Uri Blass wrote: ↑Mon Jan 20, 2020 6:53 am The question is also whether fixed nodes per move has to be exactly 18K nodes per move, or whether the engine is allowed to use at most 18K nodes at move 1 and carry any unused nodes over to the next move. With 18K nodes per move, the engine could then use at most 18k*n nodes for the first n moves, spending less than 18K nodes on some moves in order to spend more than 18K nodes on others.

Yes, but then why not just use some microbullet time control that averages 18K nodes per move? Just let the engine decide by itself whether it needs less or more. I'd expect that an engine that averages 18K nodes per move would greatly outperform a fixed one.
