Testing LazySMP
Moderator: Ras
-
- Posts: 28342
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: New engine: LazySMP
No, I am talking about the entire tree. Looking at a single depth is misleading, because the number of moves you search directly affecs the number of nodes at the deeper levels. Per node you always only do a single make/unmake (in an engine search). Namely of the move leading to that node. And (in the absence of stand-pat, i.e. if futility pruning works perfectly) you also do a move generation. How many moves of that list you use doesn't affect this 1:1 ratio in the tree.
-
- Posts: 2695
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: New engine: LazySMP
If you examine only one move at a given depth level and then go up again, you get one node at that depth level. If you want two nodes to be counted, you need to go to another part of the tree and then run move gen again. If however you examine two nodes per depth level, then you get two nodes counting towards NPS out of one move generation. Calling move gen once is faster than calling it twice, hence more NPS.
Rasmus Althoff
https://www.ct800.net
https://www.ct800.net
-
- Posts: 28342
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: New engine: LazySMP
I am still not getting through to you.
You do your counting on a selected part of the tree, arbitrarily deciding which move generations and which makemoves you want to include in that part.
But if the ratio of makemoves and movegens in that part is not the same as in the entire tree, the chosen part is not representative for the tree, right? It is of course possible to select parts of the tree that are representative (e.g. some sub-tree). But none of those you discussed are. The only way to make sure the selection doesn't distort the stats is avoid making a selection.
So look at the entire tree. There are as many makemoves as nodes (minus 1), because every makemove leads to a node, and every node (except the root) is reached by one makemove.
Look at the code snippets:
for(SOME_MOVES) {
MakeMove()
Search()
UnMake()
}
Equally many calls to MakeMove, UnMake and Search. And Search calls MoveGen once, (and increments nodeCount once), so as many of those as well. Nothing in this analysis depends on how often you execute the loop (i.e. on the branching ratio).
You do your counting on a selected part of the tree, arbitrarily deciding which move generations and which makemoves you want to include in that part.
But if the ratio of makemoves and movegens in that part is not the same as in the entire tree, the chosen part is not representative for the tree, right? It is of course possible to select parts of the tree that are representative (e.g. some sub-tree). But none of those you discussed are. The only way to make sure the selection doesn't distort the stats is avoid making a selection.
So look at the entire tree. There are as many makemoves as nodes (minus 1), because every makemove leads to a node, and every node (except the root) is reached by one makemove.
Look at the code snippets:
for(SOME_MOVES) {
MakeMove()
Search()
UnMake()
}
Equally many calls to MakeMove, UnMake and Search. And Search calls MoveGen once, (and increments nodeCount once), so as many of those as well. Nothing in this analysis depends on how often you execute the loop (i.e. on the branching ratio).
-
- Posts: 2695
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: New engine: LazySMP
That's true, but make/unmake is move execution, not move generation. If you ever only examine one move on each depth level, then you have as many nodes as you have move generations. That ratio can change. I thought that by move generation and the move generator, we're speaking about the code that prepares a list of legal or pseudo-legal moves available in a node, but doesn't execute any of them.
There can be a lot of moves that are generated and examined, but never executed. In my quiescence, about 50% of the generated moves are somehow examined in a full-board early middlegame position. 30% of the generated moves are actually executed. Which also means that only 60% of the examined moves are executed. The other 40% are evaluated without making them - so not only don't they have make/unmake, they also don't lead to a new node with new move generation call, and they don't contribute to the NPS counter. Stuff like delta pruning, recapture only on same square after a certain quiescence depth, mini-SEE eliminate the need to do these moves.
Rasmus Althoff
https://www.ct800.net
https://www.ct800.net
Re: New engine: LazySMP
LazySMP perft speed is about 70MNPS with do/undo the move at the last ply.
LazySMP perft speed is about 600MNPS without do/undo the move at the last ply.
Do these results convince you that my engine is the fastest chess move generator in the world?
Code: Select all
Perft result for depth: 6 119060324 Time in ms: 1697 Nodes per sec: 70159295
Code: Select all
Perft result for depth: 6 119060324 Time in ms: 201 Nodes per sec: 592339920
-
- Posts: 28342
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: New engine: LazySMP
With move generation I am indeed referring to the preparation of the list of (pseudo-)legal moves.Ras wrote: ↑Wed Oct 30, 2024 9:29 pm That's true, but make/unmake is move execution, not move generation. If you ever only examine one move on each depth level, then you have as many nodes as you have move generations. That ratio can change. I thought that by move generation and the move generator, we're speaking about the code that prepares a list of legal or pseudo-legal moves available in a node, but doesn't execute any of them.
There can be a lot of moves that are generated and examined, but never executed. In my quiescence, about 50% of the generated moves are somehow examined in a full-board early middlegame position. 30% of the generated moves are actually executed. Which also means that only 60% of the examined moves are executed. The other 40% are evaluated without making them - so not only don't they have make/unmake, they also don't lead to a new node with new move generation call, and they don't contribute to the NPS counter. Stuff like delta pruning, recapture only on same square after a certain quiescence depth, mini-SEE eliminate the need to do these moves.
You first paragraph contradicts your second. If you want to make a distinction between examining and executing, then examining would not produce a new node. And you are still using the meaningless concept of 'depth level', which pretty much would invalidate anything you say anyway. If at every of your 'depth levels' 20 moves would be executed, instead of 1, there would still be as many move generations as makemoves. How many moves you 'examine' without executing neither affects the number of move generations nor the number of executions.
And yes, engines do other things than move making and generation: sort-key-calculation, move sorting, collection of tree statistics, evaluation.... None of which are done by perft. So I don't see the relevance for this discussion. If your engine is slow because it uses a huge NN for evaluation, or subjects too many moves to a slow SEE, it does in no way refute the claim that it has fast move generation.
None of this has any bearing on the fact that a perft that does bulk counting weights the contribution of generation and move making the same as an engine, while perfts that make/unmake the last ply overweight make/unmake by a factor 35 or so. And thus appear unrealistically fast, as making/unmaking a single move is usually an order of magnitude faster than generating all moves.
-
- Posts: 424
- Joined: Tue Dec 08, 2009 1:37 pm
- Location: Milan, Italy
- Full name: Alex Brunetti
-
- Posts: 2695
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: New engine: LazySMP
On my HW, even Stockfish gets "only" around 290MNPS, so yours is about twice as fast. However, do you already use MVV/LVA in move generation, evaluation of history, and update Zobrist keys in make/unmake? Mobility detection (though with NNUE, that would probably be in the eval instead)?
It's impressive, no doubt, but there have been faster ones than that.Do these results convince you that my engine is the fastest chess move generator in the world?
Rasmus Althoff
https://www.ct800.net
https://www.ct800.net
-
- Posts: 2695
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: New engine: LazySMP
I disagree.
Examining, as I put it, can be through executing, producing a new node, or without executing, not producing a new node.If you want to make a distinction between examining and executing, then examining would not produce a new node.
It's not meaningless because re-using an existing move list is obviously less work. It doesn't seem like we'll reach a consensus here, so I'll leave it at that.And you are still using the meaningless concept of 'depth level'
It does, however, impact the measured NPS. That's the point. It involves activities, i.e. costs computing time, without increasing the NPS counter. While it saves search time, it drags down NPS.How many moves you 'examine' without executing neither affects the number of move generations nor the number of executions.
Ofc they are done in perft becvause the whole point of perft is debugging and benchmarking the very code that the engine is using. Having a dedicated perft move generator and make/unmake largely defeats the point of even having perft in the engine.And yes, engines do other things than move making and generation: sort-key-calculation, move sorting, collection of tree statistics, evaluation.... None of which are done by perft.
Bulk counting eliminates make/unmake at the leaves. Not doing work is always much faster than doing work. That's why bulk counting is faster.None of this has any bearing on the fact that a perft that does bulk counting weights the contribution of generation and move making the same as an engine, while perfts that make/unmake the last ply overweight make/unmake by a factor 35 or so. And thus appear unrealistically fast, as making/unmaking a single move is usually an order of magnitude faster than generating all moves.
Rasmus Althoff
https://www.ct800.net
https://www.ct800.net
Re: New engine: LazySMP
This project is only designed for perft and I don't think it can be used for chess engine. In addition, the CPU used is overclocked.
Yes, I use Zobrist keys in make/unmake move.
I think that these discussions may be fruitless because the speed of generating moves has an inverse relationship with the speed of making moves. Engines that have a faster move generator have a low speed of make/unmake move, and engines that have a high speed of make/unmake move, have a low speed of move generator.hgm wrote: ↑Thu Oct 31, 2024 7:42 am None of this has any bearing on the fact that a perft that does bulk counting weights the contribution of generation and move making the same as an engine, while perfts that make/unmake the last ply overweight make/unmake by a factor 35 or so. And thus appear unrealistically fast, as making/unmaking a single move is usually an order of magnitude faster than generating all moves.