Testing LazySMP

hgm · Post by **hgm** » Wed Oct 30, 2024 4:11 pm

No, I am talking about the entire tree. Looking at a single depth is misleading, because the number of moves you search directly affecs the number of nodes at the deeper levels. Per node you always only do a single make/unmake (in an engine search). Namely of the move leading to that node. And (in the absence of stand-pat, i.e. if futility pruning works perfectly) you also do a move generation. How many moves of that list you use doesn't affect this 1:1 ratio in the tree.

Ras · Post by **Ras** » Wed Oct 30, 2024 6:46 pm

If you examine only one move at a given depth level and then go up again, you get one node at that depth level. If you want two nodes to be counted, you need to go to another part of the tree and then run move gen again. If however you examine two nodes per depth level, then you get two nodes counting towards NPS out of one move generation. Calling move gen once is faster than calling it twice, hence more NPS.

hgm · Post by **hgm** » Wed Oct 30, 2024 8:09 pm

I am still not getting through to you.

You do your counting on a selected part of the tree, arbitrarily deciding which move generations and which makemoves you want to include in that part.

But if the ratio of makemoves and movegens in that part is not the same as in the entire tree, the chosen part is not representative for the tree, right? It is of course possible to select parts of the tree that are representative (e.g. some sub-tree). But none of those you discussed are. The only way to make sure the selection doesn't distort the stats is avoid making a selection.

So look at the entire tree. There are as many makemoves as nodes (minus 1), because every makemove leads to a node, and every node (except the root) is reached by one makemove.

Look at the code snippets:

for(SOME_MOVES) {
MakeMove()
Search()
UnMake()
}

Equally many calls to MakeMove, UnMake and Search. And Search calls MoveGen once, (and increments nodeCount once), so as many of those as well. Nothing in this analysis depends on how often you execute the loop (i.e. on the branching ratio).

Ras · Post by **Ras** » Wed Oct 30, 2024 9:29 pm

hgm wrote: ↑Wed Oct 30, 2024 8:09 pmSo look at the entire tree. There are as many makemoves as nodes (minus 1), because every makemove leads to a node, and every node (except the root) is reached by one makemove.

That's true, but make/unmake is move execution, not move generation. If you ever only examine one move on each depth level, then you have as many nodes as you have move generations. That ratio can change. I thought that by move generation and the move generator, we're speaking about the code that prepares a list of legal or pseudo-legal moves available in a node, but doesn't execute any of them.

There can be a lot of moves that are generated and examined, but never executed. In my quiescence, about 50% of the generated moves are somehow examined in a full-board early middlegame position. 30% of the generated moves are actually executed. Which also means that only 60% of the examined moves are executed. The other 40% are evaluated without making them - so not only don't they have make/unmake, they also don't lead to a new node with new move generation call, and they don't contribute to the NPS counter. Stuff like delta pruning, recapture only on same square after a certain quiescence depth, mini-SEE eliminate the need to do these moves.

LazySMP · Post by **LazySMP** » Thu Oct 31, 2024 2:44 am

LazySMP perft speed is about 70MNPS with do/undo the move at the last ply.

Code: Select all

Perft result for depth: 6  119060324 Time in ms: 1697  Nodes per sec: 70159295

LazySMP perft speed is about 600MNPS without do/undo the move at the last ply.

Code: Select all

Perft result for depth: 6  119060324 Time in ms: 201  Nodes per sec: 592339920

Do these results convince you that my engine is the fastest chess move generator in the world?

hgm · Post by **hgm** » Thu Oct 31, 2024 7:42 am

Ras wrote: ↑Wed Oct 30, 2024 9:29 pm That's true, but make/unmake is move execution, not move generation. If you ever only examine one move on each depth level, then you have as many nodes as you have move generations. That ratio can change. I thought that by move generation and the move generator, we're speaking about the code that prepares a list of legal or pseudo-legal moves available in a node, but doesn't execute any of them.

There can be a lot of moves that are generated and examined, but never executed. In my quiescence, about 50% of the generated moves are somehow examined in a full-board early middlegame position. 30% of the generated moves are actually executed. Which also means that only 60% of the examined moves are executed. The other 40% are evaluated without making them - so not only don't they have make/unmake, they also don't lead to a new node with new move generation call, and they don't contribute to the NPS counter. Stuff like delta pruning, recapture only on same square after a certain quiescence depth, mini-SEE eliminate the need to do these moves.

With move generation I am indeed referring to the preparation of the list of (pseudo-)legal moves.

You first paragraph contradicts your second. If you want to make a distinction between examining and executing, then examining would not produce a new node. And you are still using the meaningless concept of 'depth level', which pretty much would invalidate anything you say anyway. If at every of your 'depth levels' 20 moves would be executed, instead of 1, there would still be as many move generations as makemoves. How many moves you 'examine' without executing neither affects the number of move generations nor the number of executions.

And yes, engines do other things than move making and generation: sort-key-calculation, move sorting, collection of tree statistics, evaluation.... None of which are done by perft. So I don't see the relevance for this discussion. If your engine is slow because it uses a huge NN for evaluation, or subjects too many moves to a slow SEE, it does in no way refute the claim that it has fast move generation.

None of this has any bearing on the fact that a perft that does bulk counting weights the contribution of generation and move making the same as an engine, while perfts that make/unmake the last ply overweight make/unmake by a factor 35 or so. And thus appear unrealistically fast, as making/unmaking a single move is usually an order of magnitude faster than generating all moves.

Brunetti · Post by **Brunetti** » Thu Oct 31, 2024 9:30 am

LazySMP wrote: ↑Thu Oct 31, 2024 2:44 am Do these results convince you that my engine is the fastest chess move generator in the world?

No. This is much faster.

Alex

Ras · Post by **Ras** » Thu Oct 31, 2024 11:06 am

LazySMP wrote: ↑Thu Oct 31, 2024 2:44 amLazySMP perft speed is about 70MNPS with do/undo the move at the last ply.
LazySMP perft speed is about 600MNPS without do/undo the move at the last ply.

On my HW, even Stockfish gets "only" around 290MNPS, so yours is about twice as fast. However, do you already use MVV/LVA in move generation, evaluation of history, and update Zobrist keys in make/unmake? Mobility detection (though with NNUE, that would probably be in the eval instead)?

Do these results convince you that my engine is the fastest chess move generator in the world?

It's impressive, no doubt, but there have been faster ones than that.

Ras · Post by **Ras** » Thu Oct 31, 2024 11:24 am

hgm wrote: ↑Thu Oct 31, 2024 7:42 amYou first paragraph contradicts your second.

I disagree.

If you want to make a distinction between examining and executing, then examining would not produce a new node.

Examining, as I put it, can be through executing, producing a new node, or without executing, not producing a new node.

And you are still using the meaningless concept of 'depth level'

It's not meaningless because re-using an existing move list is obviously less work. It doesn't seem like we'll reach a consensus here, so I'll leave it at that.

How many moves you 'examine' without executing neither affects the number of move generations nor the number of executions.

It does, however, impact the measured NPS. That's the point. It involves activities, i.e. costs computing time, without increasing the NPS counter. While it saves search time, it drags down NPS.

And yes, engines do other things than move making and generation: sort-key-calculation, move sorting, collection of tree statistics, evaluation.... None of which are done by perft.

Ofc they are done in perft becvause the whole point of perft is debugging and benchmarking the very code that the engine is using. Having a dedicated perft move generator and make/unmake largely defeats the point of even having perft in the engine.

None of this has any bearing on the fact that a perft that does bulk counting weights the contribution of generation and move making the same as an engine, while perfts that make/unmake the last ply overweight make/unmake by a factor 35 or so. And thus appear unrealistically fast, as making/unmaking a single move is usually an order of magnitude faster than generating all moves.

Bulk counting eliminates make/unmake at the leaves. Not doing work is always much faster than doing work. That's why bulk counting is faster.

LazySMP · Post by **LazySMP** » Thu Oct 31, 2024 9:59 pm

Brunetti wrote: ↑Thu Oct 31, 2024 9:30 am No. This is much faster.

This project is only designed for perft and I don't think it can be used for chess engine. In addition, the CPU used is overclocked.

Ras wrote: ↑Thu Oct 31, 2024 11:06 am However, do you already use MVV/LVA in move generation, evaluation of history, and update Zobrist keys in make/unmake?

Yes, I use Zobrist keys in make/unmake move.

hgm wrote: ↑Thu Oct 31, 2024 7:42 am None of this has any bearing on the fact that a perft that does bulk counting weights the contribution of generation and move making the same as an engine, while perfts that make/unmake the last ply overweight make/unmake by a factor 35 or so. And thus appear unrealistically fast, as making/unmaking a single move is usually an order of magnitude faster than generating all moves.

I think that these discussions may be fruitless because the speed of generating moves has an inverse relationship with the speed of making moves. Engines that have a faster move generator have a low speed of make/unmake move, and engines that have a high speed of make/unmake move, have a low speed of move generator.

Testing LazySMP

Re: New engine: LazySMP

Re: New engine: LazySMP

Re: New engine: LazySMP

Re: New engine: LazySMP

Re: New engine: LazySMP

Re: New engine: LazySMP

Re: New engine: LazySMP

Re: New engine: LazySMP

Re: New engine: LazySMP

Re: New engine: LazySMP