NNOM++ - Move Ordering Neural Networks?


dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: NNOM++ - Move Ordering Neural Networks?

Post by dangi12012 »

To be more accurate: "interestingness" in this context means ordering moves such that they are likely to force alpha–beta cutoffs.
Here is the important part:

If your algorithm contains search extensions and pruning techniques, it does not matter, since the network is trained (with your set of extensions in place) to maximize additional cutoffs.
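
Roughly what I have in mind, as a minimal sketch (placeholder Move/Position types and a dummy scoreMove(), not taken from any particular engine): the network is only consulted to sort the move list, and the alpha-beta search itself stays untouched.

#include <algorithm>
#include <vector>

// Placeholder types; a real engine has its own move/board representations.
struct Move     { int from, to; };
struct Position { /* board state */ };

// Stand-in for the move-ordering network: the score is meant to approximate
// "how likely is this move to cause a beta cutoff here". In practice this
// would be a small NN inference over features of (pos, m); here it is a
// dummy so the sketch compiles.
float scoreMove(const Position& pos, const Move& m)
{
    (void)pos; (void)m;
    return 0.0f;
}

// Sort the move list so the most promising cutoff candidates are searched
// first; everything else about the search is unchanged.
void orderMoves(const Position& pos, std::vector<Move>& moves)
{
    std::stable_sort(moves.begin(), moves.end(),
                     [&](const Move& a, const Move& b) {
                         return scoreMove(pos, a) > scoreMove(pos, b);
                     });
}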
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
j.t.
Posts: 263
Joined: Wed Jun 16, 2021 2:08 am
Location: Berlin
Full name: Jost Triller

Re: NNOM++ - Move Ordering Neural Networks?

Post by j.t. »

dangi12012 wrote: Mon Jul 25, 2022 8:19 pm To be more accurate: "interestingness" in this context means ordering moves such that they are likely to force alpha–beta cutoffs.
But isn't that pretty much the same as evaluating how good a position is? The best chance to get an early cutoff is to search the best move first. So I am not sure I understand you correctly; what do you mean by the difference between "interestingness" and "eval"?
PK
Posts: 905
Joined: Mon Jan 15, 2007 11:23 am
Location: Warszawa

Re: NNOM++ - Move Ordering Neural Networks?

Post by PK »

j.t. wrote: But isn't that pretty much the same as evaluating how good a position is? The best chance to get an early cutoff is to search the best move first.


No. The best chance is to search a move that causes a cutoff while searching the fewest nodes. It does not matter how large the margin of the cutoff is. Statistically, it pays off to get a cutoff by playing a capture or another forcing move, since that reduces the size of the search tree. The question is whether a network that reduces the size of an alpha-beta search tree would be different from a network that works within an MCTS search.
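
For scale, there is the classic Knuth/Moore result: with branching factor b and depth d, a perfectly ordered alpha-beta tree examines roughly b^ceil(d/2) + b^floor(d/2) - 1 leaves, against b^d for plain minimax. With b = 35 and d = 8 that is about 3 million leaves versus over 2 trillion, which is why getting a cutoff move searched first matters far more than how big the margin of the cutoff is.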

Actually, from my experience with an MCTS-based bot playing the game of Go, MCTS likes forcing moves even when they are pretty bad. A sequence of forcing move, forced reply, non-forcing move can randomly score higher than the same non-forcing move played right away.
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: NNOM++ - Move Ordering Neural Networks?

Post by dangi12012 »

One would think that you have to search the best move first, but the separation of eval and policy networks yields more strength.

That means having one network that picks moves so that an AB cutoff is more likely, and one that evaluates a quiet position.

I wonder if that is optimal; maybe we need machine learning (or trial and error) for the topology as well. Maybe also an aggressive pruning network that just guesses what can be pruned, using all the inputs that the normal pruning heuristics use.
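
A rough sketch of what such a pruning net could look like (all names and the tiny linear-plus-sigmoid stand-in are hypothetical, just to make the idea concrete): feed it the same scalars the hand-written pruning heuristics already look at, and skip the move when its output clears a tuned threshold.

#include <cmath>

// The same inputs the usual pruning heuristics consume.
struct PruneFeatures {
    float depth;          // remaining depth
    float evalMinusBeta;  // static eval relative to beta
    float moveNumber;     // index in the ordered move list
    float historyScore;   // history/killer statistics
    float inCheck;        // 1.0 if the side to move is in check
};

// Stand-in for a tiny network: a single linear layer plus sigmoid so the
// sketch compiles; a trained version would be a small MLP with 2-3 layers.
float pruneProbability(const PruneFeatures& f)
{
    const float w[5] = { -0.3f, 0.8f, 0.2f, -0.1f, -0.5f }; // made-up weights
    const float z = w[0] * f.depth + w[1] * f.evalMinusBeta + w[2] * f.moveNumber
                  + w[3] * f.historyScore + w[4] * f.inCheck;
    return 1.0f / (1.0f + std::exp(-z)); // probability-like output in [0, 1]
}

// Inside the search: skip the move if the net is confident enough.
bool shouldPrune(const PruneFeatures& f)
{
    return pruneProbability(f) > 0.9f; // threshold would need tuning
}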
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
smatovic
Posts: 3254
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: NNOM++ - Move Ordering Neural Networks?

Post by smatovic »

I see three branches here:

- An Lc0-style network with two heads, a policy head as move predictor and a head for evaluation (see the sketch below).
- A distinct neural network for move ordering to increase beta cutoffs, plus a neural network for the pruning/reduction heuristic.
- The same neural network for evaluation, move ordering and the pruning/reduction heuristic, distinguished by quiet/non-quiet positions/moves.

As I said, the last one was my naive approach. I don't know whether it is feasible, but I would prefer it, considering the clock cycles you have to invest in NN inference.
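
To make the first branch concrete, here is a very rough shape of a two-headed forward pass (hypothetical layer sizes and stubbed weights, nothing taken from Lc0 itself): one shared trunk, then a policy vector for move ordering and a single scalar for evaluation, so one inference pays for both.

#include <array>
#include <vector>

constexpr int kInputs = 768;   // e.g. piece-square planes, flattened
constexpr int kHidden = 256;   // shared trunk width
constexpr int kPolicy = 1858;  // e.g. one logit per encodable move

struct TwoHeadNet {
    // Weight matrices omitted; the methods are stubbed so the sketch compiles.
    std::vector<float> trunk(const std::array<float, kInputs>&) const {
        return std::vector<float>(kHidden, 0.0f);
    }
    std::vector<float> policyHead(const std::vector<float>&) const {
        return std::vector<float>(kPolicy, 0.0f); // move-ordering logits
    }
    float valueHead(const std::vector<float>&) const {
        return 0.0f; // scalar evaluation
    }
};

// One forward pass serves both move ordering (policy) and evaluation (value).
void evaluate(const TwoHeadNet& net, const std::array<float, kInputs>& input,
              std::vector<float>& policyOut, float& valueOut)
{
    const std::vector<float> hidden = net.trunk(input); // shared computation
    policyOut = net.policyHead(hidden);
    valueOut  = net.valueHead(hidden);
}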

--
Srdja
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: NNOM++ - Move Ordering Neural Networks?

Post by dangi12012 »

smatovic wrote: Wed Jul 27, 2022 3:20 pm - The same neural network for evaluation, move ordering and the pruning/reduction heuristic, distinguished by quiet/non-quiet positions/moves.
Double the size, quadruple the work. So it's best to have specialized networks that are very small, to stay effective per clock cycle.
The next step is removing weights that are very close to zero. That makes sparse matrix arithmetic possible, which can be faster yet again.

If you are interested, PM me; I am deep in that field at the moment. It's all just general matrix multiplication, which makes it so appealing: fast, and it learns all the nuances that a human will miss when writing heuristics by hand.
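
On the near-zero-weight point, a minimal illustration (generic compressed-sparse-row storage, not tied to any engine or library): drop weights whose magnitude is below a threshold once after training, keep the rest in CSR form, and the per-layer work shrinks to the surviving entries.

#include <cmath>
#include <vector>

// Compressed sparse row storage for a weight matrix that is mostly zeros.
struct CsrMatrix {
    int rows = 0, cols = 0;
    std::vector<float> values;  // nonzero weights
    std::vector<int>   colIdx;  // column index of each nonzero
    std::vector<int>   rowPtr;  // start offset of each row (size rows + 1)
};

// Build the sparse form once, after training, by dropping near-zero weights.
CsrMatrix sparsify(const std::vector<std::vector<float>>& dense, float eps)
{
    CsrMatrix m;
    m.rows = static_cast<int>(dense.size());
    m.cols = m.rows ? static_cast<int>(dense[0].size()) : 0;
    m.rowPtr.push_back(0);
    for (const auto& row : dense) {
        for (int c = 0; c < static_cast<int>(row.size()); ++c) {
            if (std::fabs(row[c]) > eps) {
                m.values.push_back(row[c]);
                m.colIdx.push_back(c);
            }
        }
        m.rowPtr.push_back(static_cast<int>(m.values.size()));
    }
    return m;
}

// Sparse matrix-vector product: cost is proportional to the nonzero count,
// not rows * cols as in the dense general matrix multiply.
std::vector<float> multiply(const CsrMatrix& m, const std::vector<float>& x)
{
    std::vector<float> y(m.rows, 0.0f);
    for (int r = 0; r < m.rows; ++r)
        for (int k = m.rowPtr[r]; k < m.rowPtr[r + 1]; ++k)
            y[r] += m.values[k] * x[m.colIdx[k]];
    return y;
}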
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer