Board adaptive / tuning evaluation function - no NN/AI


Tony P.
Posts: 216
Joined: Sun Jan 22, 2017 8:30 pm
Location: Russia

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by Tony P. »

YUFe wrote: Wed Jan 15, 2020 2:07 pm Has anyone used a NN for hashing or board state representation/encoding?
It has been done in MuZero*. I like the approach a lot and think Mu0 would have grown a lot stronger than Alpha0 if DeepMind hadn't switched it from chess to tasks that interested the team more.

Another piece on this topic (albeit on reinforcement learning in general, not chess) that I like is Sourabh Bose's PhD dissertation.

* That said, I do believe that a chess board representation could use far fewer parameters and still be accurate enough.
Tony P.
Posts: 216
Joined: Sun Jan 22, 2017 8:30 pm
Location: Russia

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by Tony P. »

OK, there's no need to read that dissertation - there's so much literature available that I can't be sure which method is optimal; I'd need to use RL techniques to learn to explore the space of RL papers more efficiently :lol:

Here's a possibly more interesting paper: Measuring Structural Similarities in Finite MDPs. If applied to chess, it would produce a metric that would treat positions with more similar sets of (pseudo)legal moves (identified e.g. by the to- and from-squares and the piece types of promotions) as closer to each other than those with less similar sets of moves.
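For illustration, here's a toy version of that idea (Python, using the python-chess library; the Jaccard overlap of move sets is my crude stand-in for the paper's actual bisimulation-style metric):

Code: Select all

# Toy stand-in for the paper's metric: compare two positions by the overlap
# of their pseudo-legal move sets, each move identified by its from-square,
# to-square and promotion piece type.
import chess

def move_signature(board: chess.Board) -> set:
    return {(m.from_square, m.to_square, m.promotion)
            for m in board.pseudo_legal_moves}

def move_set_similarity(a: chess.Board, b: chess.Board) -> float:
    """Jaccard similarity: 1.0 = identical move sets, 0.0 = disjoint."""
    sa, sb = move_signature(a), move_signature(b)
    return len(sa & sb) / len(sa | sb) if (sa or sb) else 1.0

# The starting position vs. the position after 1. Nf3 Nf6:
a, b = chess.Board(), chess.Board()
b.push_san("Nf3"); b.push_san("Nf6")
print(move_set_similarity(a, b))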
YUFe
Posts: 17
Joined: Sat Jan 11, 2020 3:52 pm
Full name: Moritz Gedig

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by YUFe »

Tony P. wrote: Wed Jan 15, 2020 5:28 pm It has been done in MuZero ... I do believe that a chess board representation could use far fewer parameters
I researched the "representation" part. I was just talking about an autoencoder for the position. They represent the entire game, not just the current state.
I did not know of MuZero, but then I never liked the Atari 2600.
Tony P.
Posts: 216
Joined: Sun Jan 22, 2017 8:30 pm
Location: Russia

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by Tony P. »

YUFe wrote: Thu Jan 16, 2020 10:00 am They represent the entire game, not just the current state.
Mu0 was tested vs A0 with a search budget of 800 (lol) nodes per decision. I doubt Mu0 would be competent at large search depths without modifications, as it only encodes the root and then searches in the latent space.

However, if it were modified to roll out a number of predicted 6-ply sequences on an actual board, then encode the resulting positions before searching in their subtrees in the latent space, and repeat this procedure 6, 12, etc. plies away from the root, then it would have to encode somewhat fewer nodes than if it encoded every position in the tree, though the savings wouldn't be that big because there'd be relatively fewer TT hits.
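In pseudocode, the modification I have in mind would look roughly like this (purely hypothetical; the networks are stubbed out, and STRIDE and the function names are made up):

Code: Select all

# Search the latent space in STRIDE-ply bursts, replaying each predicted
# burst on a real board and re-encoding, so that errors in the learned
# dynamics can't compound indefinitely.
import chess

STRIDE = 6

def encode(board: chess.Board):
    # Stand-in for MuZero's representation network.
    return board.fen()

def search_latent(latent, plies):
    # Stand-in for the latent-space search; a real implementation would
    # return the principal variation found under the learned dynamics.
    return []

def hybrid_pv(board: chess.Board, total_plies: int):
    pv = []
    while total_plies > 0:
        burst = min(STRIDE, total_plies)
        for move in search_latent(encode(board), burst):
            board.push(move)   # re-ground the predicted line on the actual board
            pv.append(move)
        total_plies -= burst   # the next burst starts from an exactly encoded state
    return pv

print(hybrid_pv(chess.Board(), 12))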

As Mu0's value prediction network would be very deep and have a ton of parameters anyway, the addition of policy prediction heads to the output layer didn't increase the computational cost per latent state by much, so it made sense for DeepMind to add those heads to guide the search.

On the other hand, in an algorithm with a lightweight value prediction function, it indeed likely wouldn't make sense to call a separate policy predictor instead of guiding the search with some simple move ordering heuristic and then by the value predictions for the children. In that case (as far as I understand, this is what you had in mind), an encoder would only be used to map a discrete board state into a lower-dimensional real-valued vector that is easier to feed into a kernel/NN/etc. to predict the value, and that would also make training easier.
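A minimal sketch of that setup (assuming PyTorch; the layer sizes and the 12-plane x 64-square input are my guesses, not anything from a real engine):

Code: Select all

# An encoder maps a flattened board (12 piece planes x 64 squares = 768
# inputs) to a small real-valued vector; a lightweight head then predicts
# the value from that vector. All sizes are illustrative.
import torch
import torch.nn as nn

class EncoderValue(nn.Module):
    def __init__(self, board_dim=768, latent_dim=24):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(board_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        self.value_head = nn.Sequential(
            nn.Linear(latent_dim, 32), nn.ReLU(),
            nn.Linear(32, 1), nn.Tanh(),   # value in [-1, 1]
        )

    def forward(self, board_vec):
        z = self.encoder(board_vec)        # the low-dimensional embedding
        return self.value_head(z), z

model = EncoderValue()
value, z = model(torch.zeros(1, 768))      # dummy position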

I'm not an expert, so take my ramblings with a grain of salt :mrgreen:
Tony P.
Posts: 216
Joined: Sun Jan 22, 2017 8:30 pm
Location: Russia

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by Tony P. »

I might end up trying some method from this survey of knowledge graph embeddings/projections (treating positions as KGs), or whichever survey is the most up-to-date by then, but I expect a lot of trial and error :( as the trade-off between eval accuracy and speed seems dramatic in chess.
YUFe
Posts: 17
Joined: Sat Jan 11, 2020 3:52 pm
Full name: Moritz Gedig

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by YUFe »

Tony P. wrote: Thu Jan 16, 2020 10:35 pm I might ...
I appreciate your enthusiasm, but you need to make your own thread and detail it there.
YUFe
Posts: 17
Joined: Sat Jan 11, 2020 3:52 pm
Full name: Moritz Gedig

More details

Post by YUFe »

After the search, the moves that were investigated most deeply have reliable new evaluations. Each available root move thus has both its original static evaluation and a new evaluation equal to that of a much later/deeper position.
We also know which of the myriad of possible following states gave us that value.
So after each search we have pairs of states of approximately the same value even though eval() yields different values for them: we know that our eval() is wrong in a particular direction. We might have a move that was previously evaluated as better than the one that turned out better after the search, and it is likely that we now have a good new estimate for it. Now we more or less just have to figure out what changed. We can also work out which factors had no impact, or a varying, oscillating impact with depth. Because the states resulting from the root moves cannot be too different from each other, it should not be hard to figure out what made the difference.

We can make the move and store the entire tree that belongs to it, or change eval() based on that tree. Both the change to eval() and the stored tree carry information over from the last search.
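A rough sketch of what such an eval() adjustment could look like (assuming a linear eval over hand-picked features; the feature count, the extractor and the learning rate are placeholders):

Code: Select all

# After each search, nudge the eval weights so that the static evaluation of
# a root child moves toward the deeper, more reliable backed-up score.
import numpy as np

NUM_FEATURES = 10                  # illustrative
weights = np.zeros(NUM_FEATURES)

def eval_static(w: np.ndarray, feats: np.ndarray) -> float:
    return float(w @ feats)

def update_from_search(w, pairs, lr=1e-4):
    """pairs: (feature_vector, backed_up_search_score) for each root child."""
    for feats, target in pairs:
        error = eval_static(w, feats) - target  # signed error of the static eval
        w -= lr * error * feats                 # in-place gradient step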
PK
Posts: 893
Joined: Mon Jan 15, 2007 11:23 am
Location: Warsza

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by PK »

Rodent IV adapts to the position in a sense: it uses two competing piece/square table scores. If they are equal, they are weighted 50/50. If one is better, the ratio between their weights is pulled towards 25/75 (as a function of the square root of the difference between the scores, capped at 25).
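In code, the blending looks roughly like this (a simplified sketch, not Rodent IV's actual source; the scale constant is invented):

Code: Select all

# The shift away from 50/50 grows with the square root of the score gap and
# is capped at 25, so the ratio never passes 25/75.
import math

SCALE = 1.0   # invented; the real scaling may differ

def blend_pst(score_a: int, score_b: int) -> float:
    shift = min(25.0, SCALE * math.sqrt(abs(score_a - score_b)))
    w_a = 50.0 + (shift if score_a > score_b else -shift)  # weight of table A in %
    return (w_a * score_a + (100.0 - w_a) * score_b) / 100.0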
YUFe
Posts: 17
Joined: Sat Jan 11, 2020 3:52 pm
Full name: Moritz Gedig

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by YUFe »

PK wrote: Tue Jan 21, 2020 12:47 pm If one is better
Please specify.
Is that a systematic form of extremizing, as in partial-information aggregation for forecasts/predictions?
DustyMonkey
Posts: 61
Joined: Wed Feb 19, 2014 10:11 pm

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by DustyMonkey »

YUFe wrote: Wed Jan 15, 2020 2:07 pm That brings me totally off topic, but I have some other ideas:
Has anyone used a NN for hashing or board state representation/encoding?
One could train a net with few inner nodes/neurons (e.g. <25) to turn a representation with many (e.g. 385) neurons into that inner representation and back to the input. That looks useless at first, but it would provide a "notation" that is meaningful and likely allows one to judge how similar two states are. Using a normal hash function on that would make sure that similar things become dissimilar.

The search term you need is "autoencoder" - but there will be the issue of which positions you want to be considered 'typical': an autoencoder cannot reduce the state much beyond a well-packed traditional encoding _unless_ you decide that some 'possible' positions are either more or less important than others.
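A minimal sketch of such an autoencoder (assuming PyTorch and the 385-input encoding from the quoted post; the 24-unit bottleneck matches the "<25 inner neurons" suggestion, and the hidden sizes are illustrative):

Code: Select all

# Compress a 385-dimensional board encoding down to a 24-unit bottleneck and
# reconstruct it; the bottleneck activations become the compact "notation".
import torch
import torch.nn as nn

class BoardAutoencoder(nn.Module):
    def __init__(self, in_dim=385, bottleneck=24):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, bottleneck),
        )
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck, 128), nn.ReLU(),
            nn.Linear(128, in_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)        # the compact code for the position
        return self.decoder(z), z

# Training minimizes reconstruction error over sampled positions, e.g.
# nn.functional.binary_cross_entropy(reconstruction, batch); which positions
# you sample decides what counts as 'typical'.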

An NN may be overkill, however. A simpler k-means clustering attack (with a large k) will be much, much quicker to 'train' and will give you the same sort of 'similarity' groupings.
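For instance (assuming scikit-learn, with boards already flattened to the 385-dimensional vectors from the quoted encoding; k and the data here are placeholders):

Code: Select all

# Cluster encoded positions with k-means (large k); the cluster id then acts
# as a coarse 'similarity' bucket, much like the autoencoder's bottleneck.
import numpy as np
from sklearn.cluster import KMeans

positions = np.random.rand(10_000, 385)   # placeholder for real encoded boards
km = KMeans(n_clusters=1024, n_init=10).fit(positions)

bucket = km.predict(positions[:1])[0]     # bucket id of one position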