Board adaptive / tuning evaluation function - no NN/AI

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

Tony P.
Posts: 99
Joined: Sun Jan 22, 2017 7:30 pm
Location: Russia

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by Tony P. » Thu Jan 16, 2020 9:35 pm

I might end up trying some method from this survey of knowledge graph embeddings/projections (treating positions as KGs) or whichever survey will be the most up-to-date then, but I expect a lot of trial and error :( as the trade-off between eval accuracy and speed seems dramatic in chess.

YUFe
Posts: 16
Joined: Sat Jan 11, 2020 2:52 pm
Full name: Moritz Gedig

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by YUFe » Fri Jan 17, 2020 8:37 am

Tony P. wrote:
Thu Jan 16, 2020 9:35 pm
I might ...
I appreciate your enthusiasm, but you need to make your own thread and detail it there.

YUFe
Posts: 16
Joined: Sat Jan 11, 2020 2:52 pm
Full name: Moritz Gedig

More details

Post by YUFe » Sun Jan 19, 2020 10:19 am

After the search, the moves that were investigated most deeply have reliable new evaluations. Each available root move thus has both its original static evaluation and a new evaluation equivalent to that of a much later/deeper position.
We also know which of the myriad of possible following states gave us that value.
So after each search we have pairs of states of approximately the same search value, even though eval() yields different values for them. That tells us our eval() is wrong in a particular direction. We might have a move that was previously evaluated as better than the one the search revealed to be better, and we likely have a good new estimate for it. Now we essentially just have to figure out what changed. We can also figure out which factors had no impact, or a varying, oscillating impact with depth. Because the result states of the root moves cannot be too different, it should not be hard to figure out what made the difference.

We can make the move and store the entire tree that belongs to it, or change eval() based on that tree. Both the change to eval() and the stored tree carry over information from the last search.
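One simple way to carry information from the search back into eval(), assuming a linear evaluation over weighted features, is to nudge the weights so the static evaluation moves toward the deeper search result. This is a minimal sketch in the spirit of TD-style / Texel-style tuning, not anyone's actual engine code; the function and parameter names are hypothetical:

```python
def update_eval_weights(weights, features, search_eval, lr=0.01):
    """Nudge linear eval weights toward the search-backed score.

    weights     -- current eval() feature weights
    features    -- feature values of the root position
    search_eval -- the deeper, more reliable score from the search
    lr          -- learning rate; keep small so eval() stays stable

    Sketch only: assumes eval(pos) = dot(weights, features).
    """
    static_eval = sum(w * f for w, f in zip(weights, features))
    error = search_eval - static_eval
    # Move each weight in the direction that shrinks the error,
    # proportionally to how much its feature contributed.
    return [w + lr * error * f for w, f in zip(weights, features)]

# Example: static eval is 1*3 + 2*4 = 11, the search says 21;
# after one update the static eval moves closer to 21.
new_w = update_eval_weights([1.0, 2.0], [3.0, 4.0], 21.0)
```

Repeating this over many searched root positions gradually pulls eval() toward the values the search keeps producing, which is exactly the "carry over information" idea above.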

PK
Posts: 844
Joined: Mon Jan 15, 2007 10:23 am
Location: Warsza
Contact:

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by PK » Tue Jan 21, 2020 11:47 am

Rodent IV adapts to the position in a sense: it uses two competing piece/square table scores. If they are equal, they are weighted 50/50. If one is better, the ratio between their weights is pulled towards 25/75 (as a function of the square root of the difference between the scores, capped at 25).
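Reading that description literally, the blending could look roughly like this (an illustrative sketch based only on the paragraph above, not Rodent IV's actual code; all names are mine):

```python
import math

def blend_pst_scores(score_a, score_b):
    """Blend two competing piece/square table scores.

    Weights start at 50/50; the better table's weight is pulled
    toward 75 by the square root of the score difference, capped
    at 25, as described in the post above.
    """
    shift = min(25.0, math.sqrt(abs(score_a - score_b)))  # cap at 25
    if score_a >= score_b:
        w_a, w_b = 50.0 + shift, 50.0 - shift
    else:
        w_a, w_b = 50.0 - shift, 50.0 + shift
    return (w_a * score_a + w_b * score_b) / 100.0

# Equal scores -> plain 50/50 average; a large gap saturates at 75/25.
```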

YUFe
Posts: 16
Joined: Sat Jan 11, 2020 2:52 pm
Full name: Moritz Gedig

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by YUFe » Tue Jan 21, 2020 1:25 pm

PK wrote:
Tue Jan 21, 2020 11:47 am
If one is better
Please specify.
Is that a systematic form of extremizing, as in partial-information aggregation for forecasts/predictions?

DustyMonkey
Posts: 61
Joined: Wed Feb 19, 2014 9:11 pm

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by DustyMonkey » Wed Jan 22, 2020 3:43 am

YUFe wrote:
Wed Jan 15, 2020 1:07 pm
That brings me totally off topic, but I have some other ideas:
Has anyone used a NN for hashing or board state representation/encoding?
One could train a net with few inner nodes/neurons (e.g. <25) to turn a representation with many (e.g. 385) neurons into that inner representation and back into the input. That looks useless at first, but it would provide a "notation" that is meaningful and likely allows one to judge how similar two states are. Applying a normal hash function to that would make sure that similar things become dissimilar.
Re: NN for hashing/board representation

The search term you need is "autoencoder" - but there will be the issue of which positions you want to be considered 'typical': an autoencoder cannot reduce the state much beyond a well-packed traditional encoding _unless_ you decide that some 'possible' positions are more or less important than others.

An NN may be overkill, however. A simpler k-means clustering attack (with large k) will be much, much quicker to 'train' and give you the same sort of 'similarity' groupings.
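On toy data, the k-means suggestion could look roughly like this (a pure-Python sketch; a real engine would cluster millions of high-dimensional position encodings with a library implementation and a smarter initialization):

```python
def assign(p, centroids):
    """Index of the nearest centroid (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))

def kmeans(points, k, iters=20):
    """Plain k-means on numeric feature vectors (e.g. bit/one-hot
    board encodings). The first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[assign(p, centroids)].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids

# Two tiny 'position vectors' per group; similar vectors end up
# assigned to the same cluster, giving the 'similarity' groupings.
pts = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 1], [1, 1, 1, 0]]
cents = kmeans(pts, 2)
```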

YUFe
Posts: 16
Joined: Sat Jan 11, 2020 2:52 pm
Full name: Moritz Gedig

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by YUFe » Wed Jan 22, 2020 9:58 am

DustyMonkey wrote:
Wed Jan 22, 2020 3:43 am
The search term you need is "autoencoder"
Yes, thank you. This time I actually knew that.
but there will be the issue of which positions you want to be considered 'typical': an autoencoder cannot reduce the state much beyond a well-packed traditional encoding _unless_ you decide that some 'possible' positions are more or less important than others.
True, and I had actually calculated the 25 to be in that region. I was more concerned with having too many inner nodes, so that the NN would have too many choices of encoding.
An NN may be overkill however. A simpler k-means clustering attack (with large k) will be much much quicker to 'train' and give you the same sort of 'similarity' groupings.
I doubt that would work, because there are too many dimensions per data point.

YUFe
Posts: 16
Joined: Sat Jan 11, 2020 2:52 pm
Full name: Moritz Gedig

Representation in metric Space

Post by YUFe » Wed Jan 22, 2020 4:35 pm

DustyMonkey wrote:
Wed Jan 22, 2020 3:43 am
an autoencoder cannot reduce the state much beyond a well packed traditional encoding
My guesstimate is that it takes ~175 bits, but certainly fewer than 185 bits, to encode any position. That is far too little to encode any meaning with it.
How do you intend to represent the board on a cardinal, metric scale, as would be needed for clustering?
For a single board setting, let:
k be the number of clusters,
d be the number of dimensions,
3 be the points per cluster,
6 bits be the precision of each coordinate.
Then k * 3 * d * 6 bits = 185 bits.
See my point? k * d would be too small.
The autoencoder would have to produce far more than 175 bits to extract and encode meaningful features.

YUFe
Posts: 16
Joined: Sat Jan 11, 2020 2:52 pm
Full name: Moritz Gedig

Re: Board adaptive / tuning evaluation function - no NN/AI

Post by YUFe » Thu Jan 23, 2020 9:55 am

Tony P. wrote:
Thu Jan 16, 2020 8:07 pm
Then (as far as I understand, that's what you had in mind) an encoder would only be used to map a discrete board state into a lower dimension real-valued vector that would be easier to put into a kernel/NN/etc. to predict the value, and would also make the training easier.
That is right. It is a more meaningful representation and a fitting input for statistical/geometric analysis.

DustyMonkey
Posts: 61
Joined: Wed Feb 19, 2014 9:11 pm

Re: Representation in metric Space

Post by DustyMonkey » Mon Jan 27, 2020 4:14 pm

YUFe wrote:
Wed Jan 22, 2020 4:35 pm
DustyMonkey wrote:
Wed Jan 22, 2020 3:43 am
an autoencoder cannot reduce the state much beyond a well packed traditional encoding
My guesstimate is that it takes ~175 bit but certainly <185 bit to encode any position. That is way too little to encode any meaning with it.
I think you are missing the point of autoencoding methods. It is wholly unimportant how many legal positions there are.

Only the training/input set matters, and in an applied scenario the size of that set will not significantly exceed 2^30.

Data compression in the general case works because not all possibilities are meaningfully important. Only the given data is important.

In the case of an autoencoder, which is a type of lossy compression model, features such as a knight on a corner square will be the ones that fail to be encoded accurately. Sure, the inner nodes cannot represent all possible inputs, and maybe not even all of the training set, but it is the rare "exceptions to the rule" that will fail to be represented.

In the case of lossless data compression, encoding a model's failure costs more than encoding its success, and compression happens because failures are rarer than successes.

In this case, we simply stop caring about the failures: the exceptions to the model.
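The lossy-bottleneck behaviour described above can be demonstrated with a toy autoencoder. This is a pure-Python sketch with made-up sizes (far smaller than the ~385-input net discussed earlier); the narrow hidden layer is forced to keep only the dominant structure of the training inputs:

```python
import math
import random

def sigmoid(x):
    # Clamp the argument to avoid overflow in exp().
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, x))))

class TinyAutoencoder:
    """One-hidden-layer autoencoder: n_in -> n_hid -> n_in.
    No biases, plain gradient descent: a sketch, not engine code."""

    def __init__(self, n_in, n_hid, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hid)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid)]
                   for _ in range(n_in)]

    def encode(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)))
                for row in self.w1]

    def decode(self, h):
        return [sigmoid(sum(w * hi for w, hi in zip(row, h)))
                for row in self.w2]

    def loss(self, xs):
        total = 0.0
        for x in xs:
            y = self.decode(self.encode(x))
            total += sum((xi - yi) ** 2 for xi, yi in zip(x, y))
        return total / len(xs)

    def train_step(self, x, lr=0.5):
        h = self.encode(x)
        y = self.decode(h)
        # Squared-error gradient through the output sigmoids.
        dy = [(yi - xi) * yi * (1.0 - yi) for xi, yi in zip(x, y)]
        # Backpropagate through w2 to the hidden layer.
        dh = [hj * (1.0 - hj) * sum(dy[i] * self.w2[i][j]
                                    for i in range(len(dy)))
              for j, hj in enumerate(h)]
        for i in range(len(self.w2)):
            for j in range(len(h)):
                self.w2[i][j] -= lr * dy[i] * h[j]
        for j in range(len(self.w1)):
            for k in range(len(x)):
                self.w1[j][k] -= lr * dh[j] * x[k]

# Train on two complementary patterns; the 2-unit bottleneck can
# separate them, so the reconstruction error drops with training.
xs = [[1, 0, 1, 0, 1, 0], [0, 1, 0, 1, 0, 1]]
ae = TinyAutoencoder(6, 2)
loss_before = ae.loss(xs)
for _ in range(300):
    for x in xs:
        ae.train_step(x)
loss_after = ae.loss(xs)
```

A pattern unlike anything in the training set would reconstruct poorly, which is the "exceptions to the rule fail to be represented" point above.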
