There Might Exist Simple Rules For Accurate Position Evaluation

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

towforce
Posts: 11542
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK

There Might Exist Simple Rules For Accurate Position Evaluation

Post by towforce »

In simple deterministic games, it would be easy to find simple rules for accurately evaluating a position.

In a badly designed complex game, it can be easy to find simple rules for accurately evaluating a position (about 15 years ago there was a version of FIFA that was very easy to beat by following one simple rule). There is also the problem that adding complexity to a game can accidentally introduce a simple way to beat it (not a game example, but the builders of the WWII Enigma encryption device accidentally made it MUCH easier to crack by adding a reflector wheel intended to make it harder to crack).

Chess is generally a well-designed game, in that there isn't an obvious way to do a simple evaluation which works well in most positions (at least nobody has found one), but a relatively simple evaluation that gives accurate results in most positions might well exist.

The next question would be, how to search for such an evaluation?

A couple of suggestions:

1. AlphaZero plays a good game using a relatively small game tree (though still a very big one by human standards, unfortunately). Find a way to express its NN as a mathematical expression, then use a CAS (or some other tool) to simplify that expression into a small one that can be executed quickly (a toy illustration follows below this list)

2. Codify a set of linear relationships between the pieces, then use linear programming algorithms to fit expressions for these from a correctly evaluated set of sample positions (see the second sketch below this list). If doing this at the surface level isn't good enough, iteratively deepen it by adding relationships between relationships, using the method of divided differences to keep it linear (so that optimal solutions can still be found quickly).
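
To make point 1 concrete, here is a toy sketch in Python/sympy of what "express the NN as a formula and let a CAS simplify it" looks like. The network, its weights and its identity activations are all invented for illustration; a real AlphaZero-sized network with non-linear activations would not collapse anywhere near this neatly.

Code:

import sympy as sp

# two hypothetical input features, e.g. material balance and mobility
x1, x2 = sp.symbols('x1 x2')

# invented weights of a tiny 2-2-1 network with identity activations
h1 = 0.5*x1 + 0.25*x2
h2 = 0.1*x1 - 0.3*x2
out = 2.0*h1 + 1.0*h2

# the CAS collapses the whole network into one small linear expression in x1 and x2
print(sp.expand(out))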
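
And a minimal sketch of the linear-programming idea in point 2, with random numbers standing in for real feature values and reference evaluations: it poses the fit as an LP by minimising the worst-case absolute error over the sample set.

Code:

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n_samples, n_features = 200, 6                   # features could be material, mobility, king safety, ...
A = rng.normal(size=(n_samples, n_features))     # feature values per sample position (placeholder data)
y = rng.normal(size=n_samples)                   # "correct" reference evals (placeholder data)

# variables are [w_1..w_n, t]; minimise t subject to |A w - y| <= t for every sample
c = np.concatenate([np.zeros(n_features), [1.0]])
ones = np.ones((n_samples, 1))
A_ub = np.vstack([np.hstack([A, -ones]),         #  A w - t <=  y
                  np.hstack([-A, -ones])])       # -A w - t <= -y
b_ub = np.concatenate([y, -y])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (n_features + 1))
print("weights:", res.x[:-1], "worst-case error:", res.x[-1])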
Writing is the antidote to confusion.
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: There Might Exist Simple Rules For Accurate Position Evaluation

Post by Ovyron »

Another option is to do it "top-down": first you get a bunch of positions with some "accurate" evaluation that you want to emulate, say Stockfish's final eval at depth 40 (just the eval; we don't care about the move chosen). Then you try to extract the "essence" of those positions, building a program that can look at them and return the D40 evals for all of them. Then you measure the difference between what this algorithm gives for new positions and what D40 gives for them, and so on.
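
A minimal sketch of that loop, with random numbers standing in for the extracted features and the D40 reference evals: fit a fast approximator on one batch of positions, then measure how far it drifts on positions it has not seen.

Code:

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 8))                                      # per-position features (placeholder data)
deep_eval = X @ rng.normal(size=8) + 0.1 * rng.normal(size=1000)    # stand-in for the D40 evals

train, test = slice(0, 800), slice(800, 1000)
w, *_ = np.linalg.lstsq(X[train], deep_eval[train], rcond=None)     # fit the fast approximator
drift = np.abs(X[test] @ w - deep_eval[test])                       # error on new positions
print("mean absolute drift on unseen positions:", drift.mean())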

Once you have something decent at this (something that comes up with those evals almost instantly, as opposed to in minutes), you can run a normal A/B search with this eval, and it ought to be something, because at depth 0 it would return what current approaches only reach much later, and hopefully those evaluations will rely on simple rules.
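
A bare-bones negamax alpha-beta sketch (Python, using the python-chess library) showing where such a distilled eval would plug in. fast_eval here is just a material-count placeholder for the learned function; mate scores are ignored for brevity.

Code:

import chess

def fast_eval(board):
    # placeholder for the distilled eval: material balance from the side to move's view
    values = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
              chess.ROOK: 5, chess.QUEEN: 9}
    return sum(values.get(p.piece_type, 0) * (1 if p.color == board.turn else -1)
               for p in board.piece_map().values())

def alphabeta(board, depth, alpha, beta):
    if depth == 0 or board.is_game_over():
        return fast_eval(board)                  # the cheap eval replaces deeper searching here
    for move in list(board.legal_moves):
        board.push(move)
        score = -alphabeta(board, depth - 1, -beta, -alpha)
        board.pop()
        if score >= beta:
            return beta                          # fail-hard beta cutoff
        alpha = max(alpha, score)
    return alpha

print(alphabeta(chess.Board(), 2, -10**6, 10**6))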
Your beliefs create your reality, so be careful what you wish for.
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: There Might Exist Simple Rules For TT replacement

Post by Zenmastur »

I think starting with a slightly simpler task would be in order.

I have long been of the opinion that the cache replacement policies used in current AB engines are not only too simplistic but far from optimal.

HGM is of the opinion that, regardless of what the cache does, you will never see large ELO gains from it. From a theoretical perspective I think this is wrong. For example, if a mate in 50 is on the board and I insert into the TT the major sub-trees leading to it, the engine should have little problem finding it. If you try to find it with an empty TT (or one using a "standard" replacement algorithm) the time required is likely to be quite large. So, at least in theory, the TT is capable of huge gains in ELO. The problem is which policies lead to the most ELO gain. I think it's unlikely that any of the current engines have policies that are anywhere near optimal.
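
For reference, a toy transposition table in Python with the classic depth-preferred replacement rule, just to make concrete where a replacement decision happens; real engines use fixed-size bucketed arrays rather than a dict.

Code:

class ToyTT:
    def __init__(self, max_entries=1 << 20):
        self.max_entries = max_entries
        self.table = {}                               # zobrist key -> (depth, score, flag, best_move)

    def store(self, key, depth, score, flag, best_move):
        old = self.table.get(key)
        if old is None and len(self.table) >= self.max_entries:
            return                                    # table full: crude "never replace" policy
        if old is None or depth >= old[0]:            # depth-preferred: keep the deeper entry
            self.table[key] = (depth, score, flag, best_move)

    def probe(self, key, depth):
        entry = self.table.get(key)
        if entry is not None and entry[0] >= depth:   # usable only if the stored search was deep enough
            return entry
        return None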

Training a small NN to choose the replacement policy and then extracting the underlying rules into simple polynomial formulas might be a good starting point for testing your theories. Even a small NN is likely to be too slow for this task, but extracting the rules into a form that is easy to calculate would work. During testing and training the speed loss would have to be compensated for, as the NN will slow the engine down quite a bit. Once the rules are extracted, running at "normal" speed would be advisable for final testing.
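
The kind of extracted rule meant here might look like the sketch below: a cheap polynomial score deciding whether a new entry evicts an old one. The coefficients are invented placeholders for whatever a trained network would actually be distilled into.

Code:

def keep_new_entry(new_depth, new_age, old_depth, old_age):
    def score(depth, age):
        # hypothetical distilled polynomial: value grows with depth, decays with age
        return 1.0 * depth + 0.05 * depth * depth - 0.5 * age
    return score(new_depth, new_age) >= score(old_depth, old_age)

print(keep_new_entry(new_depth=8, new_age=0, old_depth=10, old_age=6))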

Regards,

Zenmastur
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: There Might Exist Simple Rules For TT replacement

Post by Dann Corbit »

Zenmastur wrote: Sat Oct 19, 2019 6:16 am {snip}
From a theoretical perspective I think this is wrong. e.g. if a mate in 50 is on the board and I insert into the TT the major sub-trees leading to it, the engine should have little problem finding it.
{snip}
The question is, how did the mate in 50 get into the hash table in the first place?
It had to be computed. That was a big cost that will still have to be incurred.
Of course, a transposition on the chessboard will need to find the entry in the table.
But the speed of the cache won't make a big difference.
And a cache entry is likely to be overwritten, because the cache will be a lot smaller than main memory.

It is, of course, correct that a big cache will speed up computations. But it is only a small linear factor.

Aside to the original point:
I do not think that the data contained in the GPU evaluation is simple. Quite the opposite. I suspect it is so complex that we cannot understand it even if we were to decipher it somehow.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: There Might Exist Simple Rules For TT replacement

Post by Zenmastur »

Dann Corbit wrote: Sat Oct 19, 2019 6:35 am
Zenmastur wrote: Sat Oct 19, 2019 6:16 am {snip}
From a theoretical perspective I think this is wrong. e.g. if a mate in 50 is on the board and I insert into the TT the major sub-trees leading to it, the engine should have little problem finding it.
{snip}
The question is, how did the mate in 50 get into the hash table in the first place?
It had to be computed. That was a big cost that will still have to be incurred.
Of course, a transposition on the chessboard will need to find the entry in the table.
But the speed of the cache won't make a big difference.
And a cache entry is likely to be overwritten, because the cache will be a lot smaller than main memory.

It is, of course, correct that a big cache will speed up computations. But it is only a small linear factor.

Aside to the original point:
I do not think that the data contained in the GPU evaluation is simple. Quite the opposite. I suspect it is so complex that we cannot understand it even if we were to decipher it somehow.
Like I said, I put it there. Of course, in real life this isn't likely to happen on its own. But it does point out that the TT has a huge ELO potential. The fact that no engine gets huge ELO gains from its TT is a result of several factors. One of these is that no system has sufficient RAM for every evaluated position to be stored in memory. If it did, we would see greater NPS increases the deeper we searched, because a greater percentage of the positions would have been seen before and their evaluations would already be stored in RAM.

No person on this board can tell you which nodes will be most useful in the cache, i.e. which will cause one or more (or the largest number of) cut-offs. NNs will find near-optimal TT replacement policies. Then we just need to extract the policies from the NN and implement them in an algorithm that runs fast. It must run fast, because TT replacement has to be fast to be useful. I'm not saying it will be easy to extract the policies from the NN, but it will be easier if the NN is relatively small. So one should start with the smallest useful NN and then see how difficult it is to extract the policy. Once this is done, larger NNs can be used if it's felt that more gains can be had.
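
A sketch of that distillation step in Python/scikit-learn, with fabricated data standing in for real "this entry later caused a cutoff" labels: train a small network, then fit a plain linear rule to imitate it and check how often the cheap rule agrees.

Code:

import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(5000, 4))                    # entry features: depth, age, bound type, ... (placeholders)
y = (X[:, 0] - 0.3 * X[:, 1] > 0).astype(int)     # fabricated "entry proved useful" labels

nn = MLPClassifier(hidden_layer_sizes=(8,), max_iter=1000, random_state=0).fit(X, y)
cheap_rule = LogisticRegression().fit(X, nn.predict(X))       # imitate the network with a linear rule

agreement = (cheap_rule.predict(X) == nn.predict(X)).mean()
print("cheap rule agrees with the network on", round(100 * agreement, 1), "% of decisions")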

Just an idea.

Regards,

Zenmastur
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: There Might Exist Simple Rules For TT replacement

Post by corres »

Dann Corbit wrote: Sat Oct 19, 2019 6:35 am ...
Aside to the original point:
I do not think that the data contained in the GPU evaluation is simple. Quite the opposite. I suspect it is so complex that we cannot understand it even if we were to decipher it somehow.
I agree.
The GPU (more exactly, the NN) evaluation contains the result of the many millions of games played during self-learning. By increasing the number of games played and the size of the NN, we can get engines that play better and better games.
It is a pity, but a bigger NN needs more powerful hardware.
towforce
Posts: 11542
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK

Re: There Might Exist Simple Rules For TT replacement

Post by towforce »

corres wrote: Sat Oct 19, 2019 9:16 amThe GPU (more exactly, the NN) evaluation contains the result of the many millions of games played during self-learning. By increasing the number of games played and the size of the NN, we can get engines that play better and better games.
It is a pity, but a bigger NN needs more powerful hardware.
What does "GPU" mean in this context?

For solving specific problems, a big NN can be worse than a small one. When a human has done a task enough times, they can do it quickly without conscious thought because fast NN pathways for doing that task get built. However, chimps have much smaller brains than us, but they can still learn simple video games, and when they do, they can easily beat humans because their reaction times are MASSIVELY faster than ours.

More tasks where monkeys outperform humans: the matching pennies game (link), and willingness to change tactics (link).

Going even more extreme, a housefly has complex behaviours in terms of flight control (including landing), walking on six legs, feeding, mating and living life before it can fly (among others), but has a brain of only around 100,000 neurons (the human brain has around 100,000,000,000 neurons). This shows that good behaviour for solving some complex problems can be encoded in a simpler algorithm than you'd expect.
Writing is the antidote to confusion.
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
towforce
Posts: 11542
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK

Re: There Might Exist Simple Rules For Accurate Position Evaluation

Post by towforce »

towforce wrote: Sat Oct 19, 2019 1:53 am2. Codify a set of linear relationships between the pieces, then use linear programming algorithms to generate expressions for these from a correctly evaluated set of samples. If doing this at the surface level isn't good enough, iteratively deepen it by adding relationships between relationships using the method of divided differences to keep it linear (in order to be able to find optimal solutions quickly).
Having done this, if the resulting expression has a large number of parts, there would be an easy way to simplify it: run the optimisation again, requiring (as a linear programming constraint) that it achieve the same result, but this time, instead of optimising for the result, optimise for the number of weights in the result that end up being zero. Parts of the expression with a zero weight could then be discarded from the solution, hopefully making the final expression much simpler.
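
One caveat: maximising the number of exactly-zero weights is not itself a linear objective, but the usual LP-friendly stand-in is to minimise the sum of absolute weights while holding the earlier error bound fixed, which tends to drive weights to zero. A sketch of that second pass, with placeholder data and t_star standing for the error bound achieved in the first pass:

Code:

import numpy as np
from scipy.optimize import linprog

def sparsify(A, y, t_star):
    """Minimise sum |w_i| subject to |A w - y| <= t_star (variables are [w, u] with u_i >= |w_i|)."""
    n_samples, n_features = A.shape
    I, Z = np.eye(n_features), np.zeros((n_samples, n_features))
    c = np.concatenate([np.zeros(n_features), np.ones(n_features)])
    A_ub = np.vstack([np.hstack([A, Z]),          #  A w    <=  y + t*
                      np.hstack([-A, Z]),         # -A w    <= -y + t*
                      np.hstack([I, -I]),         #  w - u  <=  0
                      np.hstack([-I, -I])])       # -w - u  <=  0
    b_ub = np.concatenate([y + t_star, t_star - y,
                           np.zeros(n_features), np.zeros(n_features)])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (2 * n_features))
    return res.x[:n_features]

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 6))
true_w = np.array([1.5, 0.0, 0.0, -2.0, 0.0, 0.0])    # only two features actually matter
y = A @ true_w + 0.05 * rng.normal(size=200)
print("recovered weights:", np.round(sparsify(A, y, t_star=0.25), 2))   # the irrelevant ones should land at or near zero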
Writing is the antidote to confusion.
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
towforce
Posts: 11542
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK

Re: There Might Exist Simple Rules For Accurate Position Evaluation

Post by towforce »

A quick thought experiment which, admittedly, entails a combination of optimism and plucking numbers out of the air.

Suppose that, using methods discussed in this thread, a set of 100,000 linear expressions could evaluate a chess position quite accurately. There are, on average, about 30 legal moves in a chess position. This would mean that choosing a move would entail calculating 30 x 100,000 = 3 million linear expressions. A computer could do this faster than human perception can register, making move selection, from a human perspective, instantaneous.
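
A quick sanity check of that arithmetic (simplifying each of the 100,000 expressions down to a single weighted term, with made-up numbers): 30 x 100,000 multiply-adds take on the order of a millisecond on ordinary hardware.

Code:

import time
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=100_000)              # one weight per linear expression (simplification)
features = rng.normal(size=(30, 100_000))       # feature values for each of 30 candidate moves

start = time.perf_counter()
scores = features @ weights                     # 3 million multiply-adds
best_move = int(np.argmax(scores))
elapsed = time.perf_counter() - start
print(f"best move index {best_move}, computed in {elapsed * 1000:.2f} ms")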
Writing is the antidote to confusion.
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
towforce
Posts: 11542
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK

Re: There Might Exist Simple Rules For Accurate Position Evaluation

Post by towforce »

Talking about static evaluation, I think it's a shame that AlphaZero uses a combination of tree search and a well-trained NN (though admittedly it generated a much smaller game tree than its opponent, Stockfish). To REALLY demonstrate their prowess in the field of NNs, they should make an NN that can beat Stockfish with a 1-ply search - evaluating only the legal moves in the current position.
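
A sketch of what that 1-ply experiment looks like in Python with python-chess: score every legal reply with a single static evaluation call and pick the best. nn_eval is just a material-count placeholder for the trained network's score from the side to move's point of view.

Code:

import chess

def nn_eval(board):
    # placeholder for the network: material balance from the side to move's view
    values = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
              chess.ROOK: 5, chess.QUEEN: 9}
    return sum(values.get(p.piece_type, 0) * (1 if p.color == board.turn else -1)
               for p in board.piece_map().values())

def pick_move_1ply(board):
    best_move, best_score = None, -float("inf")
    for move in list(board.legal_moves):
        board.push(move)
        score = -nn_eval(board)                 # after the push, the eval is from the opponent's view
        board.pop()
        if score > best_score:
            best_move, best_score = move, score
    return best_move

print(pick_move_1ply(chess.Board()))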

Has anyone ever done a rough plot of number of neurons vs Elo rating for a static evaluation? Maybe the graph rises exponentially (or is "S"-shaped), meaning that each additional Elo point requires more extra neurons than the last one did.
Writing is the antidote to confusion.
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!