On-line engine blitz tourney August

Joost Buijs · Post by **Joost Buijs** » Tue Aug 30, 2022 7:27 pm

chrisw wrote: ↑Tue Aug 30, 2022 5:59 pm If I understood that right, you have a massive continuously updated hash lookup of WDL and the task of the NNUE is to interpolate the gaps in the data.

Measuring progress is difficult given the lack of opposition. Mine tried sparring with my HCE(s) but never reached them, which was the point at which I switched to backgammon. The whole idea really was to organise a continuous throughput from games to training to testing, plus the tools to observe progress. Backgammon worked fine for this, with meaningful progress measured in minutes; then move on to the real target: chess.

This is probably the way you can look at it.

Over the last year I spent most of my time training a network for International Draughts. The longer I worked on it, the more I realized that the size and topology of the network are not very critical; the quality of the training data is the most important factor. Training on weird positions that will never occur in high-quality games only weakens the network. On the other hand, you need enough spread for the network to understand material imbalance. The same holds for chess, where you can download zillions of high-quality games, which is a big advantage.

abulmo2 · Post by **abulmo2** » Thu Sep 01, 2022 12:47 am

Joost Buijs wrote: ↑Tue Aug 30, 2022 4:53 pm
The Othello network is pretty straightforward, it has 129 inputs, 64 inputs for each color and 1 input for the color to move. It's a fully connected network with 256x32x32x1 neurons and incremental update of the first layer (like NNUE). Because the changes on the board after a move can be quite large (larger than with chess) NNUE seems to be somewhat less effective.
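A minimal sketch of what an incremental first-layer update could look like for the 129-input layout described above. The input ordering, the meaning of the side-to-move input, the integer weights, and all function names are assumptions made for illustration; passes are ignored. It is not the author's code.

```python
import random

HIDDEN = 256        # first-layer width mentioned in the post
SIDE_TO_MOVE = 128  # assumed index of the side-to-move input

def input_index(color, square):
    """Assumed layout: color 0 uses inputs 0..63, color 1 uses 64..127."""
    return color * 64 + square

def full_refresh(weights, bias, active):
    """Recompute the accumulator from scratch for a set of active inputs."""
    acc = list(bias)
    for i in active:
        for j, w in enumerate(weights[i]):
            acc[j] += w
    return acc

def move_deltas(mover, placed, flipped):
    """Inputs switched on and off by one move (passes are not handled)."""
    opponent = 1 - mover
    added = [input_index(mover, placed)]
    added += [input_index(mover, s) for s in flipped]
    removed = [input_index(opponent, s) for s in flipped]
    # Assumption: the side-to-move input is set when color 1 is to move.
    if mover == 0:
        added.append(SIDE_TO_MOVE)
    else:
        removed.append(SIDE_TO_MOVE)
    return added, removed

def incremental_update(acc, weights, added, removed):
    """Adjust an existing accumulator instead of recomputing it."""
    new_acc = list(acc)
    for i in added:
        for j, w in enumerate(weights[i]):
            new_acc[j] += w
    for i in removed:
        for j, w in enumerate(weights[i]):
            new_acc[j] -= w
    return new_acc

def demo():
    rng = random.Random(0)
    weights = [[rng.randint(-64, 64) for _ in range(HIDDEN)] for _ in range(129)]
    bias = [rng.randint(-64, 64) for _ in range(HIDDEN)]
    # Start position with square = row * 8 + column; color 0 moves first.
    before = {input_index(0, 35), input_index(0, 28),
              input_index(1, 27), input_index(1, 36)}
    acc = full_refresh(weights, bias, before)
    # Color 0 plays square 19 and flips the disc on square 27.
    added, removed = move_deltas(0, 19, [27])
    incremental = incremental_update(acc, weights, added, removed)
    after = (before | set(added)) - set(removed)
    return incremental, full_refresh(weights, bias, after)
```

Each flipped disc costs one column subtraction and one column addition, so a move that flips many discs touches many columns. That is consistent with the remark that the incremental scheme pays off less in Othello than in chess.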

I didn't have any data that I could use for training, so I started by playing 1.8 million fast games (10 ms per move) with root shuffling, using a random network as the evaluation. On my 32-core AMD this took about 10 hours. In the next step I converted the games to positions, keeping track of the WDL counts in a binary tree indexed by the position's hash. I used the winning percentage (2W + D) / (2(W + D + L)) to label the positions used for training. Somehow, using the win percentage as the label gives me better results than plain logistic regression with 0 and 1.
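The labeling step could be sketched roughly as follows. This is an illustration only: a Python dict stands in for the binary tree mentioned in the post, the result is assumed to be given from one fixed reference side, and the function names are hypothetical.

```python
from collections import defaultdict

def new_table():
    """Map position hash -> [wins, draws, losses]; a dict replaces the binary tree."""
    return defaultdict(lambda: [0, 0, 0])

def record_game(table, position_hashes, result):
    """Add one game's result ('W', 'D' or 'L') to every position it visited."""
    slot = {"W": 0, "D": 1, "L": 2}[result]
    for h in position_hashes:
        table[h][slot] += 1

def win_percentage(w, d, l):
    """Training label (2W + D) / (2(W + D + L)), a value between 0 and 1."""
    return (2 * w + d) / (2 * (w + d + l))

table = new_table()
record_game(table, [11, 22], "W")   # hypothetical hashes of two positions
record_game(table, [11, 33], "D")
labels = {h: win_percentage(*counts) for h, counts in table.items()}
```

A position seen in one win and one draw gets the label 0.75, whereas a plain 0/1 target would present the same position with conflicting labels.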

After repeating this process several times (playing games, updating the binary tree, training the network) the program got better and better. Later I started using longer thinking times, which made the whole process rather time-consuming. One restriction is the maximum size of the binary tree, which has to be in memory during the update process; otherwise it would be way too slow. In practice I never encountered the situation where the binary tree ran out of memory. The only explanation I have is that the number of relevant positions generated during a game is not as huge as one would expect.

How strong is your NN-based Othello engine? Did you test it against other Othello programs?
There are a few papers about NN Othello engines. They usually test them against my program Edax, but at a ridiculously low level. Forest by Olivier Casile is a pretty strong program using a NN for its evaluation, but not as strong as Edax. Edax uses a pattern-based evaluation function, i.e. a single neuron but with many thousands of inputs, and its latest evaluation data is now 20 years old... Although this evaluation function is already pretty accurate, Edax mostly gets its strength from its speed.
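To illustrate what "a single neuron with many thousands of inputs" can mean for a pattern-based evaluation, here is a generic sketch. It is not Edax's code; the patterns, the base-3 square encoding and the weight tables are assumptions. Each pattern configuration acts as a one-hot input, so the evaluation is just a sum of looked-up weights.

```python
def pattern_index(board, squares):
    """Base-3 index of one pattern; board[s] is 0 (empty), 1 (own) or 2 (opponent)."""
    index = 0
    for s in squares:
        index = index * 3 + board[s]
    return index

def evaluate(board, patterns, weights):
    """Sum one weight per pattern, selected by that pattern's configuration."""
    return sum(weights[p][pattern_index(board, squares)]
               for p, squares in enumerate(patterns))

# Two hypothetical 3-square patterns with 3**3 = 27 configurations each.
patterns = [[0, 1, 2], [0, 8, 16]]
weights = [[0] * 27, [0] * 27]
weights[0][15] = 5    # configuration (own, opponent, empty) of pattern 0
weights[1][9] = -2    # configuration (own, empty, empty) of pattern 1

board = [0] * 64
board[0], board[1] = 1, 2
score = evaluate(board, patterns, weights)
```

Because evaluation reduces to a handful of table lookups and additions, such a function is very cheap to compute, which fits the point about speed.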

Joost Buijs · Post by **Joost Buijs** » Thu Sep 01, 2022 7:05 am

abulmo2 wrote: ↑Thu Sep 01, 2022 12:47 am How strong is your NN-based Othello engine? Did you test it against other Othello programs?
There are a few papers about NN Othello engines. They usually test them against my program Edax, but at a ridiculously low level. Forest by Olivier Casile is a pretty strong program using a NN for its evaluation, but not as strong as Edax. Edax uses a pattern-based evaluation function, i.e. a single neuron but with many thousands of inputs, and its latest evaluation data is now 20 years old... Although this evaluation function is already pretty accurate, Edax mostly gets its strength from its speed.

I only tested it against my old Othello program from the nineties, a single-threaded program that uses 13-bit patterns for evaluation. At that time the program was just below the top: for instance, it finished second at the IOS Open II (1996), and it even beat Logistello once. The NNUE version is clearly stronger than this.

I just wanted to see if it is possible to get something meaningful out of an Othello network. It seems to work, but I have no idea how strong it actually is. The current version can only play against itself; I'm thinking about making it multi-threaded and adding a small GUI to make it somewhat easier to test against other programs.
