An AlphaZero inspired project

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

brianr
Posts: 536
Joined: Thu Mar 09, 2006 3:01 pm

Re: An AlphaZero inspired project

Post by brianr »

I did a "hello world" chess MCTS (about 70 loc) using this example:
http://mcts.ai/code/python.html
and this nice framework:
https://github.com/niklasf/python-chess

For chess, I found that the "iteration" parameter (the number of MCTS playouts per search) is very important. With iterations ranging from about 100 to 1,000, mates only one or two moves away could be found. Then I tried KQvK, and an iteration count of 10,000 finds it. I did not experiment much to find an approximate minimum, but it does seem to be a key parameter.

This is only MCTS without any NN, of course.
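For anyone curious, the core loop looks roughly like this (a minimal sketch of the same UCT idea, not the exact code from the link; `itermax` is the "iteration" parameter discussed above):

```python
import math
import random

import chess


class Node:
    """One UCT node; wins are counted from the perspective of the player
    who just moved (i.e. the parent's side to move)."""

    def __init__(self, board, move=None, parent=None):
        self.move = move                      # move that led to this node
        self.parent = parent
        self.children = []
        self.wins = 0.0
        self.visits = 0
        self.untried = list(board.legal_moves)
        self.turn = board.turn                # side to move at this node

    def select_child(self, c=1.4):
        # UCB1: average win rate plus an exploration bonus
        return max(self.children, key=lambda n: n.wins / n.visits
                   + c * math.sqrt(math.log(self.visits) / n.visits))


def rollout(board, perspective):
    """Random playout to the end; 1.0 = win for `perspective`, 0.5 = draw."""
    while not board.is_game_over():
        board.push(random.choice(list(board.legal_moves)))
    result = board.result()                   # "1-0", "0-1" or "1/2-1/2"
    if result == "1/2-1/2":
        return 0.5
    winner = chess.WHITE if result == "1-0" else chess.BLACK
    return 1.0 if winner == perspective else 0.0


def uct(root_board, itermax):
    root = Node(root_board)
    for _ in range(itermax):
        node, board = root, root_board.copy()
        # 1. Selection: descend through fully expanded nodes
        while not node.untried and node.children:
            node = node.select_child()
            board.push(node.move)
        # 2. Expansion: add one untried move as a new child
        if node.untried:
            move = node.untried.pop(random.randrange(len(node.untried)))
            board.push(move)
            child = Node(board, move=move, parent=node)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout, scored for the player who just moved
        value = rollout(board, perspective=not node.turn)
        # 4. Backpropagation: flip the value at each level going up
        while node is not None:
            node.visits += 1
            node.wins += value
            value = 1.0 - value
            node = node.parent
    return max(root.children, key=lambda n: n.visits).move
```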

Thank you for sharing ZeroFish.
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: An AlphaZero inspired project

Post by Evert »

I think a more interesting test would be KBNK, because that requires some fairly specific knowledge (it would be interesting to know whether A0 can handle it, though I suspect not).
trulses
Posts: 39
Joined: Wed Dec 06, 2017 5:34 pm

Re: An AlphaZero inspired project

Post by trulses »

brianr wrote:I did a "hello world" chess MCTS (about 70 loc) using this example:
http://mcts.ai/code/python.html
and this nice framework:
https://github.com/niklasf/python-chess

For chess, I found that the "iteration" parameter (the number of MCTS playouts per search) is very important. With iterations ranging from about 100 to 1,000, mates only one or two moves away could be found. Then I tried KQvK, and an iteration count of 10,000 finds it. I did not experiment much to find an approximate minimum, but it does seem to be a key parameter.

This is only MCTS without any NN, of course.

Thank you for sharing ZeroFish.
Thanks for sharing. Indeed, the iteration parameter helps; it's difficult to balance the nodes per search against having enough games played per training epoch. Have you tried varying the exploration constant in the node selection? In KQvK, how far away was the mate spotted by the 10k-simulation search?
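(By the exploration constant I mean the c in UCB1 node selection, or c_puct if you're doing AlphaZero-style PUCT with a policy prior. A minimal sketch of the latter; the field names here are just illustrative, not from any particular codebase:)

```python
import math


def puct_score(parent_visits, child, c_puct=1.5):
    """AlphaZero-style selection score Q + U; raising c_puct favours
    exploration, lowering it favours the current value estimate."""
    q = child.value_sum / child.visits if child.visits else 0.0
    u = c_puct * child.prior * math.sqrt(parent_visits) / (1 + child.visits)
    return q + u
```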
Evert wrote:I think a more interesting test would be KBNK, because that requires some fairly specific knowledge (it would be interesting to know whether A0 can handle it, though I suspect not).
This is on my radar. The KBNvK scenario is quite sparse (random piece placements give a mate in roughly 46-48 plies on average) and, unlike KQvK, the winning signal is quite easy to destroy (dropping a piece or hitting the fifty-move rule turns the win into a draw), so it's probably going to be a while before it finds any mates. I might have to generate a reverse curriculum for this scenario to speed it up, but I'm not sure that will yield a network that generalizes well.
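Roughly what I have in mind, as a sketch: generate random KBNvK positions, probe a local Syzygy tablebase via python-chess, and keep only starting positions within a chosen distance of the win, growing that distance over time (the tablebase path and the helper names are illustrative):

```python
import random

import chess
import chess.syzygy


def random_position(material="KBNk"):
    """Random legal position from a material string (uppercase = White)."""
    pieces = [chess.Piece.from_symbol(s) for s in material]
    while True:
        board = chess.Board(None)              # start from an empty board
        for sq, p in zip(random.sample(chess.SQUARES, len(pieces)), pieces):
            board.set_piece_at(sq, p)
        board.turn = chess.WHITE
        if board.is_valid():
            return board


def curriculum_batch(tb_path, max_dtz, n):
    """Winning starts at most `max_dtz` plies from mate or conversion;
    increasing `max_dtz` over time walks the curriculum backwards."""
    batch = []
    with chess.syzygy.open_tablebase(tb_path) as tb:
        while len(batch) < n:
            board = random_position()
            if 0 < tb.probe_dtz(board) <= max_dtz:  # > 0: side to move wins
                batch.append(board)
    return batch
```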

I'm still working on faster KQvK convergence; I've gotten it down to about two hours with a new setup, but the quality of play seems a bit odd. I want this to be quite fast before I start experimenting with more difficult scenarios.
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: An AlphaZero inspired project

Post by AlvaroBegue »

I don't see the point of training networks on endgames that are already covered by EGTBs. Doesn't it make more sense to start with 6-men EGTBs and get the neural network to learn everything else?
trulses
Posts: 39
Joined: Wed Dec 06, 2017 5:34 pm

Re: An AlphaZero inspired project

Post by trulses »

AlvaroBegue wrote:I don't see the point of training networks on endgames that are already covered by EGTBs. Doesn't it make more sense to start with 6-men EGTBs and get the neural network to learn everything else?
To answer your question directly: yes, it would. Just so that's clear, I have no intention of using this technique to create tablebases or anything like that; the networks would just be much weaker and much less efficient.

However, to get the technique to work on the full game of chess while running on "regular hardware", I might have to resort to all sorts of tricks to speed up convergence. These endgames are just a test-bed for trying out different sets of hyper-parameters and ideas for that purpose. Different types of endgames cover a nice range of complexity, and they all come with a perfect expert (the EGTBs); it's like the perfect lab for chess mini-games.

So under these conditions, if I can't get it to work on endgames, then I certainly won't be able to get it to work on the full game of chess. I only have one machine with a pretty old 4-core CPU and a fairly old GPU, so efficiency is extremely important; that's the goal here.
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: An AlphaZero inspired project

Post by Evert »

AlvaroBegue wrote:I don't see the point of training networks on endgames that are already covered by EGTBs. Doesn't it make more sense to start with 6-men EGTBs and get the neural network to learn everything else?
Depends on your goal. In this case, very clearly not the ability to play the endgame, but validation of the method in a situation that is, or could be, difficult for it.
brianr
Posts: 536
Joined: Thu Mar 09, 2006 3:01 pm

Re: An AlphaZero inspired project

Post by brianr »

trulses wrote:
Thanks for sharing. Indeed, the iteration parameter helps; it's difficult to balance the nodes per search against having enough games played per training epoch. Have you tried varying the exploration constant in the node selection? In KQvK, how far away was the mate spotted by the 10k-simulation search?
8/8/8/4k3/8/8/8/4KQ2 w - - 0 1
Shortest mate in 8 moves
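The position is easy to check with python-chess if you have Gaviota tables locally (the path here is illustrative):

```python
import chess
import chess.gaviota

board = chess.Board("8/8/8/4k3/8/8/8/4KQ2 w - - 0 1")
with chess.gaviota.open_tablebase("data/gaviota") as tb:
    # DTM is given in plies, positive when the side to move wins;
    # a mate in 8 moves for White corresponds to 15 plies here.
    print(tb.probe_dtm(board))
```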
CheckersGuy
Posts: 273
Joined: Wed Aug 24, 2016 9:49 pm

Re: An AlphaZero inspired project

Post by CheckersGuy »

Evert wrote:
AlvaroBegue wrote:I don't see the point of training networks on endgames that are already covered by EGTBs. Doesn't it make more sense to start with 6-men EGTBs and get the neural network to learn everything else?
Depends on your goal. In this case, very clearly not the ability to play the endgame, but validation of the method in a situation that is, or could be, difficult for it.
If the method has a very high success rate, it does come with a benefit: the NN uses much less memory compared to, let's say, a 7-piece endgame database. For that reason alone I would say it's pretty interesting, as it's basically a form of "compression". :lol:
hgm
Posts: 27809
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: An AlphaZero inspired project

Post by hgm »

Evert wrote:I think a more interesting test would be KBNK, because that requires some fairly specific knowledge (it would be interesting to know whether A0 can handle it, though I suspect not).
Training just on KBNK seems a bad idea; it will be very hard to get any mate by accident, and accidental mates are what start off the learning. To have a good chance of a mate, it would already need to know that it has to drive the king to a corner. And it is much easier to learn that with a more powerful piece, where you have a reasonable chance of accidental mates.

This shows that training for a very narrow task is a bad idea. If you train for KQK, KRK, KBBK and KBNK mates all at the same time, it will learn KBNK much faster, as KQK and KRK will teach it to involve the king and to drive the bare king to the edge or a corner. With that knowledge KBBK should already be easy, and KBNK only requires learning about the right corner (the one the bishop can cover).
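In practice the mixed pool could be as simple as drawing each self-play start from a list of material mixes. A sketch (material strings, uppercase = White):

```python
import random

import chess

SCENARIOS = ["KQk", "KRk", "KBBk", "KBNk"]   # train on all four at once


def sample_start_position():
    """Uniform draw over the scenarios, so the easy mates bootstrap KBNK."""
    pieces = [chess.Piece.from_symbol(s) for s in random.choice(SCENARIOS)]
    while True:
        board = chess.Board(None)             # empty board, place randomly
        for sq, p in zip(random.sample(chess.SQUARES, len(pieces)), pieces):
            board.set_piece_at(sq, p)
        board.turn = chess.WHITE
        if board.is_valid():                  # retry illegal placements
            return board
```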
Pio
Posts: 334
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Re: An AlphaZero inspired project

Post by Pio »

Hi!

I agree with H.G.M.

One thing that should also improve the learning is to score the solutions with the smallest proof trees higher.

If you also make the network understand symmetries, that will speed things up as well. For example, when there are no pawns you can fix the side-to-move king to one of the eight triangles of the board (including the diagonal border squares); one way to pick such a canonical representative is sketched below.
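A simple way to do this with python-chess (not literally the triangle construction, but the same effect: one representative per symmetry class):

```python
import chess

# The eight symmetries of a pawnless position without castling rights:
# identity, the four flips, and the three rotations (compositions of flips).
TRANSFORMS = [
    lambda bb: bb,
    chess.flip_vertical,
    chess.flip_horizontal,
    chess.flip_diagonal,
    chess.flip_anti_diagonal,
    lambda bb: chess.flip_horizontal(chess.flip_vertical(bb)),
    lambda bb: chess.flip_vertical(chess.flip_diagonal(bb)),
    lambda bb: chess.flip_vertical(chess.flip_anti_diagonal(bb)),
]


def canonical(board):
    """Map a position to a fixed representative of its symmetry class."""
    assert not board.pawns and not board.castling_rights
    return min((board.transform(t) for t in TRANSFORMS),
               key=lambda b: b.fen())
```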

BR
Pio