A simple book learning method

Cardoso · Post by **Cardoso** » Thu Jun 12, 2008 10:57 pm

Hi,
recently my engine lost a game by making a bad move because was too deep to find the problem. In fact even after moving forward several moves ahead it still couldn't see it would loose the game.

So I thought a bit on the situation of creating a book with a learning scheme and came up with this:

Having a collection of games my engine played and analyse them one at a time.
But instead of starting from the 1st move we would start from the last move played in the game by the engine and do a search with an extended time/depth.
And we would work backwards by 2 plies at a time (since it's the engine turn we are interested in) until we reach the first move played in the game by the engine.
During the game analisis (the learning search) I would keep a separate hash table in wich I would store permanently the position resulting of making the best move found in every search the engine does.
And as I search I would probe this seperate hash table before the normal hash table.
Since this hash table has entries resulting of very deep searches it gives better values for propagating to the root.
Since we are working backwards to the begining of the game we are effectivelly generating more precise information of the later stages of the game and we can use that data to make a more precise search at the first stages of the game. We are propagating more precise information to the early stages of the current game.
Now at the begining of the game and while we make a learning search we can avoid bad lines more effectivelly because that seperate hash table has valuable and very deep information we can use.

Maybe an even more simple a aproach would be simply get rid of that separate hash table and use just the normal hash table.
Since we are working backwards we are probing the hash that has stuf from a search made two plies deeper and we are creating new entries with more precise information.
As we move two plies towards the begining of the game we are forming a better judgement of this game in the first stages of the game and consequentely making better moves (positions) we would store in a book file.

What do you people thing on this?
Do you thing using the hash table in this manner it would propagate more precise information towards the root?

best regards,
Alvaro Cardoso

Dann Corbit · Post by **Dann Corbit** » Fri Jun 13, 2008 4:24 am

Cardoso wrote:Hi,
recently my engine lost a game by making a bad move because was too deep to find the problem. In fact even after moving forward several moves ahead it still couldn't see it would loose the game.

So I thought a bit on the situation of creating a book with a learning scheme and came up with this:

Having a collection of games my engine played and analyse them one at a time.
But instead of starting from the 1st move we would start from the last move played in the game by the engine and do a search with an extended time/depth.
And we would work backwards by 2 plies at a time (since it's the engine turn we are interested in) until we reach the first move played in the game by the engine.
During the game analisis (the learning search) I would keep a separate hash table in wich I would store permanently the position resulting of making the best move found in every search the engine does.
And as I search I would probe this seperate hash table before the normal hash table.
Since this hash table has entries resulting of very deep searches it gives better values for propagating to the root.
Since we are working backwards to the begining of the game we are effectivelly generating more precise information of the later stages of the game and we can use that data to make a more precise search at the first stages of the game. We are propagating more precise information to the early stages of the current game.
Now at the begining of the game and while we make a learning search we can avoid bad lines more effectivelly because that seperate hash table has valuable and very deep information we can use.

Maybe an even more simple a aproach would be simply get rid of that separate hash table and use just the normal hash table.
Since we are working backwards we are probing the hash that has stuf from a search made two plies deeper and we are creating new entries with more precise information.
As we move two plies towards the begining of the game we are forming a better judgement of this game in the first stages of the game and consequentely making better moves (positions) we would store in a book file.

What do you people thing on this?
Do you thing using the hash table in this manner it would propagate more precise information towards the root?

best regards,
Alvaro Cardoso

Rybka 3 will feature a permanent hash table.
It makes sense to me to store pv nodes on disk.
I don't think I would want to store the whole hash table.

Another interesting idea would be to keep separate hash tables segregated by material signature.

Cardoso · Post by **Cardoso** » Fri Jun 13, 2008 8:53 pm

Rybka 3 will feature a permanent hash table.
It makes sense to me to store pv nodes on disk.
I don't think I would want to store the whole hash table.

Another interesting idea would be to keep separate hash tables segregated by material signature.

Thanks Dan for your answer.
But could you please elaborate a little more on the details and purposes of those separate hash tables segregated my material signature?
And about Rybka's permanent hash table do you know the details of the implementation?

thanks again,
Alvaro

Dann Corbit · Post by **Dann Corbit** » Fri Jun 13, 2008 10:09 pm

Cardoso wrote:
Rybka 3 will feature a permanent hash table.
It makes sense to me to store pv nodes on disk.
I don't think I would want to store the whole hash table.

Another interesting idea would be to keep separate hash tables segregated by material signature.
Thanks Dan for your answer.
But could you please elaborate a little more on the details and purposes of those separate hash tables segregated my material signature?

The fundamental idea is to prevent constant overwrites and to create something worth saving. If I set up a different hash table for each material signature, the probability of overwriting a given entry is much lower. It can cause problems if you have a long sequence of exchanges since you will have to load a great number of them into memory. Hence I would store them as FastDB databases (it is not as bad as it sounds because FastDB is a memory mapping database.) The idea only works well on machines with large memory (many GB, obviously you need a 64 bit CPU and Operating System).

Initially, it will underperform badly, but over time, it will build up some value.

It's just another fun thing to try.

The idea of storing all pv nodes is a good one -- keep all the relevant hash table details (score, projected move, depth, etc.) along with the position. If you are going to bother storing them to disk, then it is also a good idea to keep some simple win/lose/draw stats with it also. After all, if something smells like a rose but you keep losing with it, it is a good idea to know it.

I think that Sloppy has a good start towards analytical storage in the opening book and it is something that the Sloppy author also is planning on improving.

And about Rybka's permanent hash table do you know the details of the implementation?

http://rybkaforum.net/cgi-bin/rybkaforu ... esc&body=1

thanks again,
Alvaro

Harald Johnsen · Post by **Harald Johnsen** » Sat Jun 14, 2008 10:09 am

The mechanics of a persistent hash are actually quite simple. When a position is searched, it is stored in the persistent hash with the depth at which it was searched. Later, the result is used when it has a high-enough depth.

Vas

Note that some engines where allready doing that in the past.
iirc glaurung 1.2 has that feature, except that he checks the duration of the thinking rather than the depth (for retrieving and storing the score/move).

HJ.

Cardoso · Post by **Cardoso** » Sat Jun 14, 2008 9:03 pm

When you say to set up a different hash table for each material signature how would you compute the material signature?
I suppose I would have to count the number of pieces of each type and color and not factor in the squares, but then how would I compute the signature?

thanks again,
Alvaro

ilari · Post by **ilari** » Sun Jun 15, 2008 1:08 pm

Dann Corbit wrote:I think that Sloppy has a good start towards analytical storage in the opening book and it is something that the Sloppy author also is planning on improving.

Thanks. The big issue I have is combining the opening book and learning results seamlessly. Currently it is seamless because the learning is purely based on the game result, not search results. But that's not optimal, and if I add UCI support, Sloppy won't often even know how the game ended.

Michael Sherwin · Post by **Michael Sherwin** » Mon Jun 16, 2008 1:47 am

ilari wrote:
Dann Corbit wrote:I think that Sloppy has a good start towards analytical storage in the opening book and it is something that the Sloppy author also is planning on improving.
Thanks. The big issue I have is combining the opening book and learning results seamlessly. Currently it is seamless because the learning is purely based on the game result, not search results. But that's not optimal, and if I add UCI support, Sloppy won't often even know how the game ended.

Time for an upgraded UCI protocol. The current one is just too short sighted.

Harald Johnsen · Post by **Harald Johnsen** » Mon Jun 16, 2008 9:26 am

You can ask yourself if the result of the game is really needed because you surely know the future outcome long before the end of the game.

Anyway i don't think that using the score of the game is of any help for a learning file. What is intersting is the pv score of the whole game.

You can surely find a formulae to apply a pv score to your learning score, with a weight that is a function of the move count so that moves far away from the root have a very little weight on the learning score (so this is exactly the opposite of what most ppl are doing today).

The point is to give more weight to positions that will mostly occur and not give weight to some obscur endgame tactics/blunder that you won't see in more than onr game.

And yes i know that there are gambits and other things at the begining of the game that give instable scores, but this is not a problem with a correct weight function.

HJ.

michiguel · Post by **michiguel** » Mon Jun 16, 2008 9:12 pm

Michael Sherwin wrote:
ilari wrote:
Dann Corbit wrote:I think that Sloppy has a good start towards analytical storage in the opening book and it is something that the Sloppy author also is planning on improving.
Thanks. The big issue I have is combining the opening book and learning results seamlessly. Currently it is seamless because the learning is purely based on the game result, not search results. But that's not optimal, and if I add UCI support, Sloppy won't often even know how the game ended.
Time for an upgraded UCI protocol. The current one is just too short sighted.

Yes, you can "upgrade" to winboard.

Miguel

A simple book learning method

A simple book learning method

Re: A simple book learning method

Re: A simple book learning method

Re: A simple book learning method

Re: A simple book learning method

Re: A simple book learning method

Re: A simple book learning method

Re: A simple book learning method

Re: A simple book learning method

Re: A simple book learning method