"Perhaps it is because I simply don't understand your answer. Just storing data in another form and reading it back for using it is not learning. So if that is what you ar doing it would be a "no", and not a "yes, in a way"."hgm wrote:Perhaps it is because I simply don't understand your answer. Just storing data in another form and reading it back for using it is not learning. So if that is what you ar doing it would be a "no", and not a "yes, in a way".Michael Sherwin wrote:1. Asked and answered several times. But okay once more. Part a) yes in a way because before the search all prior knowledge is loaded into the hash table then the search learns from the data and selects a move.
"No self learning was employed". So what learning was employed, then? How did you fill the learn file, who played the games from which the WDL statistics were taken? I don't understand the secod sentence at all, but I don't think it would answer ay question I have anyway.Part b) No self learning was employed against Rybka. However, that is immaterial because Romi's learning opposes Romi's natural evaluation function and causes it to return a different result if Romi is losing.
This sounds like it is just an opening book, and when you are out of book, you are out. You say it is no book, but everything that only works up to a point, and then not at all, is by definition an opening book. Whether you first store the book in the hash table doesn't make it less than a book. That is totally different from AlphaZero learning, which learned to play good moves in any position. If it would not have learned that, it would have reverted to a random mover very early in the game.2. WDL is learned best line only if stats are good and when that ends and there is absolutely no subtree to load into the hash table then Romi at least has played a line up to then that it has performed better at in the past so Romi is still better off at that point than without the learning.
I don't understand how you can get very deep in the game this way. Even if you play millios of games to learn. Perft 6 (3 moves into the game) is already 119M. OK, not every move is acceptable, but even with just 5 playable moves out of 25 you would only get to move 6 with 100 million games. You can record deeper lines in the book, of course, but there doesn't seem to be any chance you would ever play the same line as Rybka very long, if you were not close in strength to Rybka. (And even then...) Otherwise you would not need the help of the learn file, if you would play all the Rybka moves by yourself.
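The arithmetic in hgm's estimate can be checked with a short sketch. It assumes, as the post does, a uniform number of playable moves per position and one distinct line per game; the function names are illustrative only and the perft figure is the standard count for the initial position.

```python
PERFT_6 = 119_060_324  # leaf positions 6 plies (3 full moves) from the start position


def max_full_width_ply(games: int, branching: int) -> int:
    """Largest ply count d such that branching**d <= games,
    i.e. how deep a full-width tree can be covered with one game per line."""
    ply = 0
    while branching ** (ply + 1) <= games:
        ply += 1
    return ply


def ply_to_move_number(ply: int) -> int:
    """Full-move number in which the given 1-based ply is played."""
    return (ply + 1) // 2


if __name__ == "__main__":
    plies = max_full_width_ply(100_000_000, 5)
    print(plies, ply_to_move_number(plies))  # prints "11 6"
```

With 5 playable moves per position, 100 million games cover 11 plies, which is White's 6th move. This is a bound on exhaustive coverage only; it says nothing about how far a narrow, frequently repeated line can extend.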
In Romi's RL, for the search to learn from the stored data, the data is loaded into the hash table before the search. Then the search learns from that data to produce a different result. If that does not happen, then there is no learning.
""No self learning was employed". So what learning was employed, then? How did you fill the learn file, who played the games from which the WDL statistics were taken?"
Romi started with a completely empty learn file against Rybka and played from a themed starting position. So the match games themselves supplied the WDL statistics and the RL values. Remember, WDL is not used in RL; only the RL value is.
"You say it is no book, but everything that only works up to a point, and then not at all, is by definition an opening book."
I have said numerous times that WDL is not used in RL. There is no book used in RL. It just so happens that a tree structure can be used for a book and RL at the same time. You can't see past the book part.
Engine vs engine matches tend not to explore the full width of the tree. If they did, Romi would never have shown an improvement in Marc LaCrosse's testing or advanced at WBEC. If Romi can tell which of 1.d4 and 1.e4 leads to better results for Romi, then Romi has benefited. After a million games that differentiation extends much further into the tree than the first move and guides Romi to positions that it scores better in. And in positions where Romi does not do so well, the penalties will steer it away from those lines. In highly trafficked openings Romi can then play very well, even into the endgame.
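The mechanism described in this post (a tree of played lines carrying an RL value, bonuses and penalties after each game, and the relevant subtree loaded into the hash table before the search) could be sketched roughly as follows. This is not Romi's actual code. `LearnNode`, `update_line`, `load_into_hash`, the step size of 1, and keying the table by move path instead of a position hash are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LearnNode:
    rl_value: int = 0                             # accumulated bonus/penalty (hypothetical units)
    children: dict = field(default_factory=dict)  # move string -> LearnNode


def update_line(root, moves, result_for_white, step=1):
    """Walk the played line, rewarding the winner's moves and penalizing the loser's.
    result_for_white: +1 win, 0 draw, -1 loss. Values are from the mover's point of view."""
    node = root
    for ply, move in enumerate(moves):
        node = node.children.setdefault(move, LearnNode())
        white_moved = ply % 2 == 0
        node.rl_value += step * (result_for_white if white_moved else -result_for_white)


def load_into_hash(root, line, hash_table):
    """Follow the current game line, then copy the RL values of the subtree below it
    into hash_table (keyed by move path here; a real engine would use a position key).
    Returns the number of entries loaded, or 0 if the line is not in the tree."""
    node = root
    for move in line:
        node = node.children.get(move)
        if node is None:
            return 0
    count = 0
    stack = [(tuple(line), node)]
    while stack:
        path, current = stack.pop()
        for move, child in current.children.items():
            child_path = path + (move,)
            hash_table[child_path] = child.rl_value
            count += 1
            stack.append((child_path, child))
    return count


if __name__ == "__main__":
    root = LearnNode()
    update_line(root, ["e4", "e5", "Nf3"], -1)  # White lost this game
    update_line(root, ["d4", "d5", "c4"], +1)   # White won this game
    table = {}
    print(load_into_hash(root, [], table), table[("d4",)], table[("e4",)])
```

In this toy version, a search that consults the table at the root would see 1.d4 scored above 1.e4, which is the kind of differentiation between first moves described above. When the current line is no longer in the tree, nothing is loaded and the search runs on its normal evaluation alone.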