mvk wrote:Zenmastur wrote: After reading the other threads, I noticed several people have the false impression that they needed (or wanted) to pre-search the book positions before making them into a book. I'm not sure what they were thinking, but I don't think this will help, and it isn't needed.
I understand the desire to be confident that the book won't lead the engine astray. My solution to this problem is simple: unlike current books, never blindly follow book lines.
Until a move is fully vetted by the engine using the book, i.e. it's no longer in the early phases of active learning, search results are always checked before any decision about a move is made. This requires that engine search results be stored in each position record, and these should always be consulted first. This does take some time the first few times a position is seen by the engine, but after that it reverts to using game statistics. The point of doing it this way instead of pre-searching all moves to some depth is that the engine may never encounter some of these positions, in which case the time would have been wasted.
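The flow above might be sketched like this. This is a minimal sketch with assumed names (`BookNode`, `VET_THRESHOLD`, the search callback); none of these come from an actual implementation:

```python
# Sketch of "never blindly follow book lines": consult a stored search
# result until the node is vetted, then revert to game statistics.
# All names and the threshold value are illustrative assumptions.

VET_THRESHOLD = 3  # visits before the node counts as "fully vetted"

class BookNode:
    def __init__(self, position):
        self.position = position
        self.visits = 0
        self.stored_search = None   # (best_move, score, depth) from the engine

def choose_move(node, engine_search, stats_best_move):
    node.visits += 1
    if node.visits <= VET_THRESHOLD:
        # Early phase of active learning: always check (and cache) a search.
        if node.stored_search is None:
            node.stored_search = engine_search(node.position)
        return node.stored_search[0]
    # Vetted: ordinary game-statistics selection takes over.
    return stats_best_move
```

Note the search is only paid for when the position is actually reached, which is the whole point of deferring the work.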
Thanks for your explanation, it helps to understand your proposal. I hope you succeed with the project.
One of the main tenets I adopted very early when I was considering this design project was this: because there will always be more work to be done than there are resources or time with which to do it, I want to defer work as long as possible, or eliminate it completely in some cases. This is why pre-searching isn't used. Positions will get searched eventually, if and when they are seen, which eliminates any wasted effort. It has another effect: positions that are seen very commonly will be searched deeper than most. They can be searched deeper than even the longest time controls would normally allow, which is great for removing even very subtle tactical errors from your book. Unfortunately it doesn't do much else. Although somewhat deeper searching should be done on very common positions, the time used for very deep searches is probably better spent elsewhere.
Learning by most algorithms is slow. Sometimes it appears absolutely glacial in its pace. This is a problem, since the average branching factor of the search tree is 30+. I don't think the algorithms are necessarily to blame for slow learning. The major problem is that most of the useful information that could be used to learn is simply discarded because it occurs outside the domain of the learning algorithm. The implementations don't use all available information to adjust what the learning algorithm is processing. This causes a lot of time to be wasted learning information that is either worthless, already known, or previously learned and discarded. I want to eliminate this wasteful behavior.
A lot of what needs to be learned about a position is purely tactical in nature. Simply storing engine evaluations will remove about 90% of the need to learn, and it is at least three orders of magnitude faster than trying to learn the same information by playing games and collecting statistics about them. A good example is when one side is threatened with great loss without compensation and does nothing to rectify the situation. The algorithm shouldn't voluntarily test this type of position by playing games just to gain statistical knowledge about what will happen. There is already good statistical evidence that loss of material without compensation is bad, so there is no reason to re-demonstrate this on every position in the opening book by playing games from bad positions to collect statistical data on the outcomes. And yet, this is exactly what will happen if the implementation blindly submits all positions to the same learning process.
I've tested several different MAB indexing algorithms, and they all do about the same thing in this regard: they will play all legal moves to see what happens in each case. The frequency of the bad moves drops quickly as more games are played, but by this time the damage is already done. The average game score has dropped, and a large amount of time was wasted on games that no computer would ever play if it had no book, or a more conventional one.
The net effect of all this wasted effort is that USEFUL learning is slowed because USELESS learning dominates the limited resources available. The question is: what's the best way to fix this problem?
One might think pre-filtering the moves that the statistical learning algorithm gets to see is a good method. I'm not so sure this is best. The UCB MAB indexing algorithm can be absolutely fantastic at providing variety of play, and it does a very good job of balancing exploration with exploitation in most cases.
The underlying problem is that the MAB indexing algorithm knows nothing about chess positions, so it doesn't know that loss of material is bad. Further, it sees every position as a completely different problem about which it has no knowledge unless statistical information is available. The only statistics it understands are W-L-D game statistics, the total number of times the position has been encountered, and the number of times each individual move has been played from the position. So another way to solve the problem is to generate W-L-D statistics for each legal move on the fly, using the engine's search results and a logistic equation that matches the engine's performance curve over the full range of advantages and disadvantages. These "false" statistics would be added to any real statistics already gathered. The net effect of this ruse is to stop the playing of useless exploratory games caused by the algorithm's lack of chess knowledge. Since there are considerably more bad moves than good ones in most positions, this will cause a large shift in where the learning takes place, and the "apparent" rate of learning for the much smaller pool of candidate moves should increase proportionately. The average game score produced will increase accordingly.
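The score-to-statistics conversion could look something like the sketch below. The logistic scale `K` and the virtual-game count are placeholder values that would need tuning against the real engine's performance curve:

```python
# Sketch of the "false statistics" idea: map an engine score (centipawns)
# to an expected game score with a logistic curve, then seed the move with
# a small number of virtual games. K and VIRTUAL_GAMES are assumptions.

K = 400.0          # logistic scale: roughly, +400cp ~ 90% expected score
VIRTUAL_GAMES = 8  # kept small so real statistics take over quickly

def expected_score(cp):
    """Logistic mapping from centipawn advantage to expected score in [0,1]."""
    return 1.0 / (1.0 + 10.0 ** (-cp / K))

def pseudo_wdl(cp, n=VIRTUAL_GAMES):
    """Turn an eval into virtual (wins, draws, losses) totalling n games."""
    s = expected_score(cp)
    wins = round(n * s)
    losses = n - wins   # crude: this sketch generates no virtual draws
    return wins, 0, losses
```

These virtual counts would simply be added to whatever real W-L-D totals the node already holds before the MAB index is computed, so a move the engine scores as a clean piece loss starts life looking like a move that already lost most of its games.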
mvk wrote:Two things about this:
1. In my case the book is always a fully negamaxed graph with dropout searches in every node. But just because a position is present in the book/graph/file, it doesn't mean it is always associated with a deep dropout search. Only the repertoire nodes, which are the subset in which the program can end up using the move selection algorithm, need deep dropout searches, for the reason I gave. But non-repertoire moves in the file will usually be covered by significantly shallower searches. (Unless they have been repertoire in the past. This is a bit of a simplification and I won't bore anyone with the edge cases of this.)
I thought about negamaxing the engine's search results to obtain better scores for each node. I think Bob Hyatt tried this without any additional search information and he described the results as "iffy". It's considerably more work because you need search results from a lot of descendants, and these may not be available even if you're storing all your searches. A modified version of LMR could be used to lessen the burden somewhat, but even with this addition I wonder if it's worth the effort. Most book lines aren't that deep; the average is between 15 and 25 plies (depending on the size of the book, of course), and a single search can be deeper than this. The nodes that will benefit the most from this are the ones nearest the root, while the positions that need the most help are the ones near the leaf nodes. There's not much you can do for these other than search deeper, since they will have very few layers of descendants in the tree. In addition, these nodes are seen considerably less often on average, and there are a lot more of them, so searching them much deeper is going to require a lot of effort.
The other problem is that the evaluations returned from searches will only have a dominating influence on move selection for a short period of time if statistics are initially lacking. This influence wanes as more statistics become available; after that happens, they won't have much influence unless a radical change in the engine's evaluation occurs.
What REALLY needs to be negamaxed, IMO, is the game statistics. This solves a longstanding problem with using game statistics for move selection. When a new move is found deeper in the tree that refutes a line of play, the program won't be fooled by the older and better-established game statistics that show the line is sound. It will immediately head for the less well established but higher scoring new move that seems to refute the older lines of play. It will do this because the MAB index for the new move suddenly jumps: given the large imbalance in game counts between the established moves and a new move with few games behind it, the indexing algorithm gives the new move a much higher index because it hasn't been played enough considering its score. The change in behavior this causes is huge! It means that, unlike older books where you had to repeatedly beat the hell out of a well-established move before the statistics would change enough to make the program try something else, the learning book will gladly change its direction on a dime.
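The index jump can be illustrated with made-up numbers and the standard UCB1 formula (the exact indexing equation and exploration constant used here are assumptions for illustration, not necessarily the tuned one described above):

```python
import math

# Illustration of the MAB-index jump: a new, rarely played move with a
# high score gets a larger UCB1 index than a heavily played main line,
# so selection switches to it immediately. Numbers are made up.

def ucb1(avg_score, move_games, total_games, c=1.4):
    """Standard UCB1 index: exploitation term plus exploration bonus."""
    return avg_score + c * math.sqrt(math.log(total_games) / move_games)

total = 510  # games played from this position
established = ucb1(0.55, 500, total)  # old main line, heavily played
novelty     = ucb1(0.70, 10, total)   # new refutation, very few games
# novelty's index exceeds established's: the book changes direction
```

The exploration bonus shrinks as a move's game count grows, so the balance is restored only after the new line has been played enough, exactly the behavior described above.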
This could potentially cause a MAJOR paradigm shift in the way engines play the openings. It will make the book have a massive affinity for playing new theory if it's good for the engine. It won't necessarily avoid old lines that it now believes are worse, but it will play them considerably less often, at least until the MAB indexes are brought back into balance through sufficient play of the new lines. I think this type of behavior has been heretofore unseen in computer opening play. It means that computers would have a much greater influence on current opening theory. Currently they contribute little if anything to opening theory because of the screwed-up way the opening is handled.
mvk wrote:Also, in expanding the book, that is adding new nodes, the effort is focussed on nodes that likely affect the repertoire. ...
Precisely the reason the game statistics should be negamaxed. One note I should add, though: if you decide to try this, you don't actually want to move the game statistics from one node to another, and you definitely don't want to overwrite the "real" statistics for any position. Instead, you simply negamax a pointer to the node that contains the pertinent statistics.
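A hedged sketch of the pointer idea, with an assumed node layout (the field names, and the choice to chain the pointer all the way to the backing leaf of the best line, are my illustration, not a description of any existing book format):

```python
# "Negamax a pointer": propagate which descendant's real statistics back
# each position, without moving or overwriting any statistics in place.

class Node:
    def __init__(self, score, children=None):
        self.score = score            # score from the side-to-move's view
        self.children = children or []
        self.stats_ptr = self         # by default, a node backs itself

def negamax_stats(node):
    """Return (score, pointer) for node and record the pointer in place."""
    if not node.children:
        return node.score, node
    best_score, best_ptr = None, None
    for child in node.children:
        child_score, child_ptr = negamax_stats(child)
        score = -child_score          # negamax: flip sign for the parent
        if best_score is None or score > best_score:
            best_score, best_ptr = score, child_ptr
    node.stats_ptr = best_ptr         # node with the pertinent statistics
    return best_score, best_ptr
```

The key property is that every node's own W-L-D record is left untouched; only the pointer changes when a new refutation alters the backed-up best line.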
mvk wrote:...Given that, may I ask what in your proposal is the essential difference between "a book move that is still in the early phases of learning" (and therefore won't be blindly played if encountered during play), and "a candidate move that is not in the book file at all" (for example, the move that the program finds on its own, a novelty, so to speak)? Is there a difference in treatment in your scheme between such moves? From your description it isn't really clear. Or is it just a carrier of statistical information?
Well, I'm not sure exactly what you are asking since our terminology differs significantly, but I'll give it a shot. All moves that are not leaf nodes are candidate moves. If you just feed all legal moves to the UCB MAB indexing equation, it will spit out a set of indexes, and the one with the highest index is the one played. It will play every move with some non-zero probability, even moves that lead to it getting mated (assuming this isn't inhibited in some way). Moves that produce good average game scores get played considerably more often, of course, and moves that are winning are played with high probability. The equation can be tuned: it can be made VERY greedy, or it can be made to explore all moves with equally high probability, or anything in between these two extremes.
Since the leaf nodes don't have sufficient statistical information stored to be useful for decision making, the engine just does a normal search and makes a move, assuming a search result of sufficient depth isn't already available in that node. The primary difference is that while in-book, the book routine calls the search routine and then records the appropriate information in the leaf record before making the move. If this move pushes the leaf node over the threshold, then a new record is added to the main part of the book. Leaf nodes are stored separately, in a different file, and have a different record format than main book records. When a new record is added to the book, by this method or any other, all of its descendants get added to the leaf node file. These records are small and are stored as a group to save space.
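The two record formats might look something like the following; all field names, types, and the promotion threshold are illustrative assumptions rather than the actual file layout:

```python
# Sketch of the two-file scheme: compact leaf records kept in a separate
# file, fuller records in the main book once a leaf crosses the threshold.

from dataclasses import dataclass, field

@dataclass
class LeafRecord:            # small; stored as a group with its siblings
    move: str
    eval_cp: int = 0         # last stored search evaluation (centipawns)
    depth: int = 0           # depth of that search
    seen: int = 0            # times the engine reached this position

@dataclass
class BookRecord:            # full record in the main book file
    position_key: int
    wins: int = 0
    draws: int = 0
    losses: int = 0
    per_move_counts: dict = field(default_factory=dict)

PROMOTE_AFTER = 4  # assumed threshold for promoting a leaf into the book

def maybe_promote(leaf, position_key):
    """Promote a leaf to a main-book record once it is seen often enough."""
    if leaf.seen >= PROMOTE_AFTER:
        return BookRecord(position_key=position_key)
    return None
```

In this sketch, promotion would also trigger writing the new record's own descendants to the leaf-node file, mirroring the behavior described above.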
The only real difference between the early stages of learning and the later stages is which statistics dominate the move selection process. Third-party statistics, if they exist for a particular node, are discounted in such a way that they may dominate the decision making process early on, but will not dominate it for long. As more games are played, the balance shifts to favor the engine's statistics. If all statistics are sparse, then the engine's search evaluations will be used to generate false game statistics. (It actually does this regardless.) These are scaled so they are relatively small in magnitude and will only dominate the decision process for a short time while enough real statistics are generated.
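One possible shape for that discounting, as a sketch (the half-life constant and the specific decay curve are assumptions for illustration, not the scheme's actual tuning):

```python
# Sketch of discounting third-party statistics: they dominate early, but
# their weight decays as the engine's own games accumulate.

HALF_LIFE = 50.0  # engine games at which third-party weight drops to 1/2

def blended_score(third_party_score, third_party_games,
                  engine_score, engine_games):
    """Weighted average; third-party influence decays with engine games."""
    discount = HALF_LIFE / (HALF_LIFE + engine_games)
    w_tp = third_party_games * discount
    total = w_tp + engine_games
    if total == 0:
        return 0.5  # no information at all: assume an even position
    return (w_tp * third_party_score + engine_games * engine_score) / total
```

Early on the third-party score dominates; once the engine has played many more games than the half-life, the blend is essentially the engine's own score.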
I'm not sure if this answers your questions or not.
mvk wrote:Let's say that how I do that is that I keep unsearched moves stored in the PGN archives, separate from the graph/book, available when one of the PGN-based expansion algorithms sees fit for search and inclusion in the graph. From what I can see the difference in approach doesn't seem fundamental?
I'm not sure I completely understand, but assuming I do, I'm not sure that I agree.
Zenmastur wrote:This is one of the reasons I want the book to be integrated into the engine and always "on". Right now this isn't the case so when I'm done with the analysis I have to manually enter it into a book.
mvk wrote:This other main idea, of harvesting search results from other places than dedicated searches, is fine of course. I think that is an orthogonal concept. My system doesn't exclude that possibility for example, I just don't do it now. At any point I could harvest engine logs and decide if what I see there can be used as new book nodes. (In practice I would still want to perform dropout searches, because every inner node has at least two move scores, of course, as negamax doesn't work on a book graph otherwise. During game play you only search the best move. This is probably the reason I haven't gone in that direction.)
Concluding, as far as I can see, placing moves in the book file that you don't use during play seems merely a semantic difference, only observable in file size. Not a difference observable in learning capabilities of the system.
All moves that are in the main book file have a non-zero probability of being played unless special code is inserted dictating otherwise. All the move statistics matter, even those for moves that aren't played very often. If you change or remove any of these statistics, it will have an immediate effect on all future move selections until the MAB indexes are brought back into balance. I can give you a real example if you need further explanation of why and how this happens.
Zenmastur wrote:I would note that there is nothing in the dropout expansion approach that would preclude you from adopting a more traditional learning format in addition to what you have already done.
mvk wrote:I agree. I'm not tied to one expansion algorithm, not even to one move selection algorithm (although I will likely stick to the path error mechanism for the latter). I was thinking for example of doing some kind of "trap detection", and steer the affinity of repertoire moves towards the more difficult parts of the tree using that.
I would recommend a version of UCB MAB to you if you kept game statistics. But without them that's not really possible.
Zenmastur wrote:One thing I forgot to mention. 219 ELO over a no book configuration is nothing to scoff at. I suspect that an additional 100 to 200 ELO is possible with additions to the learning features already in your book.
mvk wrote:Thank you! The cynic in me just attributes that number to the quality of opening play of the underlying engine
I think the cynic in you is dead wrong. I think it's all the book!
Regards,
Zen
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.