Book learning

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Book learning

Post by bob »

Dann Corbit wrote:
Don wrote:
bob wrote:
Don wrote:
Dann Corbit wrote:
hgm wrote:Well, the example is very extreme, of course. But it could very well happen that a variation that was believed to be good for the opponent was played a lot, and that the common defense against it was move A, which thus got many plays, but a rather poor score. Then a novelty B was found in this position, which basically refuted the previous move. It was played a few times, much less than A, but with a very good score, which convinced people that it indeed was a refutation. So they stopped playing the previous move, and B never got the chance to get anywhere near the number of plays that A has. This makes scenarios where A has 40% out of 1000 plays while B has 60% out of 100 plays not unrealistic. Polyglot would continue to prefer the poor move by 400:60. And even worse, it would not discourage the preceding move from being played, as all the 600 points scored against it because it was met by the inferior reply A continue to count, while they are in fact no longer relevant, as good players will meet it with move B for sure.

Now I know of course that serious book builders would select the games from which they build the book by date, pay attention to the ratings, etc., exactly to cleanse the input data of such misleading cues. But this is a case that could very well be handled automatically, by proper application of minimax. If the 40% vs 60% scores for moves A and B would lead to favoring B over A in play by 90% vs 10%, the average score of this node should be set to 58% (90% x 60% + 10% x 40%), so 42% for the opponent's preceding move, based on 1/(0.81/100 + 0.01/1000) = 122 'virtual' games, rather than at 640 out of 1100 games (58%).
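hgm's propagation rule above can be checked numerically. The sketch below (illustrative only; the function and variable names are invented, and the effective-sample-size line just restates his 'virtual games' formula) reproduces the A/B example:

```python
def propagate(moves):
    """moves: list of (score, games, play_prob) per book move.
    Returns (node_score, virtual_games) for the node above them."""
    score = sum(p * s for s, g, p in moves)
    # Effective sample size: the mixture's variance scales like
    # sum(p^2 / games), so N_eff = 1 / sum(p^2 / games).
    virtual = 1.0 / sum(p * p / g for s, g, p in moves)
    return score, virtual

score, n = propagate([(0.40, 1000, 0.10),   # move A: 40% of 1000 games
                      (0.60, 100,  0.90)])  # move B: 60% of 100 games
print(round(score, 2), round(n))
```

The score comes out at 0.58 and the effective count at about 123 virtual games; the 122 in the text is the same quantity up to rounding.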
Why not mini-max the book with alpha-beta upon update? Doesn't that remove the problems of refutations being discovered converging slowly?
I think mini-max of the book is a very good way to customize a book for a given program. In general you want your program to come out of book with a position that it thinks is good. A significant percentage of Komodo's losses in tournament games happens when we come out of book with what Komodo considers a losing (or close to losing) position.

However I believe that for this to work, the program being tuned must search every book position to ensure there is a valid choice (at least from the program's point of view). For example:

In position P the book gives Ne5 and d4, but both moves are inferior to c3, which is not in the book. In this case, any mini-max that involves position P will be in error. The line might be fine once c3 is considered, yet the computer will avoid it when playing white, or, as black, will think it's a great position to force white into.

So my idea for mini-maxing the book is to evaluate every position and insert the computer's choice, if different, into the book as well. In the cases where the computer has its own TN, you might also consider verifying the new choice with additional thinking time, to avoid gratuitously adding moves to the book (which might also get recursively expanded upon).

I have never seen a book that didn't have some kind of omission such as this. Cilkchess lost a game to "King" one year because the book did not consider an obvious choice. In general I have not paid a lot of attention to the book in our programs, and it has cost us dearly. There are a few programs that do not test very well or rank very high on the rating lists but get much better results in tournaments because of exceptionally good book preparation. Once you get a terrible position it's difficult to recover, even against a program 200 or 300 Elo weaker.
There are ways to handle some of that, but I don't think minimaxing is the solution. We did that in Cray Blitz. And in early Crafty versions. The problem is you will _always_ accept gambits since that move wins material and the rest do not.

You can do this minimaxing at least two different ways. The simplest is to simply follow each book line to the end, perhaps do a search there, and then minimax the scores back to the root. Only catch is that you have to change things for black, since when you play black you now want the lowest score, not the highest, if you use the same book for both sides. The other is to search down the book lines, but do a real search at each node as well to see if there is a better move than the book alternative(s).
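The first variant bob describes (follow each book line to its end, search at the leaf, minimax back) can be sketched in a few lines. This is a toy illustration with an assumed book structure, not Cray Blitz or Crafty code:

```python
def book_minimax(node, search_leaf):
    # node is a dict: {'pos': position, 'moves': {move: child_node}}.
    # search_leaf(pos) returns a score from the side to move's view.
    # Returns the backed-up score of node, also side-to-move relative.
    if not node['moves']:                      # end of a book line
        return search_leaf(node['pos'])
    # Negamax form: a child's score is from the opponent's view, so
    # negate it. This sidesteps the black/white bookkeeping issue:
    # each side simply maximizes its own negated child scores.
    return max(-book_minimax(child, search_leaf)
               for child in node['moves'].values())

# Tiny example: two replies at the root, leaves scored by a fake search.
leaf_scores = {'p1': -0.3, 'p2': 0.1}          # side-to-move at the leaf
book = {'pos': 'root',
        'moves': {'e4': {'pos': 'p1', 'moves': {}},
                  'd4': {'pos': 'p2', 'moves': {}}}}
print(book_minimax(book, leaf_scores.get))     # prefers e4
```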

We finally ended up with a manual system in Cray Blitz. We did the search, but only printed things out when we "thought" we had a better move than the book move, or when the search said that a book move flagged as good really appeared to be bad. That was a pain...
Yes, there is no perfect solution.

I'm not so sure the gambit stuff is a problem - it was a few years ago, but programs have come a long way. Just for fun I tested the Smith-Morra Gambit with Komodo immediately after the pawn is accepted by black. I have to say that I agree completely with the assessment:

after 1.e4 c5 2. d4 cxd4 3. c3 dxc3

At 5 ply and beyond the score is within about 1/5 of a pawn of even. At earlier depths Komodo even thinks white has an advantage, but at 13 ply and beyond black is given a small advantage (which in my opinion is correct). Most gambits are at least slightly unsound against a strong player with a cool head.

At 19 ply Komodo thinks black has an 18/100 pawn advantage.

Houdini at similar depths has the score very close to zero. We could argue about which assessment is more accurate but the point is that neither program is very concerned about being down a pawn.

With mini-max, we are not really going to evaluate THAT exact position either; we are going to see a tree of possibilities and then analyze THEM, which is likely to get us much closer to the truth about whether or not to offer the pawn.

I think your real point is that programs do not always evaluate positions evenly; they all over-value some positions and under-value others. I agree with that.

However I believe the real point of minimax is to get positions that YOUR program likes. Their objective goodness is usually a matter of debate anyway, but my working theory right now is that if you can come out of book with positions that your program likes, it's hard to do better than that, because it means the pieces are more in harmony with your program's way of thinking.

Don't you just hate it when you come out of book and the program moves the pieces back to squares it prefers?
When the gambit positions are analyzed at 35 ply or so, the engines properly understand them. So they cease to be a problem.

15 years ago, Evans gambit positions would withstand 48 hours of analysis and the engines would still cough up stupidity. Today, the engines choose the right moves after an hour.
That's the problem. Except for correspondence chess, anyway. :)

And let a good GM play the Evans against a computer at blitz, and you see why the book is important... And not just the Evans either.
yoshiharu
Posts: 56
Joined: Sat Nov 11, 2006 11:14 pm

Re: Book learning

Post by yoshiharu »

Dann Corbit wrote:
If pv nodes in the regular hash table are overwritten, then I suggest a separate pv node hash table that is permanent.

There are only a few million important pv positions in total (for book usage anyway).
Hadn't Ed Schroeder tried something along these lines in Rebel's heyday?
I think it was called Rebel's EOC.
Probably today this could be pursued better due to better hardware, who knows...
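A separate, never-overwritten PV store like the one Dann suggests could be as simple as the following hypothetical sketch (class and field names are invented; a real engine would key on Zobrist hashes):

```python
class PVStore:
    """Permanent side table for PV positions: unlike the regular
    transposition table, entries are never evicted."""
    def __init__(self):
        self.table = {}                  # key -> (move, score, depth)

    def record(self, key, move, score, depth):
        # Keep only the deepest entry seen for each position.
        old = self.table.get(key)
        if old is None or depth > old[2]:
            self.table[key] = (move, score, depth)

store = PVStore()
store.record(0xABCD, 'e2e4', 0.25, 20)
store.record(0xABCD, 'd2d4', 0.10, 12)   # shallower entry is ignored
print(store.table[0xABCD][0])            # e2e4
```

At a few million entries of a few dozen bytes each, such a table fits comfortably in memory on modern hardware, which is the point about revisiting the idea today.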

Cheers, mr
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Book learning

Post by Don »

bob wrote:
Dann Corbit wrote:
Don wrote:
bob wrote:
Don wrote:
Dann Corbit wrote:
hgm wrote:Well, the example is very extreme, of course. But it could very well happen that a variation that was believed to be good for the opponent was played a lot, and that the common defense against it was move A, which thus got many plays, but a rather poor score. Then a novelty B was found in this position, which basically refuted the previous move. It was played a few times, much less than A, but with a very good score, which convinced people that it indeed was a refutation. So they stopped playing the previous move, and B never got the chance to get anywhere near the number of plays that A has. This makes scenarios where A has 40% out of 1000 plays while B has 60% out of 100 plays not unrealistic. Polyglot would continue to prefer the poor move by 400:60. And even worse, it would not discourage the preceding move from being played, as all the 600 points scored against it because it was met by the inferior reply A continue to count, while they are in fact no longer relevant, as good players will meet it with move B for sure.

Now I know of course that serious book builders would select the games from which they build the book by date, pay attention to the ratings, etc., exactly to cleanse the input data of such misleading cues. But this is a case that could very well be handled automatically, by proper application of minimax. If the 40% vs 60% scores for moves A and B would lead to favoring B over A in play by 90% vs 10%, the average score of this node should be set to 58% (90% x 60% + 10% x 40%), so 42% for the opponent's preceding move, based on 1/(0.81/100 + 0.01/1000) = 122 'virtual' games, rather than at 640 out of 1100 games (58%).
Why not mini-max the book with alpha-beta upon update? Doesn't that remove the problems of refutations being discovered converging slowly?
I think mini-max of the book is a very good way to customize a book for a given program. In general you want your program to come out of book with a position that it thinks is good. A significant percentage of Komodo's losses in tournament games happens when we come out of book with what Komodo considers a losing (or close to losing) position.

However I believe that for this to work, the program being tuned must search every book position to ensure there is a valid choice (at least from the program's point of view). For example:

In position P the book gives Ne5 and d4, but both moves are inferior to c3, which is not in the book. In this case, any mini-max that involves position P will be in error. The line might be fine once c3 is considered, yet the computer will avoid it when playing white, or, as black, will think it's a great position to force white into.

So my idea for mini-maxing the book is to evaluate every position and insert the computer's choice, if different, into the book as well. In the cases where the computer has its own TN, you might also consider verifying the new choice with additional thinking time, to avoid gratuitously adding moves to the book (which might also get recursively expanded upon).

I have never seen a book that didn't have some kind of omission such as this. Cilkchess lost a game to "King" one year because the book did not consider an obvious choice. In general I have not paid a lot of attention to the book in our programs, and it has cost us dearly. There are a few programs that do not test very well or rank very high on the rating lists but get much better results in tournaments because of exceptionally good book preparation. Once you get a terrible position it's difficult to recover, even against a program 200 or 300 Elo weaker.
There are ways to handle some of that, but I don't think minimaxing is the solution. We did that in Cray Blitz. And in early Crafty versions. The problem is you will _always_ accept gambits since that move wins material and the rest do not.

You can do this minimaxing at least two different ways. The simplest is to simply follow each book line to the end, perhaps do a search there, and then minimax the scores back to the root. Only catch is that you have to change things for black, since when you play black you now want the lowest score, not the highest, if you use the same book for both sides. The other is to search down the book lines, but do a real search at each node as well to see if there is a better move than the book alternative(s).

We finally ended up with a manual system in Cray Blitz. We did the search, but only printed things out when we "thought" we had a better move than the book move, or when the search said that a book move flagged as good really appeared to be bad. That was a pain...
Yes, there is no perfect solution.

I'm not so sure the gambit stuff is a problem - it was a few years ago, but programs have come a long way. Just for fun I tested the Smith-Morra Gambit with Komodo immediately after the pawn is accepted by black. I have to say that I agree completely with the assessment:

after 1.e4 c5 2. d4 cxd4 3. c3 dxc3

At 5 ply and beyond the score is within about 1/5 of a pawn of even. At earlier depths Komodo even thinks white has an advantage, but at 13 ply and beyond black is given a small advantage (which in my opinion is correct). Most gambits are at least slightly unsound against a strong player with a cool head.

At 19 ply Komodo thinks black has an 18/100 pawn advantage.

Houdini at similar depths has the score very close to zero. We could argue about which assessment is more accurate but the point is that neither program is very concerned about being down a pawn.

With mini-max, we are not really going to evaluate THAT exact position either; we are going to see a tree of possibilities and then analyze THEM, which is likely to get us much closer to the truth about whether or not to offer the pawn.

I think your real point is that programs do not always evaluate positions evenly; they all over-value some positions and under-value others. I agree with that.

However I believe the real point of minimax is to get positions that YOUR program likes. Their objective goodness is usually a matter of debate anyway, but my working theory right now is that if you can come out of book with positions that your program likes, it's hard to do better than that, because it means the pieces are more in harmony with your program's way of thinking.

Don't you just hate it when you come out of book and the program moves the pieces back to squares it prefers?
When the gambit positions are analyzed at 35 ply or so, the engines properly understand them. So they cease to be a problem.

15 years ago, Evans gambit positions would withstand 48 hours of analysis and the engines would still cough up stupidity. Today, the engines choose the right moves after an hour.
That's the problem. Except for correspondence chess, anyway. :)

And let a good GM play the Evans against a computer at blitz, and you see why the book is important... And not just the Evans either.
This whole thread is about improving the book, not playing without it, and there is no need to convince anyone here of the importance of the book; that is already understood.

My feeling on book learning via mini-max is that it's a good thing. But it is important that it be done correctly; otherwise there are serious problems. And it has to involve a hybrid approach - the computer doesn't just build its own book; it has to be supplemented with moves the computer would never consider playing on its own, such as 4. b4 in the Evans Gambit. In order to understand any position and have a reasonable score attached to it, there cannot be holes in the book. So whenever the computer is surprised with an unexpected move, it must be inserted into the book. Dann Corbit's suggestion of keeping PV nodes is, I believe, a good one, and it will do the job very nicely.

You will not be able to beat the program over and over again with the same gambit using this approach and I think it's the best you can do. Of course you can apply rote learning and just tell the computer what move to play, but I would rather the computer UNDERSTANDS what it is doing as opposed to just memorizing lines of play.

To be sure, this is not perfect. We are leveraging the fact that modern programs play like super-grandmasters, and even if they do not understand a position, they will usually play correctly. Komodo doesn't know the Evans Gambit, but if I walk it through, it almost always plays a popular line. That does not mean it understands the opening, but you must be capable of very strong play in order to expand opening theory. A few years ago this was not yet feasible for computers, but I think the time has come.

The biggest problem was laid out by the late Tony Scherzer many years ago in a paper he wrote on learning. It works great in many positions but poorly in others. The problem that can happen is that the computer does not understand position P and FIRST has to learn 10 ways to lose before it finally gets the point. During each pass through the learning process something is learned, but it just causes the program to play a different losing move! Eventually, however, enough backtracking happens that the position is understood and the right move is played.

But keep in mind that these experiments were done when computers were 1000 Elo weaker than today and memory was scarce. We now have the resources to do this so much better, and the computer will converge on the right understanding much more quickly these days. Scherzer showed that the approach worked and that the program would eventually converge on the correct move, but it was a question of how long it would take.
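The Scherzer-style result-driven learning described above, where each loss steers the book away from the refuted move, can be sketched roughly as follows. The data structures and names here are assumed for illustration, not taken from any engine:

```python
def update_book(book, line, result):
    """book: {pos: {move: score}}.  line: the [(pos, move), ...] that
    was played through the book.  result: final game score from the
    viewpoint of the side to move at the deepest book position
    (+1 win, 0 draw, -1 loss).  Backs the result up the line."""
    score = result
    for pos, move in reversed(line):
        book[pos][move] = score
        # The parent sees the best reply from the other side's view.
        score = -max(book[pos].values())

def pick_move(book, pos):
    # Next game, choose the best-scoring move still in the book.
    return max(book[pos], key=lambda m: book[pos][m])

book = {'p0': {'a': 0.0, 'b': 0.0}}
update_book(book, [('p0', 'a')], -1.0)   # lost after playing 'a'
print(pick_move(book, 'p0'))             # a sibling is tried next time
```

This also exhibits the failure mode in the text: if the sibling loses too, the program must lose once per alternative before the position's own backed-up score goes bad enough for the backtracking to move up a ply.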
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: Book learning

Post by Rein Halbersma »

In checkers/draughts, people very successfully use dropout-expansion, a book-building technique developed a decade ago by Thomas Lincke, see chapter 3 of:
http://e-collection.library.ethz.ch/ese ... 905-02.pdf

Martin Fierz has written an old post at CCC about his experiences with dropout-expansion in checkers and chess:
http://www.stmintz.com/ccc/index.php?id=201669

A decade ago, the main problem for chess seemed to be the proliferation of slightly weaker but irrelevant moves. It would be interesting to see what modern engines think of the examples he gave.
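The core of drop-out expansion can be caricatured in a few lines. This is a simplified sketch; Lincke's actual priority function in chapter 3 differs in detail, and the weight used here is invented. Each book leaf gets a priority equal to its depth plus the accumulated margin by which moves on its path fall short of the best sibling, and the lowest-priority leaf is expanded next:

```python
import heapq

def leaf_priority(path_margins, depth, margin_weight=10.0):
    # path_margins: for each move on the path from the root, how far
    # (in pawns) it scored below the best sibling; 0.0 for best moves.
    return depth + margin_weight * sum(path_margins)

pq = []  # min-heap of (priority, leaf)
heapq.heappush(pq, (leaf_priority([0.0, 0.2], 2), 'inferior_line'))
heapq.heappush(pq, (leaf_priority([0.0, 0.0], 3), 'main_line'))
print(heapq.heappop(pq)[1])   # main line expands first, despite being
                              # deeper, since its path has no deficit
```

The proliferation problem shows up when many moves sit within a small margin of the best: their priorities stay low, so the book widens instead of deepening.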
User avatar
marcelk
Posts: 348
Joined: Sat Feb 27, 2010 12:21 am

Re: Book learning

Post by marcelk »

Rein Halbersma wrote:In checkers/draughts, people very successfully use dropout-expansion, a book-building technique developed a decade ago by Thomas Lincke, see chapter 3 of:
http://e-collection.library.ethz.ch/ese ... 905-02.pdf

Martin Fierz has written an old post at CCC about his experiences with dropout-expansion in checkers and chess:
http://www.stmintz.com/ccc/index.php?id=201669

A decade ago, the main problem for chess seemed to be the proliferation of slightly weaker but irrelevant moves. It would be interesting to see what modern engines think of the examples he gave.
I don't see how drop-out expansion would have trouble recognizing Rxc3 without too much effort. For shallow searches it is the second-best move, slightly behind h5. For deep searches it is the best.
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: Book learning

Post by Rein Halbersma »

marcelk wrote:
Rein Halbersma wrote:In checkers/draughts, people very successfully use dropout-expansion, a book-building technique developed a decade ago by Thomas Lincke, see chapter 3 of:
http://e-collection.library.ethz.ch/ese ... 905-02.pdf

Martin Fierz has written an old post at CCC about his experiences with dropout-expansion in checkers and chess:
http://www.stmintz.com/ccc/index.php?id=201669

A decade ago, the main problem for chess seemed to be the proliferation of slightly weaker but irrelevant moves. It would be interesting to see what modern engines think of the examples he gave.
I don't see how drop-out expansion would have trouble recognizing Rxc3 without too much effort. For shallow searches it is the second-best move, slightly behind h5. For deep searches it is the best.
If that's the case for modern engines, then you can automate the build of a very large and very high-quality opening book. For example, if the book quality should reflect 1-minute searches on each position, then with an 8-core machine one could build a 4-million-node book within a year of computing.
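The estimate above checks out with straightforward arithmetic, assuming perfect parallelism and no duplicated work across cores:

```python
positions = 4_000_000
seconds = positions * 60 / 8      # 1-minute searches spread over 8 cores
days = seconds / 86_400
print(round(days))                # just under a year
```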
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Book learning

Post by bob »

Don wrote:
bob wrote:
Dann Corbit wrote:
Don wrote:
bob wrote:
Don wrote:
Dann Corbit wrote:
hgm wrote:Well, the example is very extreme, of course. But it could very well happen that a variation that was believed to be good for the opponent was played a lot, and that the common defense against it was move A, which thus got many plays, but a rather poor score. Then a novelty B was found in this position, which basically refuted the previous move. It was played a few times, much less than A, but with a very good score, which convinced people that it indeed was a refutation. So they stopped playing the previous move, and B never got the chance to get anywhere near the number of plays that A has. This makes scenarios where A has 40% out of 1000 plays while B has 60% out of 100 plays not unrealistic. Polyglot would continue to prefer the poor move by 400:60. And even worse, it would not discourage the preceding move from being played, as all the 600 points scored against it because it was met by the inferior reply A continue to count, while they are in fact no longer relevant, as good players will meet it with move B for sure.

Now I know of course that serious book builders would select the games from which they build the book by date, pay attention to the ratings, etc., exactly to cleanse the input data of such misleading cues. But this is a case that could very well be handled automatically, by proper application of minimax. If the 40% vs 60% scores for moves A and B would lead to favoring B over A in play by 90% vs 10%, the average score of this node should be set to 58% (90% x 60% + 10% x 40%), so 42% for the opponent's preceding move, based on 1/(0.81/100 + 0.01/1000) = 122 'virtual' games, rather than at 640 out of 1100 games (58%).
Why not mini-max the book with alpha-beta upon update? Doesn't that remove the problems of refutations being discovered converging slowly?
I think mini-max of the book is a very good way to customize a book for a given program. In general you want your program to come out of book with a position that it thinks is good. A significant percentage of Komodo's losses in tournament games happens when we come out of book with what Komodo considers a losing (or close to losing) position.

However I believe that for this to work, the program being tuned must search every book position to ensure there is a valid choice (at least from the program's point of view). For example:

In position P the book gives Ne5 and d4, but both moves are inferior to c3, which is not in the book. In this case, any mini-max that involves position P will be in error. The line might be fine once c3 is considered, yet the computer will avoid it when playing white, or, as black, will think it's a great position to force white into.

So my idea for mini-maxing the book is to evaluate every position and insert the computer's choice, if different, into the book as well. In the cases where the computer has its own TN, you might also consider verifying the new choice with additional thinking time, to avoid gratuitously adding moves to the book (which might also get recursively expanded upon).

I have never seen a book that didn't have some kind of omission such as this. Cilkchess lost a game to "King" one year because the book did not consider an obvious choice. In general I have not paid a lot of attention to the book in our programs, and it has cost us dearly. There are a few programs that do not test very well or rank very high on the rating lists but get much better results in tournaments because of exceptionally good book preparation. Once you get a terrible position it's difficult to recover, even against a program 200 or 300 Elo weaker.
There are ways to handle some of that, but I don't think minimaxing is the solution. We did that in Cray Blitz. And in early Crafty versions. The problem is you will _always_ accept gambits since that move wins material and the rest do not.

You can do this minimaxing at least two different ways. The simplest is to simply follow each book line to the end, perhaps do a search there, and then minimax the scores back to the root. Only catch is that you have to change things for black, since when you play black you now want the lowest score, not the highest, if you use the same book for both sides. The other is to search down the book lines, but do a real search at each node as well to see if there is a better move than the book alternative(s).

We finally ended up with a manual system in Cray Blitz. We did the search, but only printed things out when we "thought" we had a better move than the book move, or when the search said that a book move flagged as good really appeared to be bad. That was a pain...
Yes, there is no perfect solution.

I'm not so sure the gambit stuff is a problem - it was a few years ago, but programs have come a long way. Just for fun I tested the Smith-Morra Gambit with Komodo immediately after the pawn is accepted by black. I have to say that I agree completely with the assessment:

after 1.e4 c5 2. d4 cxd4 3. c3 dxc3

At 5 ply and beyond the score is within about 1/5 of a pawn of even. At earlier depths Komodo even thinks white has an advantage, but at 13 ply and beyond black is given a small advantage (which in my opinion is correct). Most gambits are at least slightly unsound against a strong player with a cool head.

At 19 ply Komodo thinks black has an 18/100 pawn advantage.

Houdini at similar depths has the score very close to zero. We could argue about which assessment is more accurate but the point is that neither program is very concerned about being down a pawn.

With mini-max, we are not really going to evaluate THAT exact position either; we are going to see a tree of possibilities and then analyze THEM, which is likely to get us much closer to the truth about whether or not to offer the pawn.

I think your real point is that programs do not always evaluate positions evenly; they all over-value some positions and under-value others. I agree with that.

However I believe the real point of minimax is to get positions that YOUR program likes. Their objective goodness is usually a matter of debate anyway, but my working theory right now is that if you can come out of book with positions that your program likes, it's hard to do better than that, because it means the pieces are more in harmony with your program's way of thinking.

Don't you just hate it when you come out of book and the program moves the pieces back to squares it prefers?
When the gambit positions are analyzed at 35 ply or so, the engines properly understand them. So they cease to be a problem.

15 years ago, Evans gambit positions would withstand 48 hours of analysis and the engines would still cough up stupidity. Today, the engines choose the right moves after an hour.
That's the problem. Except for correspondence chess, anyway. :)

And let a good GM play the Evans against a computer at blitz, and you see why the book is important... And not just the Evans either.
This whole thread is about improving the book, not playing without it, and there is no need to convince anyone here of the importance of the book; that is already understood.
I don't follow your comment. I said nothing about playing without a book. I am talking about trying to let a _computer_ decide which book moves are good or bad. There are positions where a computer will simply blunder into a deep but well-known trap. Those are the critical things, and books cover them. But letting the computer do searches from those positions is more than a little problematic.


My feeling on book learning via mini-max is that it's a good thing. But it is important that it be done correctly; otherwise there are serious problems. And it has to involve a hybrid approach - the computer doesn't just build its own book; it has to be supplemented with moves the computer would never consider playing on its own, such as 4. b4 in the Evans Gambit. In order to understand any position and have a reasonable score attached to it, there cannot be holes in the book. So whenever the computer is surprised with an unexpected move, it must be inserted into the book. Dann Corbit's suggestion of keeping PV nodes is, I believe, a good one, and it will do the job very nicely.
Looks _very_ problematic to me. You know what you played. How does that help you play that opening well? If you won, did you win by playing a really good move, or by playing a really bad opponent? Ditto for losing. One good example was the deep analysis Berliner did years ago on his (I think) Fritz attack correspondence game(s). Very deep lines with only one right way and lots of wrong ways for black to reply. I think computers will die there without a good book. And if they get to sort of "vote" on what to play, they fall right back into the same problem...





You will not be able to beat the program over and over again with the same gambit using this approach and I think it's the best you can do. Of course you can apply rote learning and just tell the computer what move to play, but I would rather the computer UNDERSTANDS what it is doing as opposed to just memorizing lines of play.
Just by using the Mchess book learning idea? The problem is that your statement was not quite "complete." You will not be able to beat the program over and over again, until it tries every legal move and stumbles into the correct one. And if there is a similar problem at the next move, you just made this an N*N game problem where you lose (N-1)^2 games first. Too painful, and too slow. And in some of the "fun" openings, that might become (N-1)^x where x is _large_.




To be sure, this is not perfect. We are leveraging the fact that modern programs play like super-grandmasters, and even if they do not understand a position, they will usually play correctly. Komodo doesn't know the Evans Gambit, but if I walk it through, it almost always plays a popular line. That does not mean it understands the opening, but you must be capable of very strong play in order to expand opening theory. A few years ago this was not yet feasible for computers, but I think the time has come.
Another good test is the Najdorf as white or black. Some of those attacks are 20-25 moves deep to where you begin to see that your move way back was flawed. There are still _plenty_ of openings that computers don't do very well at.

The biggest problem was laid out by the late Tony Scherzer many years ago in a paper he wrote on learning. It works great in many positions but poorly in others. The problem that can happen is that the computer does not understand position P and FIRST has to learn 10 ways to lose before it finally gets the point. During each pass through the learning process something is learned, but it just causes the program to play a different losing move! Eventually, however, enough backtracking happens that the position is understood and the right move is played.
It is worse than that. Moving a single pawn, which does not change the dynamics on the board at all, still invalidates everything that has been learned. That is a problem.

But keep in mind that these experiments were done when computers were 1000 Elo weaker than today and memory was scarce. We now have the resources to do this so much better, and the computer will converge on the right understanding much more quickly these days. Scherzer showed that the approach worked and that the program would eventually converge on the correct move, but it was a question of how long it would take.
A rat eventually finds his way out of a maze, but after a _bunch_ of wrong turns. Each of which represents a lost game. Humans don't learn like that, thankfully...
Dann Corbit
Posts: 12541
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Book learning

Post by Dann Corbit »

bob wrote:
Don wrote:
bob wrote:
Dann Corbit wrote:
Don wrote:
bob wrote:
Don wrote:
Dann Corbit wrote:
hgm wrote:Well, the example is very extreme, of course. But it could very well happen that a variation that was believed to be good for the opponent was played a lot, and that the common defense against it was move A, which thus got many plays, but rather poor score. Then a novelty B was found in this position, which basically refuted the previous move. It was played a few times, much less than A, but with a very good score, which convinced people that it indeed was a refutation. So they stopped playing the previous move, and B never got the chance to get anywhere near the number of plays that A has. This makes scenarios where A has 40% out of 1000 plays while B has 60% out of 100 plays not unrealistic. Polyglot would continue to prefer the poor move by 400:60. And even worse, it would not discourage the preceding move from being played, as all the 600 points scored against it because it was met by the inferior reply A continue to count, while they are in fact no longer relevant, as good players will meet it with move B for sure.

Now I know of course that serious book builders would select the games from which they build the book by date, pay attention to the ratings, etc., exactly to cleanse the input data from such misleading cues. But this is a case that could very well be handled automatically, by proper application of minimax. If the 40% vs 60% scores for move A and B would lead to favoring B over A in play by 90% vs 10%, the average score of this node should be set to 58% (90% x 60% + 10% x 40%), so 42% for the opponent's preceding move, based on 1/(0.81/100 + 0.01/1000) = 122 'virtual' games, rather than at 640 out of 1100 games (58%).
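hgm's back-up rule can be sketched in a few lines. A minimal example: the 90%/10% play probabilities come from his example above, and the effective-sample formula is the one he quotes (it evaluates to roughly 123 'virtual' games).

```python
# Sketch of the node back-up hgm describes: the node's score is the
# play-probability-weighted average of the move scores, and the 'virtual'
# game count is the effective sample size of that weighted average.
def propagate(stats, play_prob):
    """stats: list of (games, score) per move; play_prob: how often each
    move would actually be chosen (assumed given, e.g. 90%/10%)."""
    score = sum(p * s for (_, s), p in zip(stats, play_prob))
    # Variance of a weighted average of proportions scales as
    # sum(p_i^2 / n_i); inverting it gives the equivalent number of games.
    n_virtual = 1.0 / sum(p * p / n for (n, _), p in zip(stats, play_prob))
    return score, n_virtual

# Move B: 60% from 100 games, weight 0.9; move A: 40% from 1000, weight 0.1.
score, n = propagate([(100, 0.60), (1000, 0.40)], [0.9, 0.1])
# score = 0.58 (so 42% for the opponent's preceding move), n ~ 123 games.
```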
Why not mini-max the book with alpha-beta upon update? Doesn't that remove the problems of refutations being discovered converging slowly?
I think mini-max of the book is a very good way to customize a book for a given program. In general you want your program to come out of book with a position that it thinks is good. A significant percentage of Komodo's losses in tournament games happen when we come out of book with what Komodo considers a losing (or close to losing) position.

However I believe that for this to work the program being tuned must search every book position to ensure there is a valid choice (at least from the program's point of view). For example:

In position P the book gives Ne5 and d4, but both moves are inferior to c3, which is not in the book. In this case, any mini-max that involves position P will be in error. It might be a good line if c3 were considered, and thus the computer will wrongly avoid it when playing white, or as black think it's a great position to force white into.

So my idea for mini-maxing the book is to evaluate every position and insert the computer's choice, if different, into the book also. In the cases where the computer has its own TN you might also consider verifying the new choice with additional thinking time to avoid gratuitously adding moves to the book (that might also get recursively expanded upon).

I have never seen a book that didn't have some kind of omission such as this. Cilkchess lost a game to "King" one year because the book did not consider an obvious choice. In general I have not paid a lot of attention to the book in our programs and it has cost us dearly. There are a few programs that do not test very well or rank very high on the rating lists but get much better results in tournaments because of exceptionally good book preparation. Once you get a terrible position it's difficult to recover even against a program 200 or 300 Elo weaker.
There are ways to handle some of that, but I don't think minimaxing is the solution. We did that in Cray Blitz. And in early Crafty versions. The problem is you will _always_ accept gambits since that move wins material and the rest do not.

You can do this minimaxing at least two different ways. The simplest is to simply follow each book line to the end, perhaps do a search there, and then minimax the scores back to the root. The only catch is that you have to change things for black, since when you play black you now want the lowest score, not the highest, if you use the same book for both sides. The other is to search down the book lines, but do a real search at each node as well to see if there is a better move than the book alternative(s).
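The first approach described above — follow each line to its end, search at the leaf, and minimax the scores back to the root — can be sketched with negamax, which also takes care of the white/black sign flip. The book layout (nested dicts) and the leaf_eval search hook are illustrative assumptions, not any engine's real structures:

```python
# Minimax a book tree by searching at the leaves and backing scores up.
def book_minimax(node, moves, leaf_eval):
    """node: dict mapping move -> child node ({} ends the book line).
    moves: moves played so far.
    leaf_eval(moves): assumed engine hook returning a search score from
    the side to move's point of view at the end of the line."""
    if not node:                          # out of book: run a real search
        return leaf_eval(moves)
    # Negamax back-up: what is good for the child is bad for us.
    return max(-book_minimax(child, moves + [m], leaf_eval)
               for m, child in node.items())

# Tiny demo book: 1.e4 (met by ...e5 or ...c5) or 1.d4, with stub scores.
demo_book = {'e4': {'e5': {}, 'c5': {}}, 'd4': {}}
stub_scores = {('e4', 'e5'): 0.3, ('e4', 'c5'): 0.1, ('d4',): -0.2}
root_score = book_minimax(demo_book, [], lambda ms: stub_scores[tuple(ms)])
# Black meets 1.e4 with ...c5 (0.1 for white); 1.d4 backs up to 0.2,
# so the root minimax value is 0.2 and the book prefers 1.d4.
```

The second approach bob mentions would additionally call the search at every interior node and compare against the book alternatives, not only at the leaves.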

We finally ended up with a manual system in Cray Blitz. We did the search, but only printed things out when we "thought" we had a better move than the book move, or when the search said that a book move flagged as good really appeared to be bad. That was a pain...
Yes, there is no perfect solution.

I'm not so sure the gambit stuff is a problem - it was a few years ago but programs have come a long way. Just for fun I tested the Smith-Morra Gambit with Komodo immediately after the pawn is accepted by black. I have to say that I agree completely with the assessment:

after 1.e4 c5 2. d4 cxd4 3. c3 dxc3

At 5 ply and beyond the score is within about 1/5 pawn of even. On earlier depths Komodo even thinks white has an advantage, but at 13 ply and beyond black is given a small advantage (which in my opinion is correct.) Most gambits are at least slightly unsound against a strong player with a cool head.

At 19 ply Komodo thinks black has an 18/100 pawn advantage.

Houdini at similar depths has the score very close to zero. We could argue about which assessment is more accurate but the point is that neither program is very concerned about being down a pawn.

With mini-max, we are not really going to evaluate THAT exact position either, we are going to see a tree of possibilities and then analyze THEM, which is likely to get us much closer to the truth about whether to offer the pawn or not.

I think your real point is that programs do not always evaluate positions evenly; they all over-value some positions and under-value others. I agree with that.

However I believe the real point of minimax is to get positions that YOUR program likes. Its objective goodness is usually a matter of debate anyway, but my working theory right now is that if you can come out of book with positions that your program likes, it's hard to do better than that because it means the pieces are more in harmony with your program's way of thinking.

Don't you just hate it when you come out of book and the program moves the pieces back to squares it prefers?
When the gambit positions are analyzed at 35 ply or so, the engines properly understand them. So they cease to be a problem.

15 years ago, Evans gambit positions would withstand 48 hours of analysis and the engines would still cough up stupidity. Today, the engines choose the right moves after an hour.
That's the problem. Except for correspondence chess, anyway. :)

And let a good GM play the Evans against a computer at blitz, and you see why the book is important... And not just the Evans either.
This whole thread is about improving the book, not playing without it and there is no need to convince anyone here about the importance of the book, that is already understood.
I don't follow your comment. I said nothing about playing without a book. I am talking about trying to let a _computer_ decide which book moves are good or bad. There are positions where a computer will simply blunder into a deep but well-known trap. Those are the critical things, and books cover them. But letting the computer do searches from those positions is more than a little problematic.


My feeling on book learning via mini-max is that it's a good thing. But it is important that it be done correctly, otherwise there are serious problems. And it has to involve a hybrid approach - the computer doesn't just build its own book but it has to be supplemented with moves the computer would never consider playing on its own, such as 4. b4 in the Evans Gambit. In order to understand any position and have a reasonable score attached to it, there cannot be holes in the book. So whenever the computer is surprised with an unexpected move it must be inserted in the book. Dann Corbit's suggestion of keeping PV nodes I believe is a good one and it will do the job very nicely.
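The "no holes" idea above — insert any surprise move together with the engine's principal variation — can be sketched as a small book-update routine. The dict-of-sets book layout and the make_key hook are illustrative assumptions:

```python
# Insert a principal variation into a book so later minimax passes have
# an entry at every position along it ('keeping PV nodes').
def insert_pv(book, position_key, pv_moves, make_key):
    """book: dict mapping position key -> set of candidate moves.
    pv_moves: the engine's PV starting from position_key.
    make_key(key, move): assumed hook deriving the successor's key
    (in a real engine this would be a Zobrist hash update)."""
    key = position_key
    for move in pv_moves:
        book.setdefault(key, set()).add(move)   # fill the hole, if any
        key = make_key(key, move)
    return book

# Demo: a surprise line is grafted onto an empty book.
book = insert_pv({}, 'start', ['e4', 'e5', 'Nf3'],
                 lambda k, m: k + '/' + m)
```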
Looks _very_ problematic to me. You know what you played. How does that help you play that opening well? If you won, did you win by playing a really good move, or by playing a really bad opponent? Ditto for losing. One good example was the deep analysis Berliner did years ago on his (I think) Fritz attack correspondence game(s). Very deep lines with only one right way and lots of wrong ways for black to reply. I think computers will die there without a good book. And if they get to sort of "vote" on what to play, they fall right back into the same problem...





You will not be able to beat the program over and over again with the same gambit using this approach and I think it's the best you can do. Of course you can apply rote learning and just tell the computer what move to play, but I would rather the computer UNDERSTANDS what it is doing as opposed to just memorizing lines of play.
Just by using the MChess book learning idea? The problem is that your statement was not quite "complete." You will not be able to beat the program over and over again, until it tries every legal move and stumbles into the correct one. And if there is a similar problem at the next move, you just made this an N*N game problem where you lose (N-1)^2 games first. Too painful, and too slow. And in some of the "fun" openings, that might become (N-1)^x where x is _large_.




To be sure this is not perfect. We are leveraging the fact that modern programs play like super-grandmasters, and even if they do not understand a position, they will usually play correctly. Komodo doesn't know the Evans Gambit, but if I walk it through, it almost always plays a popular line. That does not mean it understands the opening, but you must be capable of very strong play in order to expand opening theory. A few years ago this was not yet feasible for computers but I think the time has come.
Another good test is the Najdorf as white or black. Some of those attacks are 20-25 moves deep to where you begin to see that your move way back was flawed. There are still _plenty_ of openings that computers don't do very well at.

The biggest problem was laid out by the late Tony Scherzer many years ago in a paper he wrote on learning. It works great in many positions but poorly in others. The problem that can happen is that the computer does not understand position P and FIRST has to learn 10 ways to lose before it finally gets the point. During each pass through the learning process something is learned, but it just causes the program to play a different losing move! Eventually, however, enough backtracking happens that the position is understood and the right move is played.
It is worse than that. Moving a single pawn, which does not change the dynamics on the board at all, still invalidates everything that has been learned. That is a problem.

But keep in mind that these experiments were done when computers were 1000 Elo weaker than today and memory was scarce. We now have the resources to do this so much better. And the computer will converge on the right understanding much more quickly these days. Scherzer showed that the approach worked and that the program would eventually converge on the correct move, but it was a question of how long it would take.
A rat eventually finds his way out of a maze, but after a _bunch_ of wrong turns. Each of which represents a lost game. Humans don't learn like that, thankfully...
I think that computer analysis is only one part of the chess analysis equation.

Here are the results for depth of analysis in my database:
{query is designed to filter out endgame and checkmates, and from the main table -- I have at least 5 times as many in the alternative analysis table}

Code: Select all


set rowcount 0;
select acd, count(*) as number from Epd 
where len(rtrim(Epd)) > 45 and NOT ( acd is null or acd = 0) and abs(ce) < 32000
group by acd

acd	number
------------ ----------
4	15
5	7
6	9
7	13
8	6
9	7
10	5
11	48
12	221
13	303
14	670
15	7263
16	95526
17	206884
18	233250
19	129489
20	47732
21	20028
22	10690
23	9193
24	8317
25	6963
26	6317
27	6286
28	5079
29	3421
30	1981
31	1164
32	710
33	470
34	289
35	174
36	106
37	82
38	42
39	15
40	13
41	8
42	11
43	5
44	2
45	10
46	2
48	4
49	1
50	1
51	2
52	1
53	1
57	1
58	1
60	7
63	2
75	1
76	1
82	1
83	1
100	1
Now, obviously the vast majority of these positions are not analyzed nearly deeply enough to discover gambits.
So we also carry wins/losses/draws for each position in actual high quality games that were played.
And we also carry what moves were statistically played.
If, for instance, our calculated best move does not match the most frequently played move, then that is a red flag.

Most of the time, the high quality user choice move matches:

Code: Select all

select count(*) From Epd
where len(rtrim(Epd)) > 45 and NOT ( acd is null or acd = 0) and abs(ce) < 32000
AND (substring(pm,1,len(rtrim(bm))) = bm  OR c1 = 'Permutation')

Count
---------
644947
There are 157905 rows for which the above condition does not hold true, so about:
157905/802852 = .197
or 20% of the rows in the database have a position where the calculated answer does not match the most frequently played answer.

So now we have choices to make. We can examine the won/loss/draw statistic for both forward positions (the current position after the frequently made move and the current position after the analyzed move.) If there are at least 30 outcomes, we have a statistically significant sample and we can make an educated guess as to which is better based on that.
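The decision rule just described can be sketched as follows. The WDL comparison and the 30-outcome threshold come from the text above; the data layout and fallback behavior are illustrative assumptions:

```python
# Choose between the most frequently played move and the engine's
# calculated move by comparing the outcome statistics of the two
# resulting positions, but only once both samples are large enough.
def pick_move(freq_wdl, calc_wdl, min_games=30):
    """Each *_wdl is (wins, draws, losses) from the mover's point of view.
    Returns 'frequent', 'calculated', or 'insufficient data'."""
    def score(wdl):
        w, d, l = wdl
        n = w + d + l
        return (w + 0.5 * d) / n, n       # standard chess scoring
    fs, fn = score(freq_wdl)
    cs, cn = score(calc_wdl)
    if fn < min_games or cn < min_games:
        return 'insufficient data'        # fall back to deeper analysis
    return 'frequent' if fs >= cs else 'calculated'
```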

As we play more and more games, we update the statistics in our database. We can also spot trouble positions and either tag them as "don't play" or analyze them at very long time control to try to rectify them.

Of course there are many other alternatives.

The real key to using a statistical approach is to choose the right model. Is (for instance) extremely deep analysis better than a huge number of outcomes? If we have depth of analysis of at least 35 should we always choose this move? If we have at least 1000 games played, should we always choose the statistically most winning move? The real fun of it will be to perform a fitted curve with all of the available parameters in order to make the best possible choice.
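One way to realize the "fitted curve" idea above is a confidence-weighted blend of the two evidence sources, trusting each more as its own evidence (search depth, game count) grows. The logistic mapping and the saturation constants here are illustrative assumptions, not a fitted model:

```python
# Blend an engine evaluation with game statistics, weighting each source
# by how much evidence backs it.
def blended_score(engine_eval, depth, stat_score, games,
                  depth_scale=35.0, games_scale=1000.0):
    """engine_eval: score in pawns; depth: search depth in plies;
    stat_score: 0..1 win rate from games; games: sample size.
    Returns a 0..1 expected score for the side to move."""
    # Map the pawn evaluation onto a 0..1 expected score (logistic).
    eval_score = 1.0 / (1.0 + 10.0 ** (-engine_eval / 4.0))
    # Confidence in each source saturates as its evidence grows.
    w_eval = depth / (depth + depth_scale)
    w_stat = games / (games + games_scale)
    total = w_eval + w_stat
    if total == 0:
        return 0.5                        # no evidence either way
    return (w_eval * eval_score + w_stat * stat_score) / total
```

Fitting the scale constants (and the choice of weighting curves) against actual game outcomes is exactly the curve-fitting exercise described above.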

I think (however) that the more data we have available about a position, and the more sophisticated our extrapolation algorithm is, the better our decisions can become.
User avatar
marcelk
Posts: 348
Joined: Sat Feb 27, 2010 12:21 am

Re: Book learning

Post by marcelk »

bob wrote:A rat eventually finds his way out of a maze, but after a _bunch_ of wrong turns. Each of which represents a lost game. Humans don't learn like that, thankfully...
One can do the learning offline. A game lost offline is no harm done. And if there is an escape, the human may not find it but the rat eventually will. I put my money on the rat nowadays.
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Book learning

Post by Don »

bob wrote:A rat eventually finds his way out of a maze, but after a _bunch_ of wrong turns. Each of which represents a lost game. Humans don't learn like that, thankfully...
I agree with just about everything you are saying. However what I am looking for is a way to always come out of book with a position the program likes. I'm not stating this as a fact but I consider it a strong hypothesis - given two objectively equal positions, a program should play the one it prefers.


I'm also not proposing a fully automated approach because I appreciate that it will still require a lot of attention by a book expert. For instance I realize that the program could conceivably come out of book with a dead lost position that the program is happy with. There will be cases where an automated approach cannot correct this with a reasonable amount of effort so it will require manual human intervention. I'm not denying that. But I see no reason that a hybrid approach cannot be best. If the computer can decide what to play and build the bulk of the book itself and I can override its choices whenever I want to, then the lower bound on its value is the same as a book created solely by humans.