Creating a debugging tool for chess engine.

Colin · Post by **Colin** » Fri Sep 21, 2007 5:02 am

Hi, I've run across a nasty bug in my chess engine.
I entered into it a mate in 4 puzzle, but it can't do it, yet it has no problem with the other 100 puzzles of various mate depths in.

I can't seem to get to the bottom of it, I keep getting bogged down with printouts whilst trying to track the value 1,000,001 / -1,000,001 (checkmate) that never reached the surface(root) of the tree.

Is there a good way to keep track of whats going on, without printing out millions/billions of moves.

Thanks

bob · Post by **bob** » Fri Sep 21, 2007 5:08 am

Colin wrote:Hi, I've run across a nasty bug in my chess engine.
I entered into it a mate in 4 puzzle, but it can't do it, yet it has no problem with the other 100 puzzles of various mate depths in.

I can't seem to get to the bottom of it, I keep getting bogged down with printouts whilst trying to track the value 1,000,001 / -1,000,001 (checkmate) that never reached the surface(root) of the tree.

Is there a good way to keep track of whats going on, without printing out millions/billions of moves.

Thanks

What I often resort to is a "selective dump". Only dump the ply-1 move that you are interested in. At ply=2, only print moves if they match the key move or two that leads to a mate you are missing. If you use the node counter to decide when/what to print, you can reduce millions of lines to hundreds or thousands.

Some are _very_ difficult to track down. That's the problem with searching millions of moves per second. It seems that most of these kinds of problems take lots of seconds.

Another good trick is this. Suppose you are looking for a mate in 6. Rather than dumping all of that, play the first 2 moves for each side and now you just have a mate in 4 and that tree is very small. If you find the mate there, then you get to back up 2 moves and find the mate in 5. Etc...

There are lots of ways to make this idea work by reducing the size of the output...

hgm · Post by **hgm** » Fri Sep 21, 2007 11:04 am

The way I debug reproducible search errors is like this:

I have a global array path[MAXPLY], and my search routine keeps track of the current level in the tree, starting from the root (not to be confused with the remaining depth, which is affected by extensions and reductions and iterative deepening).

In MakeMove() I assign the move to path[level]. I can then at any time know where I am in the tree by looking at path[]. So it becomes easy to select single nodes for printing, by if statements like

if(level==3 && path[0]==MOVE1 && path[1]==MOVE2 && path[2]==MOVE3)

As I have print statements in various places, I usually abbreviate this as

if(PATH)

where I then #define PATH as whatever node I am interested in at the time. An alternative would be to do the compare in a function TestPath().

What I usually print, for every move searched from a node, is the score of the move, the best score so far, alpha and beta for every move, just after the recursive call to Search() returns. (Don't forget to print the iteration number, if you do IID.)

With that aid I simply start walking the tree, starting at the root. If I think the move it plays is a gross error, I do the printout in the root node, and I look to the score of the move played, and the score (usually an upper bound) of the move I think should have been played in stead. One of the two will have a score which I consider wrong. (E.g. it evaluates the move it played way to high, overlooking he is checkmated-in-2 in a 10-ply search.)

I then set the path to the node where the move with the suspect score leads to, and repeat the process. At any time the amount of data that is printed is very low (just one node), and usually you zoom in onto the error pretty quick.

Hash tables make the process a little more awkward, as some branches end in a hash hit. So if the node of your PATH gives a hash hit + pruning, it is definitely something you should print, as it is the only output you will get from that node. Apart from printing out the contents of the hash entry, (including signature), remaining search depth, alpha and beta, you also print the number of the hash entry. Because if you consider the score to be wrong, you want to examine the search that filled that hash entry.

You then have another print statement in your code, that does not print based on the PATH, but on hash-entry number, whenever there is a store to that entry. Print what is stored, but, more importantly, print the current path. This printout might occur multiple times, with different signatures. The last time it occurs before you hit the node where the hash pruning occurred will be the one responsible for the data retrieved for this hash pruning. You will then set the PATH to the node where this last hash fill occurred, and continue the tree walk there. You might want to print out the full position when you enter the node of the PATH as well, to make sure the hash was really filled by the same position as for which you probed it, to catch key collissions (or hash bugs).

It sounds cumbersome, but once you are used to it, it is really very fast. It beats looking through millions of moves from a tree dump, and can be applied to arbitrarily large trees.

When I am done debugging, I #define PATH and ENTRY as 0, which will cause the code optimizer to remove all print statements. So I can leave the print statements in the source code all the time.

Once you find a node where things obviously don't work as intended, you can follow the normal debugging method of printing out the value of critical variables in different places, to see where the get a wrong value, all in PATH-controlled print statements.

Alessandro Scotti · Post by **Alessandro Scotti** » Fri Sep 21, 2007 2:37 pm

Hi Colin,
assuming you have tried to disable null move and other kinds of pruning without solving the problem, you may want to use a tree browser. The tool "Chant" looks nice, but I didn't like the graphical interface and I need it to be multiplatform, so eventually I decided to write my own. If you want to take this road, I have posted my source files as a kickstart:

http://www.ascotti.org/programming/chess/zip/trace.zip

It's C++ code and you'll have to supply your own code for handling moves and positions. It works by dumping the search tree (or part of it) on disk, from where you can then browse it later.

Colin · Post by **Colin** » Sat Sep 22, 2007 2:22 am

Thanks for all the advice, it has been very beneficial.
My code is in java, so I don't think the trace.zip file will work.

Anyway, I've built some code similar to what H.G suggests..
And I've stored the required solution for it check..
I'm not using TTs yet or QS or null move.
I use IID, so this is depth 8 of the IID.

From the position, 1=pawn,2=knight,3=king,5=bish,6=rook,7=queen.

0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 5
0 0 0 0 0 0 0 0
0 0 0-1-1 0 0 0
0 3 0-3 0 0 2 0
0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0

At the beginning of each node, I run a test to see if the moves
played so far in the search upto the current node match the
required moves, so for the very first node it is blank.

The required solution to give checkmate is
Bc1,e3,f3,e2,Nh5,e1=Q,Nf4,

@@@@@@@ W D=8 node_count=1

@@@@@@@ B D=7 node_count=53174
Bc1

@@@@@@@ W D=6 node_count=53175
Bc1 e3

@@@@@@@ B D=5 node_count=53240
Bc1 e3 f3

@@@@@@@ W D=4 node_count=53241
Bc1 e3 f3 e2

Depth=8 BM=f3 score=642 leaf-count=32276 PV=f3 --> exf3 --> Bg5 --> f2 --> Bf4 --> f1=Q --> Nxf1 --> Ke2

So you can see its getting stuck after e2, and isn't ever playing Nh5 next.
Also the final score should be 1,000,001 indicating a checkmate.

I haven't solved it yet, but I'm a lot closer now, and using node_count means I can print out info inside a particular node.

Harald Johnsen · Post by **Harald Johnsen** » Sat Sep 22, 2007 11:32 am

I don't check the path but the position hash key

Code: Select all

        val = -Search&#40;pcon, pste + 1, - b, - a, succNType, nextExtCount + extension&#41;;

#ifdef _DEBUG
    if&#40; pste->hashkPc == 0x18c337f1342c ) &#123;
            char sz_score&#91;50&#93;;
            sprintf&#40;sz_score, "problem after here (%d&#41; ", val&#41;;
            VDumpLine&#40;pcon, pste+1, sz_score&#41;;
    &#125;
#endif

But you can not do that since you don't have TT yet.

Your dump show moves that match the mate solution, what you should do is to dump all the moves after Bc1 e3 f3 e2 and their score. You should finaly see the Nh5 move and the probably wrong evaluation after the move.
Or you could start at the end and dump all moves in the position after "Bc1,e3,f3,e2,Nh5,e1=Q" and look at the value of Nf4.

HJ.

hgm · Post by **hgm** » Sat Sep 22, 2007 12:01 pm

Colin wrote: [d]8/8/7B/8/3pp3/1K1k2N1/5P2/8 w

hgm · Post by **hgm** » Sat Sep 22, 2007 12:24 pm

Errors like this are almost always a consequence of faulty alpha-beta pruning.

So in the node that should play Nh5, you should print alpha and beta, and all the moves it does search, and their score. You will most likely see that it only searches part of the moves, and that before it gets to Nh5, it scores another move bove beta, causing a cutoff.

This means it thinks the branch is irrelevant, because black can avoid it by playing another move at an earlier level. This is of course wrong, as black has only one move in every position. So it must mean beta is set improperly, somewhere.

Colin · Post by **Colin** » Sat Sep 22, 2007 9:59 pm

Yes exactly...
In the white node where Nh5 should be played...
I looked at the generated moves, and it wasn't even generated.
The error is some where in W_MAKE_MOVE(), or W_UNDO_MOVE(), or B_MAKE_MOVE(), or B_UNDO_MOVE().

I have quite a tricky system for storing the pieces.
Essentially I have an array W[25] containing all the indexes of each white piece(W[0] is empty), and I have an array WPOS[64] giving the position in W, of each piece.
So if there is a white bishop on square 58(c1), and this bishop is in position 13 in W, then WPOS[58] is 13,
so if in W_MAKE_MOVE() F is 58, and T is 40, I just do..
W[WPOS[T]=WPOS[F]]=T and WPOS[F]=0

This might seem pointless since I have the bit values of each of the types of pieces, WP,WN,WB,WR,WQ.
But I figured it would be quicker to store the indexes as well as the bit values, so when I generate moves, I can just loop through indexes without having to extract the individual bits (and converting them to 0-63) of WN,WB.etc.
And when I call EVAL, I can also loop through the pieces indexes to get the material part of the evaluation more quickly.

I don't if this is a good idea, but extracting bits seems to be a little costly.
Is there a good solution to this problem?

Thanks

hgm · Post by **hgm** » Sat Sep 22, 2007 11:11 pm

Well, this means that you are using a hybrid representation (bitboard + mailbox], as your array WPOS is simply an 8x8 mailbox board, and your array W is a piece list.

There are more hybrid engines, so there is nothing against this. But of course it has to be implemented correctly, as bugs are never a good idea.

Write a routine to check the consistency between your mailbox-board, piece list and and bitboards, which screams if there is something wrong, and call it after every Make() and UnMake(). Then you find out quickly enough what goes wrong and where.

And try a perft to make sure that your move-gen, make and unmake operate correctly.

Creating a debugging tool for chess engine.

Creating a debugging tool for chess engine.

Re: Creating a debugging tool for chess engine.

Re: Creating a debugging tool for chess engine.

Re: Creating a debugging tool for chess engine.

Re: Creating a debugging tool for chess engine.

Re: Creating a debugging tool for chess engine.

Re: Creating a debugging tool for chess engine.

Re: Creating a debugging tool for chess engine.

Re: Creating a debugging tool for chess engine.

Re: Creating a debugging tool for chess engine.