Move encoding

rreagan · Post by **rreagan** » Mon Apr 21, 2014 7:54 pm

Which approach do you prefer for move encoding? I assume that what's important for a mature engine will not be obvious early on when you're still at the movegen/make/undo stage.

It seems a relatively minimal packed-int is most common, leaving most of the details to be computed on the fly (is this a capture/castle/pawn-push/etc?).

Much of that info is available at move generation time, so to the novice it seems obvious that you should capture that to avoid recomputing it later, but is it better to keep a compact representation?

mar · Post by **mar** » Mon Apr 21, 2014 8:05 pm

I guess it doesn't matter, just stay away from C bitfields which are very slow

ZirconiumX · Post by **ZirconiumX** » Mon Apr 21, 2014 9:16 pm

rreagan wrote:Which approach do you prefer for move encoding? I assume that what's important for a mature engine will not be obvious early on when you're still at the movegen/make/undo stage.

It seems a relatively minimal packed-int is most common, leaving most of the details to be computed on the fly (is this a capture/castle/pawn-push/etc?).

Much of that info is available at move generation time, so to the novice it seems obvious that you should capture that to avoid recomputing it later, but is it better to keep a compact representation?

For speed purposes it is a good practice to defer things as late as possible.

At best there is no difference between computing it after generation or during generation, at worst it is a significant slowdown, as you compute information that is useless in a CUT node, for instance.

Matthew:out

hgm · Post by **hgm** » Mon Apr 21, 2014 9:26 pm

It doesn't take any more time to read the moved or captured piece from the move, than to read it from the board when you have the from and to-square number in a register anyway.

I usually prefer an encoding where I use off-board to-squares to flag special moves (but I use 0x88-type boards; if your square numbers are contiguous, this turns into just having a single extra bit to flag all special moves (castling, e.p., promotion, double-pushes), and when the bit is set use the to-square to encode what really should be done (e.g. by using it as an index in a table that supplies the real to-square (usually encoded as move step) plus extra information not needed in normal moves, such as promotion piece, from- and to-square of the Rook, the range of squares to test for occupancy or being under attack, the squares to test for Pawn presents for setting the e.p. flag, the e.p. victim square, etc.).

Bloodbane · Post by **Bloodbane** » Mon Apr 21, 2014 10:01 pm

Most moves generated are never made. Therefore it is best to only encode the minimum amount of information, i.e. from, to and promotion(which can be used to flag en passant and castling as well) since this saves time. Those three items fit nicely into 16 bits. Also, the less space you use to encode a move the less space it takes on a TT entry.

lucasart · Post by **lucasart** » Tue Apr 22, 2014 1:13 am

rreagan wrote:Which approach do you prefer for move encoding? I assume that what's important for a mature engine will not be obvious early on when you're still at the movegen/make/undo stage.

It seems a relatively minimal packed-int is most common, leaving most of the details to be computed on the fly (is this a capture/castle/pawn-push/etc?).

Much of that info is available at move generation time, so to the novice it seems obvious that you should capture that to avoid recomputing it later, but is it better to keep a compact representation?

These days I prefer:
* from square: 6-bits
* to square: 6-bits
* promotion: 4 bits (Knight=0...Queen=3)

You don't need flags for en passant, castling, or promotion:
* move is promotion is from square has a pawn on it, to square is on the (relative) eighth rank
* move is en passant if to square is en passant square, and from square has a pawn on it
* move is castling if king captures his own rook (handles chess and chess960 in a unified way).

Many engines use flags (including DiscoCheck), but in the end, I'm not sure there's any speed gain from doing it. The problem with flags is that the data model becomes redundant. This creates possibilities for subtle bugs where the flag is not consistent with the (fsq, tsq, prom) data members. In a model that has no redundancy, this kind of bug simply cannot exist.

As Martin said, you need to do the bit operations yourself. Do not trust the compiler in turning your bit-fields into efficient code. Compilers suck really really hard at handling bit-fields. I initially used compiler managed bit-fields, but realized that the slow down is just horrendous. That was with GCC 4.6 I think, maybe your mileage will differ, depending on which compiler you use.

In fact, what I would even recommend is to do encoding and decoding operations at once, rather than all over the code, to get the best performance. Example:

Code: Select all

struct FatMove &#123;
  int fsq, tsq;
  int prom;
&#125;;

void foo&#40;move_t m&#41;
&#123;
  struct FatMove fm;
  decode_move&#40;m, &fm&#41;;
  // now use only fm.fsq, fm.tsq, and fm.prom
&#125;

move_t bar&#40;...)
&#123;
  struct FatMove fm;
  // do stuff and set fm.fsq, fm.tsq, fm.prom
  return encode_move&#40;&fm&#41;;
&#125;

rreagan · Post by **rreagan** » Tue Apr 22, 2014 1:52 am

lucasart wrote: * move is castling if king captures his own rook (handles chess and chess960 in a unified way).

I don't follow. Can you explain? How does a king moving from e1 to g1 capture a white rook on h1 or f1?

The problem with flags is that the data model becomes redundant. This creates possibilities for subtle bugs where the flag is not consistent with the (fsq, tsq, prom) data members. In a model that has no redundancy, this kind of bug simply cannot exist.

That's a great point. Single point of truth.

kbhearn · Post by **kbhearn** » Tue Apr 22, 2014 3:05 am

rreagan wrote:
lucasart wrote: * move is castling if king captures his own rook (handles chess and chess960 in a unified way).
I don't follow. Can you explain? How does a king moving from e1 to g1 capture a white rook on h1 or f1?

The problem with flags is that the data model becomes redundant. This creates possibilities for subtle bugs where the flag is not consistent with the (fsq, tsq, prom) data members. In a model that has no redundancy, this kind of bug simply cannot exist.
That's a great point. Single point of truth.

castling in the normal position becomes e1h1 or e1a1, to generalise to FR castling, if file(king)>file(rook) -> queenside, king ends c, rook ends d. else king ends g, rook ends f.

michiguel · Post by **michiguel** » Tue Apr 22, 2014 4:00 am

rreagan wrote:
lucasart wrote: * move is castling if king captures his own rook (handles chess and chess960 in a unified way).
I don't follow. Can you explain? How does a king moving from e1 to g1 capture a white rook on h1 or f1?

The problem with flags is that the data model becomes redundant. This creates possibilities for subtle bugs where the flag is not consistent with the (fsq, tsq, prom) data members. In a model that has no redundancy, this kind of bug simply cannot exist.
That's a great point. Single point of truth.

I like very much the opposite: to have a move with all the information. It allows me to have move-processing functions that would not depend on the board or anything else. That allows a better encapsulation, and better sanity checks if the idea is to avoid bugs. It is also more flexible because it is less dependent, and improves chances of modularity. I do not think there is one "truth" in this.

All the decoding is done through macros (from = MV_FROM(move); etc.)

Miguel

Steve Maughan · Post by **Steve Maughan** » Tue Apr 22, 2014 4:21 am

For Maverick I have a move directory for all possible moves. So my move encoding is a simple pointer.

Steve

Move encoding

Move encoding

Re: Move encoding

Re: Move encoding

Re: Move encoding

Re: Move encoding

Re: Move encoding

Re: Move encoding

Re: Move encoding

Re: Move encoding

Re: Move encoding