I have eliminated ALL if statements from the move generator :)

Discussion of chess software programming and technical issues.

Moderator: Ras

Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: I have eliminated ALL if statements from the move generator :)

Post by Mike Sherwin »

I also eliminated the if statements in the AttackedByWhite() and AttackedByBlack() functions. The disadvantage would be no early out if attacked. However, if not attacked then there are no early outs anyway. I think most of the time there are no attacks. It returns 0 if there are no attacks or instead the number of attacks. And the number of attacks could be useful. Anyone have any thoughts about the tradeoffs?

Code: Select all

s32 AttackedByWhite(Thread* t, u64 bb) {
  s32 n = 0;
  u64 occ = piece[BLACK] | piece[WHITE];

  do {
	s32 sq = std::countr_zero(bb);
	n = n
	  + (knightMoves[sq] & pctypbb[WN])
	  + (bPawnCapts[sq] & pctypbb[WP])
	  + (kingMoves[sq] & pctypbb[WK])
	  + ((ray[std::countr_zero(ray[sq].NW & occ)].NW
		| ray[std::countr_zero(ray[sq].NE & occ)].NE
		| ray[std::countl_zero(ray[sq].RSE & occ)].SE
		| ray[std::countl_zero(ray[sq].RSW & occ)].SW
		^ bishopMoves[sq])
		& (pctypbb[WB] | pctypbb[WQ]))
	  + ((ray[std::countr_zero(ray[sq].NN & occ)].NN
		| ray[std::countr_zero(ray[sq].EE & occ)].EE
		| ray[std::countl_zero(ray[sq].RSS & occ)].SS
		| ray[std::countl_zero(ray[sq].RWW & occ)].WW
		^ bishopMoves[sq])
		& (pctypbb[WR] | pctypbb[WQ]));
	bb ^= 1ull << sq;
  } while (bb);

  return n;
}
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: I have eliminated ALL if statements from the move generator :)

Post by dangi12012 »

Mike Sherwin wrote: Sat Aug 20, 2022 5:51 am I also eliminated the if statements in the AttackedByWhite() and AttackedByBlack() functions. The disadvantage would be no early out if attacked. However, if not attacked then there are no early outs anyway. I think most of the time there are no attacks. It returns 0 if there are no attacks or instead the number of attacks. And the number of attacks could be useful. Anyone have any thoughts about the tradeoffs?

Code: Select all

s32 AttackedByWhite(Thread* t, u64 bb) {
  s32 n = 0;
  u64 occ = piece[BLACK] | piece[WHITE];

  do {
	s32 sq = std::countr_zero(bb);
	n = n
	  + (knightMoves[sq] & pctypbb[WN])
	  + (bPawnCapts[sq] & pctypbb[WP])
	  + (kingMoves[sq] & pctypbb[WK])
	  + ((ray[std::countr_zero(ray[sq].NW & occ)].NW
		| ray[std::countr_zero(ray[sq].NE & occ)].NE
		| ray[std::countl_zero(ray[sq].RSE & occ)].SE
		| ray[std::countl_zero(ray[sq].RSW & occ)].SW
		^ bishopMoves[sq])
		& (pctypbb[WB] | pctypbb[WQ]))
	  + ((ray[std::countr_zero(ray[sq].NN & occ)].NN
		| ray[std::countr_zero(ray[sq].EE & occ)].EE
		| ray[std::countl_zero(ray[sq].RSS & occ)].SS
		| ray[std::countl_zero(ray[sq].RWW & occ)].WW
		^ bishopMoves[sq])
		& (pctypbb[WR] | pctypbb[WQ]));
	bb ^= 1ull << sq;
  } while (bb);

  return n;
}
What is pctypbb? Enemy & Piece BB?
And how can you return a 32s integern when adding 64bit variables n + 64bit?
If you just want to know how often a square is attacked its would be quick to OR everything together and just std::popcount() in the end.

The tradeoff here is that the most efficient code is to not need this method like implemented above at all.

AttackedByWhite() from the kings perspective can become really efficient. (this should maybe even be the only invocation even)
You send out queen and knight and pawn ('rays') from the kings square and & with the appropriate enemy piece. Like above - without the loop and with OR and std::popcount() not with +..+..+

AttackedByWhite() for every other square can be become really effective by just remembering the attacked squares from the (own color) a ply earlier and incrementally adjusting for the only piece that moved. I once implemented this but it was a pain to maintin - because a piece that moves unblocks other pieces (but only sliders) - so you have to send out slider rays for any moved piece to have a maintained 'isAttackedBy' information all the time!
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: I have eliminated ALL if statements from the move generator :)

Post by Mike Sherwin »

dangi12012 wrote: Sat Aug 20, 2022 12:23 pm
Mike Sherwin wrote: Sat Aug 20, 2022 5:51 am I also eliminated the if statements in the AttackedByWhite() and AttackedByBlack() functions. The disadvantage would be no early out if attacked. However, if not attacked then there are no early outs anyway. I think most of the time there are no attacks. It returns 0 if there are no attacks or instead the number of attacks. And the number of attacks could be useful. Anyone have any thoughts about the tradeoffs?

Code: Select all

s32 AttackedByWhite(Thread* t, u64 bb) {
  s32 n = 0;
  u64 occ = piece[BLACK] | piece[WHITE];

  do {
	s32 sq = std::countr_zero(bb);
	n = n
	  + (knightMoves[sq] & pctypbb[WN])
	  + (bPawnCapts[sq] & pctypbb[WP])
	  + (kingMoves[sq] & pctypbb[WK])
	  + ((ray[std::countr_zero(ray[sq].NW & occ)].NW
		| ray[std::countr_zero(ray[sq].NE & occ)].NE
		| ray[std::countl_zero(ray[sq].RSE & occ)].SE
		| ray[std::countl_zero(ray[sq].RSW & occ)].SW
		^ bishopMoves[sq])
		& (pctypbb[WB] | pctypbb[WQ]))
	  + ((ray[std::countr_zero(ray[sq].NN & occ)].NN
		| ray[std::countr_zero(ray[sq].EE & occ)].EE
		| ray[std::countl_zero(ray[sq].RSS & occ)].SS
		| ray[std::countl_zero(ray[sq].RWW & occ)].WW
		^ bishopMoves[sq])
		& (pctypbb[WR] | pctypbb[WQ]));
	bb ^= 1ull << sq;
  } while (bb);

  return n;
}
What is pctypbb? Enemy & Piece BB?
And how can you return a 32s integern when adding 64bit variables n + 64bit?
If you just want to know how often a square is attacked its would be quick to OR everything together and just std::popcount() in the end.

The tradeoff here is that the most efficient code is to not need this method like implemented above at all.

AttackedByWhite() from the kings perspective can become really efficient. (this should maybe even be the only invocation even)
You send out queen and knight and pawn ('rays') from the kings square and & with the appropriate enemy piece. Like above - without the loop and with OR and std::popcount() not with +..+..+

AttackedByWhite() for every other square can be become really effective by just remembering the attacked squares from the (own color) a ply earlier and incrementally adjusting for the only piece that moved. I once implemented this but it was a pain to maintin - because a piece that moves unblocks other pieces (but only sliders) - so you have to send out slider rays for any moved piece to have a maintained 'isAttackedBy' information all the time!
Sorry, I copied the wrong code that did not have != 0. Here is the more correct code.

Code: Select all

s32 AttackedByWhite(Thread* t, u64 bb) {

  s32 n = 0;
  u64 occ = piece[BLACK] | piece[WHITE];

  do {
	s32 sq = std::countr_zero(bb);
	n = n
	  + (knightMoves[sq] & pctypbb[WN] != 0)
	  + (bPawnCapts[sq] & pctypbb[WP] != 0)
	  + (kingMoves[sq] & pctypbb[WK] != 0)
	  + ((ray[std::countr_zero(ray[sq].NW & occ)].NW
		| ray[std::countr_zero(ray[sq].NE & occ)].NE
		| ray[std::countl_zero(ray[sq].RSE & occ)].SE
		| ray[std::countl_zero(ray[sq].RSW & occ)].SW
		^ bishopMoves[sq])
		& (pctypbb[WB] | pctypbb[WQ]) != 0)
	  + ((ray[std::countr_zero(ray[sq].NN & occ)].NN
		| ray[std::countr_zero(ray[sq].EE & occ)].EE
		| ray[std::countl_zero(ray[sq].RSS & occ)].SS
		| ray[std::countl_zero(ray[sq].RWW & occ)].WW
		^ bishopMoves[sq])
		& (pctypbb[WR] | pctypbb[WQ]) != 0);
	bb ^= 1ull << sq;
  } while (bb);

  return n;
}
What is pctypbb?
piece type BB array, so pctypbb[WP] are all the white pawns.

Enemy & Piece BB?
piece[BLACK] is a bitboard of all black pieces
piece[WHITE] is a bitboard of all white pieces

And how can you return a 32s integern when adding 64bit variables n + 64bit?
like I said above, I copied the wrong code that did not have the != 0 requirement.

If you just want to know how often a square is attacked its would be quick to OR everything together and just std::popcount() in the end.
Yes, that makes sense. I'll probably change it.

I know that I do not need the loop. But in this case multiple squares can be sent in a bb for castling or to count attacks to the king's shelter, etc.

The tradeoff that I was most interested in was early out vs no early out. So like anything else I do it takes a lot of reworking. How would you write this function? But the main point I was making is that even the AttackedBy() or InCheck() function can be written branchless!
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: I have eliminated ALL if statements from the move generator :)

Post by Mike Sherwin »

Code: Select all

s32 AttackedByWhite(Thread* t, u64 bb) {
  s32 n = 0;
  u64 bbn;
  u64 occ = piece[BLACK] | piece[WHITE];

  do {
	s32 sq = std::countr_zero(bb);
	bbn = (knightMoves[sq] & pctypbb[WN])
	    | (bPawnCapts[sq] & pctypbb[WP])
	    | (kingMoves[sq] & pctypbb[WK])
	    | ((ray[std::countr_zero(ray[sq].NW & occ)].NW
		  | ray[std::countr_zero(ray[sq].NE & occ)].NE
		  | ray[std::countl_zero(ray[sq].RSE & occ)].SE
		  | ray[std::countl_zero(ray[sq].RSW & occ)].SW
		  ^ bishopMoves[sq])
		  & (pctypbb[WB] | pctypbb[WQ]))
	    | ((ray[std::countr_zero(ray[sq].NN & occ)].NN
		  | ray[std::countr_zero(ray[sq].EE & occ)].EE
		  | ray[std::countl_zero(ray[sq].RSS & occ)].SS
		  | ray[std::countl_zero(ray[sq].RWW & occ)].WW
		  ^ bishopMoves[sq])
		  & (pctypbb[WR] | pctypbb[WQ]));
	n += std::popcount(bbn);
	bb ^= 1ull << sq;
  } while (bb);

  return n;
}
Hopefully this is correct and a little better?
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: I have eliminated ALL if statements from the move generator :)

Post by Mike Sherwin »

I have finally been able to start debugging my new code. All it can do is accept moves and verify the move against moves from the move generator. CEP, e1g1 and promoting have been verified. I have only done enough at this moment to validate proof of concept. This is the run verifying castling.
e2e4
r n b q k b n r
p p p p p p p p
. . . . . . . .
. . . . . . . .
. . . . P . . .
. . . . . . . .
P P P P . P P P
R N B Q K B N R

e7e5
r n b q k b n r
p p p p . p p p
. . . . . . . .
. . . . p . . .
. . . . P . . .
. . . . . . . .
P P P P . P P P
R N B Q K B N R

g1f3
r n b q k b n r
p p p p . p p p
. . . . . . . .
. . . . p . . .
. . . . P . . .
. . . . . N . .
P P P P . P P P
R N B Q K B . R

b8c6
r . b q k b n r
p p p p . p p p
. . n . . . . .
. . . . p . . .
. . . . P . . .
. . . . . N . .
P P P P . P P P
R N B Q K B . R

f1c4
r . b q k b n r
p p p p . p p p
. . n . . . . .
. . . . p . . .
. . B . P . . .
. . . . . N . .
P P P P . P P P
R N B Q K . . R

f8c5
r . b q k . n r
p p p p . p p p
. . n . . . . .
. . b . p . . .
. . B . P . . .
. . . . . N . .
P P P P . P P P
R N B Q K . . R

e1g1
r . b q k . n r
p p p p . p p p
. . n . . . . .
. . b . p . . .
. . B . P . . .
. . . . . N . .
P P P P . P P P
R N B Q . R K .

undo
r . b q k . n r
p p p p . p p p
. . n . . . . .
. . b . p . . .
. . B . P . . .
. . . . . N . .
P P P P . P P P
R N B Q K . . R

undo
r . b q k b n r
p p p p . p p p
. . n . . . . .
. . . . p . . .
. . B . P . . .
. . . . . N . .
P P P P . P P P
R N B Q K . . R

undo
r . b q k b n r
p p p p . p p p
. . n . . . . .
. . . . p . . .
. . . . P . . .
. . . . . N . .
P P P P . P P P
R N B Q K B . R

undo
r n b q k b n r
p p p p . p p p
. . . . . . . .
. . . . p . . .
. . . . P . . .
. . . . . N . .
P P P P . P P P
R N B Q K B . R

undo
r n b q k b n r
p p p p . p p p
. . . . . . . .
. . . . p . . .
. . . . P . . .
. . . . . . . .
P P P P . P P P
R N B Q K B N R

undo
r n b q k b n r
p p p p p p p p
. . . . . . . .
. . . . . . . .
. . . . P . . .
. . . . . . . .
P P P P . P P P
R N B Q K B N R

undo
r n b q k b n r
p p p p p p p p
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
P P P P P P P P
R N B Q K B N R

Here is the GetCommand() so far.

Code: Select all

void GetCommand(Thread* t) {
  s08 line[256], cmd[256], a_move, okay, fs, ts, fs2;
  uMove m;
  u64 pieces;
  s32 ok;

  if (!fgets(line, 256, stdin)) return;
  ok = sscanf(line, "%s", cmd);
  cmd[255] = '\0';

  a_move = true;
  if (line[0] < 'a' || line[0] > 'h') a_move = false;
  if (line[1] < '1' || line[1] > '8') a_move = false;
  if (line[2] < 'a' || line[2] > 'h') a_move = false;
  if (line[3] < '1' || line[3] > '8') a_move = false;

  if (a_move) {
	okay = GenAllMoves(t);
	if (!okay) {
	  printf("Illegal Move/n");
	  return;
	}
	fs = (line[1] - '1') * 8 + (line[0] - 'a');
	ts = (line[3] - '1') * 8 + (line[2] - 'a');
	a_move = false;
	pieces = piece[stm];
	while (pieces) {
	  fs2 = std::countr_zero(pieces);
	  pieces ^= 1ull << fs2;
	  if (fs == fs2) {
		if (gbb[ply][fs] & (1ull << ts)) {
		  a_move = true;
		  break;
		}
	  }
	}
	if (a_move) {
	  m = FillMoveFields(t, fs, ts);
	  MakeMove(t, &m);
	  g[gly] = m;
	  gly++;
	}
	PrintBoard(t);
	return;
  }

  if (!strcmp(cmd, "undo")) {
	gly--;
	m = g[gly];
	TakeBack(t, &m);
	PrintBoard(t);
	return;
  }

}
Castling was not working so I broke the code down to see why it was not working. There were missing (). For the programmer that is new to branchless programming the break down of the code may be easier to understand. Here is that code snippet.

Code: Select all

 
    case WKC:
      gbb[ply][fs] = kingMoves[fs] & notme;
      abb[ply] |= kingMoves[fs];
      tmp = (board[h1] == WRC);
      tmp += ((occ & SWCS) == 0);
      tmp += (AttackedByBlack(t, AWCS) == 0);
      tmp = (tmp == 3);
      gbb[ply][fs] |=  tmp << g1;
      tmp = ((board[a1] == WRC) + (occ & SWCL == 0) + (AttackedByBlack(t, AWCL) == 0) == 3);
      gbb[ply][fs] |= tmp << c1;
      break;
When moves are generated only the bitboards are generated and stored. A move structure has to be filled in and a pointer sent to MakeMove(); Here is that function. It looks long but the pathway through it is very short. It changes the from type [ft] for special types like cep, e1g1 and promotion so no time is lost in MakeMove() trying to figure that out.

Code: Select all

uMove FillMoveFields(Thread* t, s08 fs, s08 ts) {
  uMove m;

  m.s.fs = fs;
  m.s.ts = ts;
  m.s.tt = board[ts];

  m.s.ft = board[fs];
  switch (m.s.ft) {
  case ES:
	// can't get here
	break;
  case WP:
	switch (fs >> 3) {
	case RANK1:
	  // can't get here
	  break;
	case RANK2:
	  m.s.ft += ((fs + 16 == ts) * (WPD - WP));
	  break;
	case RANK3:
	case RANK4:
	  // no action
	  break;
	case RANK5:
	  m.s.ft += ((ts != fs + 8 && !m.s.tt) * (WPE - WP));
	  break;
	case RANK6:
	  // no action
	  break;
	case RANK7:
	  m.s.ft = WPQ;
	  break;
	}
	break;
  case WN:
  case WB:
  case WRC:
  case WR:
  case WQ:
	// no action
	break;
  case WKC:
	m.s.ft += ((ts == g1) * (WCS - WKC) + (ts == c1) * (WCL - WKC));
	break;
  case WK:
	// no action
	break;
  case BP:
	switch (fs >> 3) {
	case RANK1:
	  // can't get here
	  break;
	case RANK2:
	  m.s.ft = BPQ;
	  break;
	case RANK3:
	  // no action
	  break;
	case RANK4:
	  m.s.ft += ((ts != fs - 8 && !m.s.tt) * (BPE - BP));
	  break;
	case RANK5:
	case RANK6:
	  // no action
	  break;
	case RANK7:
	  m.s.ft += ((fs - 16 == ts) * (BPD - BP));
	  break;
	}
	break;
  case BN:
  case BB:
  case BRC:
  case BR:
  case BQ:
	// no action
	break;
  case BKC:
	m.s.ft += ((ts == g8) * (BCS - BKC) + (ts == c8) * (BCL - BKC));
	break;
  }

  return m;
}
Everything still needs more work. Like I said this is just for proof of concept so far.
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: I have eliminated ALL if statements from the move generator :)

Post by Mike Sherwin »

I forgot to mention that when verifying a move I cycled through all the piece bits on purpose to find a matching fs. I did it that way just to verify that the structure was sound. The much faster way is just to make sure a piece of the correct type is on the fs and then check gbb[ply[fs] to see if the ts bit is set which is way faster.
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: I have eliminated ALL if statements from the move generator :)

Post by dangi12012 »

Mike Sherwin wrote: Thu Aug 25, 2022 6:52 am I forgot to mention that when verifying a move I cycled through all the piece bits on purpose to find a matching fs. I did it that way just to verify that the structure was sound. The much faster way is just to make sure a piece of the correct type is on the fs and then check gbb[ply[fs] to see if the ts bit is set which is way faster.
Your post inspired me to improve work speed on my projects. I have been working on optimal chess code for a long time now and here is what currently works best (matching the theme of branchlessness)

I use this in my code:

Code: Select all

struct Position; //Canonical Quadboard. 256bits
struct BinaryMove
{
	uint64_t clear[4];
	uint64_t set[4];
};
Position apply_move(Position& pos, const BinaryMove& move)
{
	const uint64_t* source = (uint64_t*)(&pos);
	return
	{
		(source[0] & move.clear[0]) | move.set[0],
		(source[1] & move.clear[1]) | move.set[1],
		(source[2] & move.clear[2]) | move.set[2],
		(source[3] & move.clear[3]) | move.set[3]
	};
}
Which compiles down to this when used in actual code:

Code: Select all

vandps  ymm0, ymm0, ymmword ptr [rsi]
vorps   ymm0, ymm0, ymmword ptr [rdx + 32]
It includes promotion, enpassant, castling and all other special cases as well as these flags: Color, 4 Castling bits, Enpassant target.
So truly a code with no special cases - applicable when you generate all possible movelists accessible from a square.
The movegenerator returns the approriate pointer to the prepared movelist which is also presorted to have captures first.

For example a move from or to H1 can unconditionally clear the castling ability for that side - which correspons to move.clear[0] |= 1ull << 56. This is done at compiletime - or initialisation time.

Master expample:
Branchless can I capture Enpassant looks like this:

Code: Select all

const bool ep_l = ((p_l << 1) & ep) != 0;
const bool epl_ok = (own_king_info.see_H(occ & ~((p_l << 1) | p_l)) & enemy_HV) == 0;
const bool ep_r = ((p_r >> 1) & ep) != 0;
const bool epr_ok = (own_king_info.see_H(occ & ~((p_r >> 1) | p_r)) & enemy_HV) == 0;
When combining the two OK bits into an index. You can have a prepared MoveList*[4] you can directly index into from that calculated branchless index. The MoveList at that location will be 0 to 2 BinaryMove
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: I have eliminated ALL if statements from the move generator :)

Post by Mike Sherwin »

Of course I do not understand all of that. My code for capture enpassant detection is only

m.s.ft += ((ts != fs + 8 && !m.s.tt) * (WPE - WP));

The ft (from type) starts off as a WP (white pawn) and changes to a WPE (WP enpassant capture) if the pawn move is not straight forward and the ts is empty. And then in MakeMove() the switch just jumps straight to the WPE case.