Serialization symmetry between black and white

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

gcahilly
Posts: 7
Joined: Wed Mar 10, 2021 6:47 pm
Full name: Glen Cahilly

Serialization symmetry between black and white

Post by gcahilly »

Hey all,

I've been using "__builtin_ctzll()" to bitscanforward and serialize bitboards for purposes of move generation. However, this has led to an extreme disparity between generation times for white and for black. Black move generation takes around 8-9x as much time as white move generation. I tried using "__builtin_clzll()" (bsr) for black, but this didn't really change anything. I've poured over CPW and other parts of the web for hours now and can't seem to find anything that works. What do you guys recommend?

Thank you in advance!
mar
Posts: 2559
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Serialization symmetry between black and white

Post by mar »

I don't observe any such discrepancy, at least not in perft.
are you sure your measurements are correct?
Martin Sedlak
gcahilly
Posts: 7
Joined: Wed Mar 10, 2021 6:47 pm
Full name: Glen Cahilly

Re: Serialization symmetry between black and white

Post by gcahilly »

mar wrote: Tue Mar 16, 2021 8:11 pm I don't observe any such discrepancy, at least not in perft.
are you sure your measurements are correct?
You're absolutely correct, my bad. My tests were incorrect.

That said, I have a few questions on move generation efficiency.

1) should I simply be happy and move on with my ~2 million positions per second (in Perft functions)?

2) is this process efficient for generating, say, knight moves: serialize knight BB, then find the possible moves using a precomputed mask, then calculate (mask & ~squaresOccupiedBySameColor) to prevent it from capturing friendly pieces, then finally serialize this end result and store the indices in a vector? Or is all of this serialization inefficient and unnecessary?

3) last one: is this following code the most efficient way to serialize a BB?

Code: Select all

vector<int> serialize(U64 bitboard) {
	vector<int> list;
	while (bitboard) {
	   list.push_back(__builtin_ctzll(bitboard)); // determine bit index of least significant one bit
	   bitboard &= bitboard-1; // reset LS1B
	}
	return list;
}
mar
Posts: 2559
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Serialization symmetry between black and white

Post by mar »

gcahilly wrote: Wed Mar 17, 2021 4:40 am

Code: Select all

vector<int> serialize(U64 bitboard) {
	vector<int> list;
	while (bitboard) {
	   list.push_back(__builtin_ctzll(bitboard)); // determine bit index of least significant one bit
	   bitboard &= bitboard-1; // reset LS1B
	}
	return list;
}
ouch, don't do this. don't use std::vector to hold moves!

why:
- allocated on heap (unless you use a custom allocator which you don't) => implies extra heap deallocation when your done with the vector as temporary buffer
- push_back grows capacity exponentially but you still get a reallocation/copy a couple of times (unless the compiler is super-smart)
so I bet the slowest part of your serialization loop is actually push_back

for regular chess, maximum # of moves per position is 218 I think (256 a safe bet), std::array is your friend, allocated on the stack somewhere
EDIT: I see now that you use this to extract non-zero bits, not to generate moves yet. in that case 64 is obviously the maximum the buffer needs to hold
Martin Sedlak
gcahilly
Posts: 7
Joined: Wed Mar 10, 2021 6:47 pm
Full name: Glen Cahilly

Re: Serialization symmetry between black and white

Post by gcahilly »

mar wrote: Wed Mar 17, 2021 6:25 am
gcahilly wrote: Wed Mar 17, 2021 4:40 am

Code: Select all

vector<int> serialize(U64 bitboard) {
	vector<int> list;
	while (bitboard) {
	   list.push_back(__builtin_ctzll(bitboard)); // determine bit index of least significant one bit
	   bitboard &= bitboard-1; // reset LS1B
	}
	return list;
}
ouch, don't do this. don't use std::vector to hold moves!

why:
- allocated on heap (unless you use a custom allocator which you don't) => implies extra heap deallocation when your done with the vector as temporary buffer
- push_back grows capacity exponentially but you still get a reallocation/copy a couple of times (unless the compiler is super-smart)
so I bet the slowest part of your serialization loop is actually push_back

for regular chess, maximum # of moves per position is 218 I think (256 a safe bet), std::array is your friend, allocated on the stack somewhere
EDIT: I see now that you use this to extract non-zero bits, not to generate moves yet. in that case 64 is obviously the maximum the buffer needs to hold

Thank you so much, mar! I did as you said and changed to std::array. My move generation is way faster now!

I don't want to bother you too much, but I have been having issues with Quiescence search.

Here is my minimax code. I know it's really ugly that I'm using tuples for storing moves, I plan on changing it soon :D (prototype now, optimize later):

Code: Select all

int minimax(Board b, int depth, int alpha, int beta, bool white) {

    array<tuple<int,int,char,bool,bool,bool>, 256> listOfPseudoMoves = generatePseudoMoves(b, white);

    array<tuple<int,int,char,bool,bool,bool>, 256> listOfLegalMoves = generateLegalMoves(b, listOfPseudoMoves, white);
	
	if ((get<0>(listOfLegalMoves[0])<0) or (get<1>(listOfLegalMoves[0])<0)) {
		//no moves
		if (isInCheck(b, white)) {
			//checkmate
			if (white) {
				return -1000000;
			} else {
				return 1000000;
			}
		} else {
			return 0;
		}
	}

	if (depth == 0) {
		return quiesceEval(b, alpha, beta, white);
	}

	if (white) {
		int maxEval = -1000000;
		
		for (int i=0; i<256; i++) {
			int startSq = get<0>(listOfLegalMoves[i]);
	    		int endSq = get<1>(listOfLegalMoves[i]);
	    		if (startSq>=0 and endSq>=0) {
		    		char movingPiece = get<2>(listOfLegalMoves[i]);
		    		bool promotion = get<3>(listOfLegalMoves[i]);
		    		bool castling = get<4>(listOfLegalMoves[i]);
		    		bool enPassant = get<5>(listOfLegalMoves[i]);

		    		Board c = b;
				c.makeMove(startSq, endSq, movingPiece, promotion, castling, enPassant);

				int eval;

				U64 TTIndex = c.hash%TTSize;
				int TTDepth = transpositionTable[TTIndex].depth;
				int TTEval = transpositionTable[TTIndex].eval;
			

				if (TTDepth>=depth-1 and TTDepth!=-1) {
					//use TT info
					eval = TTEval;
					numTranspositions++;
				} else {
					eval=minimax(c, depth-1, alpha, beta, !white);
					
					//overwrite transposition table entry
					transpositionTable[TTIndex].depth = depth-1;
					transpositionTable[TTIndex].eval = eval;
				}

				alpha = max(alpha, eval);
				maxEval = max(maxEval, eval);
				if (beta <= alpha) {
					i=256; //BREAK
				}				

			} else {
				i=256; //BREAK
			}
		}
		return maxEval;
	}

	else {
		int minEval = 1000000;
		for (int i=0; i<256; i++) {
			int startSq = get<0>(listOfLegalMoves[i]);
	    		int endSq = get<1>(listOfLegalMoves[i]);
	    		if (startSq>=0 and endSq>=0) {
		    		char movingPiece = get<2>(listOfLegalMoves[i]);
		    		bool promotion = get<3>(listOfLegalMoves[i]);
		    		bool castling = get<4>(listOfLegalMoves[i]);
		    		bool enPassant = get<5>(listOfLegalMoves[i]);

		    		Board c = b;
				c.makeMove(startSq, endSq, movingPiece, promotion, castling, enPassant);

				int eval;

				U64 TTIndex = c.hash%TTSize;
				int TTDepth = transpositionTable[TTIndex].depth;
				int TTEval = transpositionTable[TTIndex].eval;
			
				if (TTDepth>=depth-1 and TTDepth!=-1) {
					//use TT info
					eval = TTEval;
					numTranspositions++;

				} else {
					eval=minimax(c, depth-1, alpha, beta, !white);
					
					//overwrite transposition table entry
					transpositionTable[TTIndex].depth = depth-1;
					transpositionTable[TTIndex].eval = eval;
				}

				beta = min(beta, eval);

				minEval = min(minEval, eval);

				if (beta <= alpha) {
					i=256;
				}

			} else {
				i=256;
			}
		}
		return minEval;
	}
}
This minimax code was working fine and the engine was rated around 1500 when I simply used a static evaluation at depth=0.

But now that I changed that depth=0 instruction to Quiesce, it doesn't work.

I've looked around quite a bit on the web and have spent a lot of time tinkering with my own code to try to fix it, but nothing has worked.

Here is my Quiesce code:

Code: Select all


int quiesceEval(Board b, int alpha, int beta, bool white) {
  
    int score = evaluation(b, white);

    if (score >= beta) return score;

    array<tuple<int,int,char,bool,bool,bool>, 256> pseudoCaptures = generatePseudoCaptures(b, white);
    array<tuple<int,int,char,bool,bool,bool>, 256> legalCaptures = generateLegalMoves(b, pseudoCaptures, white);

    for (int i=0; i<256; i++) {

       	Board c = b;
        int startSq = get<0>(legalCaptures[i]);
    	int endSq = get<1>(legalCaptures[i]);
    	if (startSq>=0 and endSq>=0) {
	    	char movingPiece = get<2>(legalCaptures[i]);
	    	bool promotion = get<3>(legalCaptures[i]);
	    	bool castling = get<4>(legalCaptures[i]);
	    	bool enPassant = get<5>(legalCaptures[i]);

	    	c.makeMove(startSq, endSq, movingPiece, promotion, castling, enPassant); 

	        score = -quiesceEval(c, -beta, -alpha, !white);

	        if (score >= alpha) {
	            alpha = score;
	            if (score >= beta) {
	            	i=256;
	            }
	        }

	    } else {
	    	i=256;
	    }
    }
    return score;
}


Again, any help would be very much appreciated! I'm more than willing to explain any parts of my code if they don't make sense to you, just ask :) .

-Glen
mar
Posts: 2559
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Serialization symmetry between black and white

Post by mar »

gcahilly wrote: Sat Mar 20, 2021 2:36 am Again, any help would be very much appreciated! I'm more than willing to explain any parts of my code if they don't make sense to you, just ask :) .

-Glen
You're pushing your luck. You should debug your code, not waste other people's time. Is this some form of sophisticated trolling?

Your quiescence uses negamax, but your search doesn't - I suggest you rewrite what you call search (minimax) to use negamax as well => you'll save those extra ifs and reduce
the probability of introducing new bugs in the future, because you won't have to duplicate large chunks of code that only differ in alpha/beta logic. just do negamax where the logic is the same => you always improve alpha from stm's point of view.
EDIT: so, if you want to "fix" the call to quiesce, you should probably add extra if and call negamaxed version of quiesce if stm is black
Martin Sedlak
gcahilly
Posts: 7
Joined: Wed Mar 10, 2021 6:47 pm
Full name: Glen Cahilly

Re: Serialization symmetry between black and white

Post by gcahilly »

mar wrote: Sat Mar 20, 2021 11:10 am
gcahilly wrote: Sat Mar 20, 2021 2:36 am Again, any help would be very much appreciated! I'm more than willing to explain any parts of my code if they don't make sense to you, just ask :) .

-Glen
You're pushing your luck. You should debug your code, not waste other people's time. Is this some form of sophisticated trolling?

Your quiescence uses negamax, but your search doesn't - I suggest you rewrite what you call search (minimax) to use negamax as well => you'll save those extra ifs and reduce
the probability of introducing new bugs in the future, because you won't have to duplicate large chunks of code that only differ in alpha/beta logic. just do negamax where the logic is the same => you always improve alpha from stm's point of view.
EDIT: so, if you want to "fix" the call to quiesce, you should probably add extra if and call negamaxed version of quiesce if stm is black
Really sorry. I spent many hours debugging and couldn't figure it out. Sorry for wasting your time. I really appreciate your help and will try extra hard in the future to debug before looking for help.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Serialization symmetry between black and white

Post by Sven »

Your QS function returns "score" which is a random score (of the last subtree visited, unless there was a beta cutoff). You need to return the best score which is either "alpha" or a "bestScore" you track separately. And that is already my second remark: your QS uses a mixture of fail-hard and fail-soft alpha-beta. In a consistent fail-hard implementation, for instance, you would return beta (not score) in case of a cutoff, and otherwise alpha.

EDIT: And as already suggested, your QS uses negamax so it returns a score from the moving side's viewpoint while your minimax search always returns a score from the viewpoint of White. So in minimax the return value of QS needs to be negated when it is Black's turn.
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)