Fastest pawn quiet move generation I was able to come with

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Fastest pawn quiet move generation I was able to come with

Post by cdani »

Sure someone can improve this, but I think is really fast.

Code: Select all

b = bitboard of pawns
increment = 8 or 16

	uint8_t cas;
	uint8_t casd;
	switch (popcount(b)) {
	case 8:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 7:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 6:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 5:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 4:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 3:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 2:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 1:
		cas = lsb(b);
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 0:
		return ml;
	default: //> 8 pawns for strange games or positions
	altremovpeo2 :
		if (b == 0)
			return ml;
		cas = lsb(b);
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
		b &= b - 1;
		goto altremovpeo2;
	}
mkchan
Posts: 88
Joined: Thu Oct 06, 2016 9:17 pm
Location: India

Re: Fastest pawn quiet move generation I was able to come wi

Post by mkchan »

Could do for brevity:

Code: Select all

case 8:
case 7:
case 6:
...
case 2:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml->moviment = MakeMove(cas, casd);
      ml++;
case 1:
...
elcabesa
Posts: 855
Joined: Sun May 23, 2010 1:32 pm

Re: Fastest pawn quiet move generation I was able to come wi

Post by elcabesa »

cdani wrote:Sure someone can improve this, but I think is really fast.

Code: Select all

b = bitboard of pawns
increment = 8 or 16
...
you unrolled the loop, how much faster it's respect to the original loop?
And the most important question, how much elo you gained agaisnt a more readable loop?
User avatar
Kotlov
Posts: 266
Joined: Fri Jul 10, 2015 9:23 pm
Location: Russia

Re: Fastest pawn quiet move generation I was able to come wi

Post by Kotlov »

"switch - case" is slow construction
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Fastest pawn quiet move generation I was able to come wi

Post by mar »

cdani wrote:Sure someone can improve this, but I think is really fast.
Well, a Duff machine :)
But where do you check that the squares in front of the pawns are empty?
I assume you mask-shift vacated squares in advance, like

Code: Select all

pawns & shift_backward(~occupied)
What was the performance gain versus a loop? (and overall speed gain?)
User avatar
hgm
Posts: 27789
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Fastest pawn quiet move generation I was able to come wi

Post by hgm »

Rather than incrementing ml all the time, you could have stored the moves in the opposite order, and hard-code the offset w.r.t. the inintial ml:

Code: Select all

switch((p = popcount(b))) {
   case 8:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml[7].moviment = MakeMove(cas, casd);
   case 7:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml[6].moviment = MakeMove(cas, casd);
...
  case 0:
    return ml += p;
It is questionable whether an optimizer could do that from your code.
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Fastest pawn quiet move generation I was able to come wi

Post by syzygy »

hgm wrote:It is questionable whether an optimizer could do that from your code.
The optimizer will certainly not generate code that stores moves in the opposite order.
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Fastest pawn quiet move generation I was able to come wi

Post by mar »

hgm wrote:Rather than incrementing ml all the time, you could have stored the moves in the opposite order, and hard-code the offset w.r.t. the inintial ml:

Code: Select all

switch((p = popcount(b))) {
   case 8:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml[7].moviment = MakeMove(cas, casd);
   case 7:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml[6].moviment = MakeMove(cas, casd);
...
  case 0:
    return ml += p;
It is questionable whether an optimizer could do that from your code.
Hmm, I liked your original suggestion:

Code: Select all

ml += (p = popcount(b));
switch(p) {
case 8:
      ....
      ml[-8].m = ...;
...
case 0:
      return ml;
}
User avatar
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Fastest pawn quiet move generation I was able to come wi

Post by cdani »

mkchan wrote:Could do for brevity:

Code: Select all

case 8:
case 7:
case 6:
...
case 2:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml->moviment = MakeMove(cas, casd);
      ml++;
case 1:
...
This does not work, as the idea is that case 8 executes the code in case 8 + code in case 7 + code in case 6..., and is not doing this with your code.

Kotlov wrote:"switch - case" is slow construction
Yes, but with jump tables is quite fast.

Code: Select all

	switch (popcount(b)) {
000000013FC0826B  movzx       ecx,al  
000000013FC0826E  cmp         ecx,8  
000000013FC08271  ja          peons_moviments_quiets_o_sense_peoalpas+13Bh (013FC0839Bh)  
000000013FC08277  mov         eax,ecx  
000000013FC08279  lea         rcx,[__rg_language (013FBF0000h)]  
000000013FC08280  mov         r9d,dword ptr [rcx+rax*4+183C8h]  
000000013FC08288  add         r9,rcx  
000000013FC0828B  jmp         r9  
mar wrote: But where do you check that the squares in front of the pawns are empty?
I assume you mask-shift vacated squares in advance, like

Code: Select all

pawns & shift_backward(~occupied)
Yes, b variable has only pawns that can advance.
elcabesa wrote: you unrolled the loop, how much faster it's respect to the original loop?
And the most important question, how much elo you gained agaisnt a more readable loop?
mar wrote: What was the performance gain versus a loop? (and overall speed gain?)
Of course minimal, something like 1100000 n/s to 1100500 n/s to give an idea. Is more for fun than for anything other, and also is interesting to share and to have other ideas. Any elo gain is due to accumulating a few of those things.
hgm wrote:Rather than incrementing ml all the time, you could have stored the moves in the opposite order, and hard-code the offset w.r.t. the inintial ml:
mar wrote: Hmm, I liked your original suggestion:
Nice one! I will try, sure is even better! Thanks.
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Fastest pawn quiet move generation I was able to come wi

Post by syzygy »

mar wrote:Hmm, I liked your original suggestion:

Code: Select all

ml += (p = popcount(b));
switch(p) {
case 8:
      ....
      ml[-8].m = ...;
...
case 0:
      return ml;
}
OK, that is something the optimizer probably could do, in theory.