Page 1 of 3

Fastest pawn quiet move generation I was able to come with

Posted: Sat Jun 10, 2017 4:54 pm
by cdani
Sure someone can improve this, but I think is really fast.

Code: Select all

b = bitboard of pawns
increment = 8 or 16

	uint8_t cas;
	uint8_t casd;
	switch (popcount(b)) {
	case 8:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 7:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 6:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 5:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 4:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 3:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 2:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 1:
		cas = lsb(b);
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 0:
		return ml;
	default: //> 8 pawns for strange games or positions
	altremovpeo2 :
		if (b == 0)
			return ml;
		cas = lsb(b);
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
		b &= b - 1;
		goto altremovpeo2;
	}

Re: Fastest pawn quiet move generation I was able to come wi

Posted: Sat Jun 10, 2017 5:03 pm
by mkchan
Could do for brevity:

Code: Select all

case 8:
case 7:
case 6:
...
case 2:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml->moviment = MakeMove(cas, casd);
      ml++;
case 1:
...

Re: Fastest pawn quiet move generation I was able to come wi

Posted: Sat Jun 10, 2017 5:10 pm
by elcabesa
cdani wrote:Sure someone can improve this, but I think is really fast.

Code: Select all

b = bitboard of pawns
increment = 8 or 16
...
you unrolled the loop, how much faster it's respect to the original loop?
And the most important question, how much elo you gained agaisnt a more readable loop?

Re: Fastest pawn quiet move generation I was able to come wi

Posted: Sat Jun 10, 2017 5:33 pm
by Kotlov
"switch - case" is slow construction

Re: Fastest pawn quiet move generation I was able to come wi

Posted: Sat Jun 10, 2017 5:35 pm
by mar
cdani wrote:Sure someone can improve this, but I think is really fast.
Well, a Duff machine :)
But where do you check that the squares in front of the pawns are empty?
I assume you mask-shift vacated squares in advance, like

Code: Select all

pawns & shift_backward(~occupied)
What was the performance gain versus a loop? (and overall speed gain?)

Re: Fastest pawn quiet move generation I was able to come wi

Posted: Sat Jun 10, 2017 6:08 pm
by hgm
Rather than incrementing ml all the time, you could have stored the moves in the opposite order, and hard-code the offset w.r.t. the inintial ml:

Code: Select all

switch((p = popcount(b))) {
   case 8:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml[7].moviment = MakeMove(cas, casd);
   case 7:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml[6].moviment = MakeMove(cas, casd);
...
  case 0:
    return ml += p;
It is questionable whether an optimizer could do that from your code.

Re: Fastest pawn quiet move generation I was able to come wi

Posted: Sat Jun 10, 2017 6:13 pm
by syzygy
hgm wrote:It is questionable whether an optimizer could do that from your code.
The optimizer will certainly not generate code that stores moves in the opposite order.

Re: Fastest pawn quiet move generation I was able to come wi

Posted: Sat Jun 10, 2017 6:15 pm
by mar
hgm wrote:Rather than incrementing ml all the time, you could have stored the moves in the opposite order, and hard-code the offset w.r.t. the inintial ml:

Code: Select all

switch((p = popcount(b))) {
   case 8:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml[7].moviment = MakeMove(cas, casd);
   case 7:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml[6].moviment = MakeMove(cas, casd);
...
  case 0:
    return ml += p;
It is questionable whether an optimizer could do that from your code.
Hmm, I liked your original suggestion:

Code: Select all

ml += (p = popcount(b));
switch(p) {
case 8:
      ....
      ml[-8].m = ...;
...
case 0:
      return ml;
}

Re: Fastest pawn quiet move generation I was able to come wi

Posted: Sat Jun 10, 2017 6:23 pm
by cdani
mkchan wrote:Could do for brevity:

Code: Select all

case 8:
case 7:
case 6:
...
case 2:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml->moviment = MakeMove(cas, casd);
      ml++;
case 1:
...
This does not work, as the idea is that case 8 executes the code in case 8 + code in case 7 + code in case 6..., and is not doing this with your code.

Kotlov wrote:"switch - case" is slow construction
Yes, but with jump tables is quite fast.

Code: Select all

	switch (popcount(b)) {
000000013FC0826B  movzx       ecx,al  
000000013FC0826E  cmp         ecx,8  
000000013FC08271  ja          peons_moviments_quiets_o_sense_peoalpas+13Bh (013FC0839Bh)  
000000013FC08277  mov         eax,ecx  
000000013FC08279  lea         rcx,[__rg_language (013FBF0000h)]  
000000013FC08280  mov         r9d,dword ptr [rcx+rax*4+183C8h]  
000000013FC08288  add         r9,rcx  
000000013FC0828B  jmp         r9  
mar wrote: But where do you check that the squares in front of the pawns are empty?
I assume you mask-shift vacated squares in advance, like

Code: Select all

pawns & shift_backward(~occupied)
Yes, b variable has only pawns that can advance.
elcabesa wrote: you unrolled the loop, how much faster it's respect to the original loop?
And the most important question, how much elo you gained agaisnt a more readable loop?
mar wrote: What was the performance gain versus a loop? (and overall speed gain?)
Of course minimal, something like 1100000 n/s to 1100500 n/s to give an idea. Is more for fun than for anything other, and also is interesting to share and to have other ideas. Any elo gain is due to accumulating a few of those things.
hgm wrote:Rather than incrementing ml all the time, you could have stored the moves in the opposite order, and hard-code the offset w.r.t. the inintial ml:
mar wrote: Hmm, I liked your original suggestion:
Nice one! I will try, sure is even better! Thanks.

Re: Fastest pawn quiet move generation I was able to come wi

Posted: Sat Jun 10, 2017 6:51 pm
by syzygy
mar wrote:Hmm, I liked your original suggestion:

Code: Select all

ml += (p = popcount(b));
switch(p) {
case 8:
      ....
      ml[-8].m = ...;
...
case 0:
      return ml;
}
OK, that is something the optimizer probably could do, in theory.