Fastest pawn quiet move generation I was able to come with

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
User avatar
cdani
Posts: 2104
Joined: Sat Jan 18, 2014 9:24 am
Location: Andorra
Contact:

Fastest pawn quiet move generation I was able to come with

Post by cdani » Sat Jun 10, 2017 4:54 pm

Sure someone can improve this, but I think is really fast.

Code: Select all

b = bitboard of pawns
increment = 8 or 16

	uint8_t cas;
	uint8_t casd;
	switch (popcount(b)) {
	case 8:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 7:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 6:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 5:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 4:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 3:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 2:
		cas = lsb(b);
		b &= b - 1;
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 1:
		cas = lsb(b);
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
	case 0:
		return ml;
	default: //> 8 pawns for strange games or positions
	altremovpeo2 :
		if (b == 0)
			return ml;
		cas = lsb(b);
		casd = cas + increment;
		ml->moviment = MakeMove(cas, casd);
		ml++;
		b &= b - 1;
		goto altremovpeo2;
	}

mkchan
Posts: 83
Joined: Thu Oct 06, 2016 7:17 pm
Location: India
Contact:

Re: Fastest pawn quiet move generation I was able to come wi

Post by mkchan » Sat Jun 10, 2017 5:03 pm

Could do for brevity:

Code: Select all

case 8:
case 7:
case 6:
...
case 2:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml->moviment = MakeMove(cas, casd);
      ml++;
case 1:
...

elcabesa
Posts: 806
Joined: Sun May 23, 2010 11:32 am
Contact:

Re: Fastest pawn quiet move generation I was able to come wi

Post by elcabesa » Sat Jun 10, 2017 5:10 pm

cdani wrote:Sure someone can improve this, but I think is really fast.

Code: Select all

b = bitboard of pawns
increment = 8 or 16
...
you unrolled the loop, how much faster it's respect to the original loop?
And the most important question, how much elo you gained agaisnt a more readable loop?

User avatar
Kotlov
Posts: 209
Joined: Fri Jul 10, 2015 7:23 pm
Location: Russia

Re: Fastest pawn quiet move generation I was able to come wi

Post by Kotlov » Sat Jun 10, 2017 5:33 pm

"switch - case" is slow construction

mar
Posts: 1992
Joined: Fri Nov 26, 2010 1:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Fastest pawn quiet move generation I was able to come wi

Post by mar » Sat Jun 10, 2017 5:35 pm

cdani wrote:Sure someone can improve this, but I think is really fast.
Well, a Duff machine :)
But where do you check that the squares in front of the pawns are empty?
I assume you mask-shift vacated squares in advance, like

Code: Select all

pawns & shift_backward(~occupied)
What was the performance gain versus a loop? (and overall speed gain?)

User avatar
hgm
Posts: 23635
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller
Contact:

Re: Fastest pawn quiet move generation I was able to come wi

Post by hgm » Sat Jun 10, 2017 6:08 pm

Rather than incrementing ml all the time, you could have stored the moves in the opposite order, and hard-code the offset w.r.t. the inintial ml:

Code: Select all

switch((p = popcount(b))) {
   case 8:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml[7].moviment = MakeMove(cas, casd);
   case 7:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml[6].moviment = MakeMove(cas, casd);
...
  case 0:
    return ml += p;
It is questionable whether an optimizer could do that from your code.

syzygy
Posts: 4451
Joined: Tue Feb 28, 2012 10:56 pm

Re: Fastest pawn quiet move generation I was able to come wi

Post by syzygy » Sat Jun 10, 2017 6:13 pm

hgm wrote:It is questionable whether an optimizer could do that from your code.
The optimizer will certainly not generate code that stores moves in the opposite order.

mar
Posts: 1992
Joined: Fri Nov 26, 2010 1:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Fastest pawn quiet move generation I was able to come wi

Post by mar » Sat Jun 10, 2017 6:15 pm

hgm wrote:Rather than incrementing ml all the time, you could have stored the moves in the opposite order, and hard-code the offset w.r.t. the inintial ml:

Code: Select all

switch((p = popcount(b))) {
   case 8:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml[7].moviment = MakeMove(cas, casd);
   case 7:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml[6].moviment = MakeMove(cas, casd);
...
  case 0:
    return ml += p;
It is questionable whether an optimizer could do that from your code.
Hmm, I liked your original suggestion:

Code: Select all

ml += (p = popcount(b));
switch(p) {
case 8:
      ....
      ml[-8].m = ...;
...
case 0:
      return ml;
}

User avatar
cdani
Posts: 2104
Joined: Sat Jan 18, 2014 9:24 am
Location: Andorra
Contact:

Re: Fastest pawn quiet move generation I was able to come wi

Post by cdani » Sat Jun 10, 2017 6:23 pm

mkchan wrote:Could do for brevity:

Code: Select all

case 8:
case 7:
case 6:
...
case 2:
      cas = lsb(b);
      b &= b - 1;
      casd = cas + increment;
      ml->moviment = MakeMove(cas, casd);
      ml++;
case 1:
...
This does not work, as the idea is that case 8 executes the code in case 8 + code in case 7 + code in case 6..., and is not doing this with your code.

Kotlov wrote:"switch - case" is slow construction
Yes, but with jump tables is quite fast.

Code: Select all

	switch (popcount(b)) {
000000013FC0826B  movzx       ecx,al  
000000013FC0826E  cmp         ecx,8  
000000013FC08271  ja          peons_moviments_quiets_o_sense_peoalpas+13Bh (013FC0839Bh)  
000000013FC08277  mov         eax,ecx  
000000013FC08279  lea         rcx,[__rg_language (013FBF0000h)]  
000000013FC08280  mov         r9d,dword ptr [rcx+rax*4+183C8h]  
000000013FC08288  add         r9,rcx  
000000013FC0828B  jmp         r9  
mar wrote: But where do you check that the squares in front of the pawns are empty?
I assume you mask-shift vacated squares in advance, like

Code: Select all

pawns & shift_backward(~occupied)
Yes, b variable has only pawns that can advance.
elcabesa wrote: you unrolled the loop, how much faster it's respect to the original loop?
And the most important question, how much elo you gained agaisnt a more readable loop?
mar wrote: What was the performance gain versus a loop? (and overall speed gain?)
Of course minimal, something like 1100000 n/s to 1100500 n/s to give an idea. Is more for fun than for anything other, and also is interesting to share and to have other ideas. Any elo gain is due to accumulating a few of those things.
hgm wrote:Rather than incrementing ml all the time, you could have stored the moves in the opposite order, and hard-code the offset w.r.t. the inintial ml:
mar wrote: Hmm, I liked your original suggestion:
Nice one! I will try, sure is even better! Thanks.

syzygy
Posts: 4451
Joined: Tue Feb 28, 2012 10:56 pm

Re: Fastest pawn quiet move generation I was able to come wi

Post by syzygy » Sat Jun 10, 2017 6:51 pm

mar wrote:Hmm, I liked your original suggestion:

Code: Select all

ml += (p = popcount(b));
switch(p) {
case 8:
      ....
      ml[-8].m = ...;
...
case 0:
      return ml;
}
OK, that is something the optimizer probably could do, in theory.

Post Reply