knbk DTM

Discussion of chess software programming and technical issues.

Moderator: Ras

mar
Posts: 2654
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

knbk DTM

Post by mar »

I recently wrote a simple dumb knbk DTM generator mostly for fun, my DTM aren't minimal but it's still enough to win with depth 1 search
I think did my best but it still takes ~500ms to generate it on my machine (single thread), gcc is even 70ms worse than microsoft compiler for some reason.

I'm using 3-way symmetry for one of the kings (horizontal, vertical and diagonal)
for both sides combined, the number of indices is 5242880 (I didn't bother to remove invalid dup indices as that would make things a lot more complicated) - ~5MB of RAM is what I'd be willing to sacrifice.

That 500-600ms still way too much to generate at engine startup and I'd like to avoid loading/embedding comressed data, even for one side that compressed best that would be some ~500kb and I'd have to do a 1-ply search for the other side when probing.

I surely could parallelize the init but that would be bad anyway if someone would like to play many games at once, but probably the least "bad" solution.
I certainly could run a background thread, but that might take away resources from the opponent, which is not what I want to do.

So my question is - is it possible to generate knbk much faster than in 500ms on a single thread? I just want to make sure I'm not missing something obvious.
I do simple retrograde with two bitsets to mark next unassigned indices (nodes) to process, swapped at each iteration along with stm - the bitset iteration is then optimized by using 64-bit packed ints and iterating the standard bitboard way

the price for near-perfection seems a bit too high and I should've probably spent some time improving my heuristics instead.
User avatar
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: knbk DTM

Post by hgm »

IIRC the 'PrettyFast' generator by Marcel van Kervinck did it in 110ms on my 2.2GHz laptop.
mar
Posts: 2654
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: knbk DTM

Post by mar »

hgm wrote: Wed May 05, 2021 2:43 pm IIRC the 'PrettyFast' generator by Marcel van Kervinck did it in 110ms on my 2.2GHz laptop.
wow that's fast. I guess about an order of magnitude faster than mine - exactly what I'm aiming for. so perhaps there's hope yet.
EDIT: was it knbk though? I found this on Marcel's github https://github.com/kervinck/pfkpk and it looks like a kpk generator?
User avatar
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: knbk DTM

Post by hgm »

I am sure it was KBNK. I even converted it to a symmetry-less version that ran in ~600ms, as a first step in my on-the-fly EGT generator (for doing Pawn slices). Which never seems to be high enough on my to-do list to make much progress. I will try to find it.
User avatar
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: knbk DTM

Post by hgm »

This is it:

Code: Select all

/*
 *	pfkbnk.c (pretty fast kbnk generator)
 *
 *	This demonstration program solves the kbnk
 *	endgame by means of the bitmapped database
 *	method as proposed by Urban Koistinen.
 *
 *	written by Marcel van Kervinck
 *	<marcelk@stack.urc.tue.nl>
 *
 *	18 March 1996
 */

/*
 *	platform			cputime (seconds)
 *
 *	68000-7MHz Amiga		1602
 *	486-66MHz FreeBSD		45
 *	Sun Sparc SLC			36
 *	Pentium-90MHz FreeBSD		16
 *	SG Challenge IRIX		12
 */


/* include the usual stuff */

#include <stdlib.h>
#include <stdio.h>
#include <time.h>


/* some 64 bit definitions for gcc, change these for other compilers */

typedef unsigned long long bitmap;
#define I  0x0000000000000001ULL
#define B5 0xffffffff00000000ULL
#define B4 0xffff0000ffff0000ULL
#define B3 0xff00ff00ff00ff00ULL
#define B2 0xf0f0f0f0f0f0f0f0ULL
#define B1 0xccccccccccccccccULL
#define B0 0xaaaaaaaaaaaaaaaaULL
#define W  0x7f7f7f7f7f7f7f7fULL
#define E  0xfefefefefefefefeULL


/* bit tricks */

#define FIRST(b) ((b)&(~(b)+1))	/* return first bit */
	/* compiler should be smart enough to generate b&-b */

#define DROP(b) ((b)&=((b)-1))	/* drop first bit */


/* general macros */

#define MAX(a,b) ((a)>(b)?(a):(b))
#define MIN(a,b) ((a)<(b)?(a):(b))

#define TIME (double)(GetRealTime()-starttime)


/* chess macros */

#define STEP_N(b) ((b) << 8)
#define STEP_S(b) ((b) >> 8)
#define STEP_W(b) (((b) & W) << 1)
#define STEP_E(b) (((b) & E) >> 1)

#define STEP_NE(b) STEP_N(STEP_E(b))
#define STEP_NW(b) STEP_N(STEP_W(b))
#define STEP_SE(b) STEP_S(STEP_E(b))
#define STEP_SW(b) STEP_S(STEP_W(b))

#define KING(b) \
	( STEP_N(STEP_W(b) | STEP_E(b) | (b)) \
	| STEP_W(b) | STEP_E(b) \
	| STEP_S(STEP_W(b) | STEP_E(b) | (b)) )

#define KNIGHT(b) \
	( STEP_N( STEP_N(STEP_W(b)|STEP_E(b)) \
		| STEP_W(STEP_W(b)) \
		| STEP_E(STEP_E(b)) ) \
	| STEP_S( STEP_S(STEP_W(b)|STEP_E(b)) \
		| STEP_W(STEP_W(b)) \
		| STEP_E(STEP_E(b)) ) )

/* memory structures */

bitmap wtm[10][64][64][2];	/* wtm database, 2 bits per position */
				/* 0: score>=mate    1: score<mate   */
				/* 2: score=mate     3: score=mate-1 */

bitmap mask[2][64];		/* bitnumber to bitmap */
short slide[64][4][8];		/* sliding moves */
short king[64][9];		/* king moves */
short knight[64][9];		/* knight moves */

short map[64] = {		/* map king to database index */
	0,1,2,3,0,0,0,0,
	0,4,5,6,0,0,0,0,
	0,0,7,8,0,0,0,0,
	0,0,0,9,0,0,0,0,
	0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,
};

short mir[64] = {		/* reflection bits */
	0,0,0,0,1,1,1,1,
	4,0,0,0,1,1,1,5,
	4,4,0,0,1,1,5,5,
	4,4,4,0,1,5,5,5,
	6,6,6,2,3,7,7,7,
	6,6,2,2,3,3,7,7,
	6,2,2,2,3,3,3,7,
	2,2,2,2,3,3,3,3,
};

short dia[64] = {		/* reflection over diagonal */
	0, 8,16,24,32,40,48,56,
	1, 9,17,25,33,41,49,57,
	2,10,18,26,34,42,50,58,
	3,11,19,27,35,43,51,59,
	4,12,20,28,36,44,52,60,
	5,13,21,29,37,45,53,61,
	6,14,22,30,38,46,54,62,
	7,15,23,31,39,47,55,63,
};

short factor[64] = {		/* multiplication for position counts */
	4,8,8,8,8,8,8,4,
	8,4,8,8,8,8,4,8,
	8,8,4,8,8,4,8,8,
	8,8,8,4,4,8,8,8,
	8,8,8,4,4,8,8,8,
	8,8,4,8,8,4,8,8,
	8,4,8,8,8,8,4,8,
	4,8,8,8,8,8,8,4,
};


short bitcount(bitmap b)	/* count number of bits */
{
	short i;

	for(i=0; b; b&=b-1)
		i++;

	return i;
}


unsigned init(void)		/* init global data */
{
	bitmap wkb, wbb, wnb, b, c, ray[64][8];
	short wk, wb, wn, i, *m;
	unsigned count;


	/* move tables and others */

	for(wb=0, wbb=I; wb<64; wbb<<=1, wb++) {

		for(b=wbb, i=0; ~b&STEP_NE(b); b|=STEP_NE(b), i++)
			slide[wb][0][i] = wb+7+7*i;
		ray[wb][0] = b & ~wbb;
		slide[wb][0][i] = -1;

		for(b=wbb, i=0; ~b&STEP_NW(b); b|=STEP_NW(b), i++)
			slide[wb][1][i] = wb+9+9*i;
		ray[wb][1] = b & ~wbb;
		slide[wb][1][i] = -1;

		for(b=wbb, i=0; ~b&STEP_SW(b); b|=STEP_SW(b), i++)
			slide[wb][2][i] = wb-7-7*i;
		ray[wb][2] = b & ~wbb;
		slide[wb][2][i] = -1;

		for(b=wbb, i=0; ~b&STEP_SE(b); b|=STEP_SE(b), i++)
			slide[wb][3][i] = wb-9-9*i;
		ray[wb][3] = b & ~wbb;
		slide[wb][3][i] = -1;

		for(i=0, m=king[wb]; i<64; i++)
		if(i!=wb && abs((i%8)-(wb%8))<=1 && abs((i/8)-(wb/8))<=1)
			*m++ = i;
		*m = -1;

		for(i=0, m=knight[wb]; i<64; i++)
		if(abs( ((i%8)-(wb%8)) * ((i/8)-(wb/8)) )==2)
			*m++ = i;
		*m = -1;

		mask[0][wb] = I<<wb;
		mask[1][wb] = ~(I<<wb);

	}


	/* mark relevant illegal positions */

	count = 0;
	for(wk=0, wkb=I; wk<64; wkb<<=1, wk++) if(!mir[wk])
	for(wb=0, wbb=I; wb<64; wbb<<=1, wb++)
	for(wn=0, wnb=I; wn<64; wnb<<=1, wn++) {

		b = KING(wkb) | KNIGHT(wnb);
		for(i=0; i<4; i++) {
			c = ray[wb][i];
			if(c & wkb)
				c &= ~ray[wk][i];
			if(c & wnb)
				c &= ~ray[wn][i];
			b |= c;
		}
		b &= ~wkb;

		wtm[map[wk]][wb][wn][0] = b;
		wtm[map[wk]][wb][wn][1] = 0;
		count += factor[wk]
			* (wk==wb || wk==wn || wb==wn ? 64 : bitcount(b | wkb | wbb | wnb) );
	}

	return count;
}


unsigned new2old(void)	/* mark new found positions as old */
{
	short wk, wb, wn;
	unsigned count;
	bitmap b;

	count = 0;
	for(wk=0; wk<64; wk++) if(!mir[wk])
	for(wb=0; wb<64; wb++)
	for(wn=0; wn<64; wn++) {
		b = ~wtm[map[wk]][wb][wn][0] & wtm[map[wk]][wb][wn][1];
		wtm[map[wk]][wb][wn][0] |= b;
		wtm[map[wk]][wb][wn][1] &= b;
		count += factor[wk] * bitcount(b);
	}
	return count;
}


short index(bitmap b)	/* return bitnumber (pre: bitcount(b)==1) */
{
	return	(b & B5 ? 32 : 0)
	+	(b & B4 ? 16 : 0)
	+	(b & B3 ? 8 : 0)
	+	(b & B2 ? 4 : 0)
	+	(b & B1 ? 2 : 0)
	+	(b & B0 ? 1 : 0);
}


void enter(short wk, short wb, short wn, bitmap b)	/* add wtm positions */
{
	bitmap c;

	if(b) {
		if(mir[wk] & 1) {	/* mirror vertical */
			wk ^= 007;
			wb ^= 007;
			wn ^= 007;
			for(c=b, b=0; c; DROP(c))
				b |= mask[0][index(FIRST(c))^007];
		}
		if(mir[wk] & 2) {	/* mirror horizontal */
			wk ^= 070;
			wb ^= 070;
			wn ^= 070;
			for(c=b, b=0; c; DROP(c))
				b |= mask[0][index(FIRST(c))^070];
		}
		if(mir[wk] & 4) {	/* mirror diagonal */
			wk = dia[wk];
			wb = dia[wb];
			wn = dia[wn];
			for(c=b, b=0; c; DROP(c))
				b |= mask[0][dia[index(FIRST(c))]];
		}

		if( (b &= ~wtm[map[wk]][wb][wn][0]
			& ~wtm[map[wk]][wb][wn][1]) ) {	/* which are new? */

			wtm[map[wk]][wb][wn][1] |= b;	/* enter those */

			if(dia[wk]==wk) {		/* and the doubles */
				wb = dia[wb];
				wn = dia[wn];
				for(c=0; b; DROP(b))
					c |= mask[0][dia[index(FIRST(b))]];
				wtm[map[wk]][wb][wn][1] |= c;
			}
		}
	}
}


void retro_btm(short wk, short wb, short wn, bitmap b)	/* process btm positions */
{
	short i, *m;

	/* find all white moves that lead to the btm positions */


		/* white bishop */

	for(i=0; i<4; i++)
	for(m=slide[wb][i]; *m>=0 && *m!=wk && *m!=wn; m++)
		enter(wk,*m,wn,b & mask[1][*m]);


		/* white knight */

	for(m=knight[wn]; *m>=0; m++)
	if(*m!=wk && *m!=wb)
		enter(wk,wb,*m,b & mask[1][*m]);


		/* white king */

	for(m=king[wk]; *m>=0; m++)
	if(*m!=wb && *m!=wn)
		enter(*m,wb,wn,b & mask[1][*m]);
}


void retro_wtm(short wk, short wb, short wn, bitmap b)	/* process wtm positions */
{
	/* find forced black moves that lead to the wtm positions */


		/* black king */

	if( (b = KING(b) & mask[1][wb] & mask[1][wn]
		& ~KING(~wtm[map[wk]][wb][wn][0])) )
		retro_btm(wk,wb,wn,b);
}


#include <sys/time.h>
#include <sys/times.h>
#include <unistd.h>
int GetRealTime()
{
  struct timeval t;
  gettimeofday(&t, NULL);
  return t.tv_sec*1000000 + t.tv_usec;
}

int main(void)					/* compute database */
{
	bitmap b;
	short wk, wb, wn;
	unsigned count, total;
	int mate;
	time_t starttime;


	/* init */

	starttime = GetRealTime(NULL);

	printf("kbnk wtm\n");
	printf(" status    count time\n");

	total = init();

	printf("illegal %8u (%1.0f)\n", total, TIME);


	/* enter mates */

	mate = 1;

	for(wk=0; wk<64; wk++) if(!mir[wk])
	for(wb=0; wb<64; wb++) if(wb!=wk) if(dia[wk]!=wk || dia[wb]<=wb)
	for(wn=0; wn<64; wn++) if(wn!=wk && wn!=wb) if(dia[wk]!=wk || dia[wb]!=wb || dia[wn]<=wn)
		if( (b = wtm[map[wk]][wb][wn][0] & ~KING(~wtm[map[wk]][wb][wn][0])) )
			retro_btm(wk,wb,wn,b);


	/* iterate */

	while( (count = new2old()) ) {

		total += count;

//		printf("mate%03d %8u (%1.0f)\n", mate, count, TIME);
		count = 0;

		mate++;

		for(wk=0; wk<64; wk++) if(!mir[wk])
		for(wb=0; wb<64; wb++) if(dia[wk]!=wk || dia[wb]<=wb)
		for(wn=0; wn<64; wn++) if(dia[wk]!=wk || dia[wb]!=wb || dia[wn]<=wn)
			if( (b = wtm[map[wk]][wb][wn][0] & wtm[map[wk]][wb][wn][1]) )
				retro_wtm(wk,wb,wn,b);
	}


	/* count left overs */

	for(wk=0; wk<64; wk++) if(!mir[wk])
	for(wb=0; wb<64; wb++) if(wk!=wb)
	for(wn=0; wn<64; wn++) if(wk!=wn && wb!=wn)
		count += factor[wk]
			* bitcount(~wtm[map[wk]][wb][wn][0]
					& mask[1][wk] & mask[1][wb] & mask[1][wn]);

	printf("no_mate %8u (%1.0f)\n", count, TIME);
	total += count;


	/* print total */

	printf("  total %8u (%1.0f)\n", total, TIME);


	/* be happy */

	return 0;
}
mar
Posts: 2654
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: knbk DTM

Post by mar »

hgm wrote: Wed May 05, 2021 3:18 pm This is it:
...snip...
thanks!
it seems Marcel is doing something pretty clever there.
do I understand correctly that the black king is encoded in bits and the information is relative? I count 2 bits, 2 bitboards should cover all 64 squares?
also this TB seems to only occupy 600kb, very interesting, will have to wrap my head around it first
EDIT: and it dates 1996! amazing
User avatar
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: knbk DTM

Post by hgm »

Indeed, the EGT can be seen as a collection of bitboards for the black King. In end-games with a bare King that makes it cheap to detect 'escapes' to non-won positions, which must be in the same bitboard. And all black retro-moving is also just bit manipulation in those bitboards. Only white moves require memory access.