OpenCL perft() Technical Issues

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Avoiding pseudolegal move generation

Post by sje »

Oscar started with a pseudolegal move generator I copied out of some very old code of mine. It's simple and it works as long as each move is explicitly tested after generation. This is not such a great idea for a perft() program for a number of efficiency reasons. For that matter, I don't think it' s such a good idea for a regular chess search. I've long been a proponent of legal-only move generation, feeling that some extra programming effort up front can pay off later with better overall efficiency and a cleaner design.

So I'm now fixing the move generator to produce only legal moves and have completed this for other than check evasion which needs more work. A big benefit here is that bulk counting is much easier to implement with legal only output.

Running using a single core on an old 2.66 GHz Xeon 5150, Oscar calculated perft(8) = 84,998,978,956 in 53m28s for a rate of about 26.5 MHz. This with no bitboards and no transposition tables.
Last edited by sje on Thu Aug 28, 2014 5:14 pm, edited 1 time in total.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

The log for the above

Post by sje »

Code: Select all

gail:tmp sje$ time ./operft "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1" 5
4865609

real	0m0.178s
user	0m0.175s
sys	0m0.002s
gail:tmp sje$ time ./operft "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1" 6
119060324

real	0m4.292s
user	0m4.288s
sys	0m0.003s
gail:tmp sje$ time ./operft "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1" 7
3195901860

real	2m0.178s
user	2m0.020s
sys	0m0.154s
gail:tmp sje$ time ./operft "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1" 8
84998978956                        

real	53m28.451s
user	53m25.682s
sys	0m2.541s
The real testing will come only after I get Oscar running under OpenCL. I'll then release the source so that others may join in the fun.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

With an old 400 MHz PowerPC

Post by sje »

Running on a 400 MHz 32 bit PowerPC CPU made in 2000, Oscar can manage a rate of 2.6 MHz for perft(7):

Code: Select all

ruthanne:~/tmp sje$ time ./operft "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1" 7
3195901860

real	24m15.948s
user	20m25.197s
sys	0m9.604s
The source code is identical for all platforms.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

With a 1.8 GHz 32 bit Intel Atom N270

Post by sje »

7.7 MHz with a 1.8 GHz 32 bit Intel Atom N270:

Code: Select all

sje@brigid:~/tmp$ time ./operft "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1" 7
3195901860

real	6m55.462s
user	6m54.662s
sys	0m0.104s
If I can dig up the right cables and wall warts, I'll try Oscar on a Raspberry Pi and a BeagleBone Black.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

C struct types as classes

Post by sje »

While Oscar is written entirely in ANSI C and not C++, it is still an object oriented program with objects being instances of nine chess related C struct types:

Code: Select all

// SAN record

typedef struct
{
  char chvec[SanLen];
} SanRec;

// FEN record

typedef struct
{
  char chvec[FenLen];
} FenRec;

// Chess position FEN scalar set

typedef struct
{
  Color good;
  Color evil;
  Cabs  cabs;
  Sq    epsq;
  ui    hmvc;
  ui    fmvn;
} Fss;

// Chess position auxiliary data

typedef struct
{
  bool incheck;
  bool dbcheck;
  TiBV pinned;
  TiBV frozen;
} Apd;

// Chessboard

typedef struct
{
  Man manvec[SqLen];
} Board;

// Position preservation data

typedef struct
{
  Fss    fss;
  Apd    apd;
  Move   move;
} Ppd;

// Ppd stack

typedef struct
{
  ui  count;
  Ppd ppdvec[PlyLen];
} PpdStk;

// Tracker

typedef struct
{
  Sq sqvec[TiLen];
  Ti tivec[SqLen];
  Sq captsqvec[TiLen];
  Ti capttivec[TiLen];
  ui captcount;
} Tracker;

// Chess position

typedef struct
{
  Fss     fss;
  Apd     apd;
  Board   board;
  Tracker tracker;
  PpdStk  ppdstk;
} Position;
For each of the nine chess struct types, there is at least one routine which takes a pointer to an type instance as the routine's first argument. These routines are effectively instance methods, if C had instance methods.

The object method routines:

Code: Select all

static void SanRecReset(SanRec * const sanrecptr)
static void SanRecLoadFromMove(SanRec * const sanrecptr, const Move move)

static void FenRecReset(FenRec * const fenrecptr)
static void FenRecLoadFromNts(FenRec * const fenrecptr, const char * const fenstr)
static void FenRecLoadFromPosition(FenRec * const fenrecptr, const Position * const positionptr)

static void FssReset(Fss * const fssptr)

static void ApdReset(Apd * const apdptr)

static void BoardReset(Board * const boardptr)
static bool BoardTestClearPath(const Board * const boardptr, const Dir dir, const Sq frsq, const Sq tosq)

static void TrackerReset(Tracker * const trackerptr)
static void TrackerLoadFromBoard(Tracker * const trackerptr, const Board * const boardptr)
static void TrackerCapture(Tracker * const trackerptr, const Sq sq)
static void TrackerRestore(Tracker * const trackerptr)
static void TrackerMovement(Tracker * const trackerptr, const Sq frsq, const Sq tosq)

static void PpdReset(Ppd * const ppdptr)

static void PpdStkReset(PpdStk * const ppdstkptr)

static void PositionReset(Position * const positionptr)
static bool PositionSqAttacksSq(const Position * const positionptr, const Sq frsq, const Sq tosq)
static bool PositionTiAttacksSq(const Position * const positionptr, const Ti frti, const Sq tosq)
static ui PositionCountColorAttacksToSq(const Position * const positionptr, const Color color, const Sq tosq)
static bool PositionColorAttacksSq(const Position * const positionptr, const Color color, const Sq tosq)
static Sq PositionGetGoodKingSq(const Position * const positionptr)
static Sq PositionGetEvilKingSq(const Position * const positionptr)
static bool PositionCalcEvilInCheck(const Position * const positionptr)
static void PositionRegenerateApd(Position * const positionptr)
static bool PositionLoadFromFenRec(Position * const positionptr, const FenRec * const fenrecptr)
static void PositionLoadInitialArray(Position * const positionptr)
static void PositionExecute(Position * const positionptr, const Move move)
static void PositionRetract(Position * const positionptr)
static bool PositionTestCastling(Position * const positionptr, const Castling castling)
static bool PositionTestMove(Position * const positionptr, Move const move)
static ui PositionGenerateEvasion(Position * const positionptr, Move * const baseptr)
static ui PositionGenerateNotInCheck(Position * const positionptr, Move * const baseptr)
static ui PositionGenerate(Position * const positionptr, Move * const baseptr)
static ui PositionCountMovesEvasion(Position * const positionptr)
static ui PositionCountMovesNotInCheck(Position * const positionptr)
static ui PositionCountMoves(Position * const positionptr)
static bool PositionHasNoMovesEvasion(Position * const positionptr)
static bool PositionHasNoMovesNotInCheck(Position * const positionptr)
static bool PositionHasNoMoves(Position * const positionptr)
static NodeCount PositionEnumerateMovePaths(Position * const positionptr, const ui draft)
Some utility routines:

Code: Select all

static ui Abs(const si value)
static bool IsCharDigit(const char ch)
static si CalcFileDelta(const Sq frsq, const Sq tosq)
static si CalcRankDelta(const Sq frsq, const Sq tosq)
static bool CalcAdjacentSqStatus(const Sq frsq, const Sq tosq)
static Dir CalcDir(const Sq frsq, const Sq tosq)
static Color MapCharToColor(const char ch)
static Man MapCharToMan(const char ch)
static File MapCharToFile(const char ch)
static Rank MapCharToRank(const char ch)
static Castling MapCharToCastling(const char ch)
static ui EncodeUnsignedInt(char * const chptr, const ui value)
static ui EncodeSq(char * const chptr, const Sq sq)
Initialization/termination routines:

Code: Select all

static void InitNext(void)
static void InitSqSqInfo(void)
static void InitCabsPreservation(void)
static void InitSweeperMask(void)
static void InitSweeperRuns(void)
static void Init(void)
static void Term(void)
The one and only externally available routine:

Code: Select all

NodeCount OscarPerft(const char * const fenstr, const ui draft)
{
  Position position;
  FenRec fenrec;
  NodeCount nodecount;

  Init();
  FenRecLoadFromNts(&fenrec, fenstr);
  PositionLoadFromFenRec(&position, &fenrec);
  nodecount = PositionEnumerateMovePaths(&position, draft);
  Term();
  return nodecount;
}
Sample method:

Code: Select all

static bool BoardTestClearPath(const Board * const boardptr, const Dir dir, const Sq frsq, const Sq tosq)
{
  // This routine checks for a path of vacant squares along a sweep direction.
  
  bool result = true;
  Sq *sqptr = SweeperPtrs[dir][frsq];
  Sq scansq = *sqptr++;

  while (result && (scansq != tosq))
  {
    if (IsManVacant(boardptr->manvec[scansq]))
      scansq = *sqptr++;
    else
      result = false;
  };
  return result;
}
Another sample:

Code: Select all

static bool PositionSqAttacksSq(const Position * const positionptr, const Sq frsq, const Sq tosq)
{
  bool result = false;
  const ui8 infobyte = SqSqInfo[frsq][tosq];

  if (infobyte)
  {
    Dir dir;

    switch (positionptr->board.manvec[frsq])
    {
      case ManWhitePawn:
        if (infobyte & InfoAtkfWP)
          result = true;
        break;

      case ManBlackPawn:
        if (infobyte & InfoAtkfBP)
          result = true;
        break;

      case ManWhiteKnight:
      case ManBlackKnight:
        dir = (Dir) (infobyte & InfoDirMsk);
        if (IsDirCrook(dir))
          result = true;
        break;

      case ManWhiteBishop:
      case ManBlackBishop:
        dir = (Dir) (infobyte & InfoDirMsk);
        if (IsDirDiago(dir) && BoardTestClearPath(&positionptr->board, dir, frsq, tosq))
          result = true;
        break;

      case ManWhiteRook:
      case ManBlackRook:
        dir = (Dir) (infobyte & InfoDirMsk);
        if (IsDirOrtho(dir) && BoardTestClearPath(&positionptr->board, dir, frsq, tosq))
          result = true;
        break;

      case ManWhiteQueen:
      case ManBlackQueen:
        dir = (Dir) (infobyte & InfoDirMsk);
        if (IsDirSweep(dir) && BoardTestClearPath(&positionptr->board, dir, frsq, tosq))
          result = true;
        break;

      case ManWhiteKing:
      case ManBlackKing:
        if (infobyte & InfoAdjSqs)
          result = true;
        break;

      default:
        break;
    };
  };
  return result;
}
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Oscar.h header file available for review

Post by sje »

For your review, the Oscar.h header file:

https://dl.dropboxusercontent.com/u/316 ... ar/Oscar.h

The contents will change slightly as OpenCL capability is added.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

operft.c source file available for review

Post by sje »

The operft.c source file is available for review:

https://dl.dropboxusercontent.com/u/316 ... r/operft.c

The above is just a simple main program for driving Oscar in non OpenCL mode from the command line, one perft() calculation at a time.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Running on a 3.4 GHz Core i5 (i5-4670)

Post by sje »

Running with a single thread on a 3.4 GHz Core i5 (i5-4670), another Oscar data point:

Code: Select all

amanda:tmp sje$ time ./operft "r4rk1/1pp1qppp/p1np1n2/2b1p1B1/2B1P1b1/P1NP1N2/1PP1QPPP/R4RK1 w - - 0 10" 7
287188994746

real	53m20.486s
user	52m2.747s
sys	0m0.307s
That's a rate of 92.0 MHz when calculating perft(7) for this fairly open position:
[d]r4rk1/1pp1qppp/p1np1n2/2b1p1B1/2B1P1b1/P1NP1N2/1PP1QPPP/R4RK1 w - - 0 10[/d]
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Oscar the Cat

Post by jdart »

I have four cats, and I also volunteer at the local animal shelter, where I get to see many more. Keeping them off the keyboard is a challenge. I also had to cover the power switch on my server, which is on top of the case, because they would jump up there and turn it off.

--Jon
User avatar
Ajedrecista
Posts: 1969
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Running on a 3.4 GHz Core i5 (i5-4670).

Post by Ajedrecista »

Hello Steven:

I suppose that it is without hash, right? I have just checked this position with 1 GB of hash in an Intel Pentium D930 (of year 2006) and JetChess 1.0.0.0 (single core, 32-bit) computed perft(7) in 475.013 seconds (0:06:55.013), that is, around 6.0459e+8 (leaf nodes)/second. Anyway, I hope you can improve your speed even more.

Regards from Spain.

Ajedrecista.