CookieCat goes mulithreaded

Discussion of chess software programming and technical issues.

Moderators: hgm, Harvey Williamson, bob

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
Post Reply
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 6:43 pm

CookieCat goes mulithreaded

Post by sje » Sat Dec 31, 2011 10:15 am

CookieCat goes mulithreaded with the installation of a parallel perft() (bulk counting at penultimate nodes and no transposition assistance). As this is my first foray with multithreaded Pascal, I started with the simplest possible task; later, a multithreaded general search will be implemented.

The new multithreaded perft command mtperft has been tested on Mac OS/X, Linux, and (gasp!) Windows 7. Everything appears to work.

As a matter of personal preference, I try to minimize the amount of conditionally compiled code as I feel that too much of this leads to a hard-to-understand source which is also hard-to-fix. And if conditionally compiled code is needed, then it should be isolated to a small region of the source.

With CookieCat, the addition of multithreading has added one line of conditionally compiled code: the cthreads unit is referenced if the target is Unix-like. There is no way around this with the current implementation of Free Pascal.

There is a second line of conditionally compiled code: if the target is a PowerPC CPU, then the inline routine attribute is ignored. This is needed to work around a compiler bug.

That's it for conditionally compiled code, at least for now.

User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 6:43 pm

Re: CookieCat goes mulithreaded

Post by sje » Sat Dec 31, 2011 12:36 pm

Here's the multithreaded visit-every-node perft code. Note that the only thread specific system routine is BeginThread() called in the main driver once-per-move loop.

Code: Select all

  type

    { Perft block parameter record }

    perftblocktype =
      record
        ifen:      fentype; { Position FEN }
        draft:     Integer; { Draft of calculation }
        prior:     santype; { Prior move SAN }
        mpcount:   nctype;  { Result movepath count }
        doprint:   Boolean; { Subtotal output enable flag }
        completed: Boolean  { Completion flag }
      end;

    perftblockptrtype = ^perftblocktype;

  function PosPerftTask(ptr: Pointer): PtrInt;
    var
      perftblockptr: perftblockptrtype;
      pos: postype;

    function PosPerftSimpleFull(depth: Integer): nctype;
      var
        myresult: nctype;
        gms: gmstype;
        index: Integer;
    begin
      if depth = 0 then
        myresult := 1
      else
        begin
          myresult := 0;
          PosGenerate(pos, gms);
          with gms do
            for index := 0 to movecount - 1 do
              begin
                PosExecute(pos, moves[index]);
                Inc(myresult, PosPerftSimpleFull(depth - 1));
                PosRetract(pos)
              end
        end;
      PosPerftSimpleFull := myresult
    end; { PosPerftSimpleFull }

  begin
    perftblockptr := ptr;
    with perftblockptr^ do
      begin
        PosInit(pos);
        PosDecode(pos, ifen);
        mpcount := PosPerftSimpleFull(draft);
        PosTerm(pos);
        if doprint then
          writeln(prior, ' ', mpcount);
        completed := True
      end;
    PosPerftTask := 0
  end; { PosPerftTask }

  function PosPerftTaskDriver(var pos: postype; limit: Integer; printflag: Boolean): nctype;
    var
      index: Integer;
      gms: gmstype;
      blocks: array[gentype] of perftblocktype;

    function CalcFinished: Integer;
      var
        myresult: Integer;
        index: Integer;
    begin
      myresult := 0;
      for index := 0 to gms.movecount - 1 do
        if blocks[index].completed then
          Inc(myresult);
      CalcFinished := myresult
    end; { CalcFinished }

    function CalcSum: nctype;
      var
        myresult: nctype;
        index: Integer;
    begin
      myresult := 0;
      for index := 0 to gms.movecount - 1 do
        Inc(myresult, blocks[index].mpcount);
      CalcSum := myresult
    end; { CalcSum }

  begin
    PosMetaGenCanonical(pos, gms);
    with gms do
      for index := 0 to movecount - 1 do
        begin
          PosExecute(pos, moves[index]);
          with blocks[index] do
            begin
              ifen := PosEncode(pos);
              draft := limit - 1;
              prior := MoveEncode(moves[index]);
              mpcount := 0;
              doprint := printflag;
              completed := False;
              BeginThread(@PosPerftTask, @blocks[index])
            end;
          PosRetract(pos)
        end;
    while CalcFinished <> gms.movecount do
      Sleep&#40;100&#41;;
    PosPerftTaskDriver &#58;= CalcSum
  end; &#123; PosPerftTaskDriver &#125;

User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 6:43 pm

Limiting the thread count to the core count

Post by sje » Sun Jan 01, 2012 6:25 am

Here's a revised version of the code which limits the calculation thread count to the processor core count. Compared to spamming the processor(s) with a separate thread for each root move, this increases the overall speed, but not by very much.

The CatNap routine used here sleeps the calling thread for 100 milliseconds. RatNap goes for 10 milliseconds while DogNap snoozes for a full second.

Code: Select all

  function PosPerftTask&#40;ptr&#58; Pointer&#41;&#58; PtrInt;
    var
      perftblockptr&#58; perftblockptrtype;
      pos&#58; postype;

    function PosPerftSimpleFull&#40;depth&#58; Integer&#41;&#58; nctype;
      var
        myresult&#58; nctype;
        gms&#58; gmstype;
        index&#58; Integer;
    begin
      if depth = 0 then
        myresult &#58;= 1
      else
        begin
          myresult &#58;= 0;
          PosGenerate&#40;pos, gms&#41;;
          with gms do
            for index &#58;= 0 to movecount - 1 do
              begin
                PosExecute&#40;pos, moves&#91;index&#93;);
                Inc&#40;myresult, PosPerftSimpleFull&#40;depth - 1&#41;);
                PosRetract&#40;pos&#41;
              end
        end;
      PosPerftSimpleFull &#58;= myresult
    end; &#123; PosPerftSimpleFull &#125;

  begin
    perftblockptr &#58;= ptr;
    with perftblockptr^ do
      begin
        PosInit&#40;pos&#41;;
        PosDecode&#40;pos, ifen&#41;;
        mpcount &#58;= PosPerftSimpleFull&#40;draft&#41;;
        PosTerm&#40;pos&#41;;
        completed &#58;= True
      end;
    PosPerftTask &#58;= 0
  end; &#123; PosPerftTask &#125;

  function PosPerftTaskDriver&#40;var pos&#58; postype; limit&#58; Integer; printflag&#58; Boolean&#41;&#58; nctype;
    var
      myresult&#58; nctype;
      corecount, activecount&#58; Integer;
      moveindex&#58; Integer;
      gms&#58; gmstype;
      completedcount&#58; Integer;
      nextmoveindex&#58; Integer;
      coreindex&#58; Integer;
      activevec&#58; array&#91;coretype&#93; of Integer;
      blocks&#58; array&#91;gentype&#93; of perftblocktype;

    function FindFirstFree&#58; Integer;
      var
        myresult&#58; Integer;
        coreindex&#58; Integer;
    begin
      myresult &#58;= -1;
      coreindex &#58;= 0;
      while &#40;myresult < 0&#41; and &#40;coreindex < corecount&#41; do
        if activevec&#91;coreindex&#93; < 0 then
          myresult &#58;= coreindex
        else
          Inc&#40;coreindex&#41;;
      FindFirstFree &#58;= myresult
    end; &#123; FindFirstFree &#125;

    function FindFirstCompleted&#58; Integer;
      var
        myresult&#58; Integer;
        coreindex&#58; Integer;
    begin
      myresult &#58;= -1;
      coreindex &#58;= 0;
      while &#40;myresult < 0&#41; and &#40;coreindex < corecount&#41; do
        if &#40;activevec&#91;coreindex&#93; >= 0&#41; and blocks&#91;activevec&#91;coreindex&#93;&#93;.completed then
          myresult &#58;= coreindex
        else
          Inc&#40;coreindex&#41;;
      FindFirstCompleted &#58;= myresult
    end; &#123; FindFirstCompleted &#125;

    procedure Dispatch&#40;moveindex&#58; Integer&#41;;
      var
        coreindex&#58; Integer;
    begin
      coreindex &#58;= FindFirstFree;
      activevec&#91;coreindex&#93; &#58;= moveindex;
      with blocks&#91;moveindex&#93; do
        begin
          started &#58;= True;
          threadid &#58;= BeginThread&#40;@PosPerftTask, @blocks&#91;moveindex&#93;)
        end;
      Inc&#40;activecount&#41;
    end; &#123; Dispatch &#125;

    procedure Reclaim&#40;coreindex&#58; Integer&#41;;
    begin
      WaitForThreadTerminate&#40;blocks&#91;activevec&#91;coreindex&#93;&#93;.threadid, 0&#41;;
      activevec&#91;coreindex&#93; &#58;= -1;
      Dec&#40;activecount&#41;
    end; &#123; TaskDeactivate &#125;

  begin

    &#123; Initialize counts and moves &#125;

    myresult &#58;= 0;
    corecount &#58;= CalcCoreCount;
    activecount &#58;= 0;
    completedcount &#58;= 0;
    PosMetaGenCanonical&#40;pos, gms&#41;;

    &#123; Initialize the active core vector &#125;

    for coreindex &#58;= coremin to coremax do
      activevec&#91;coreindex&#93; &#58;= -1;

    &#123; Initialize the thread parameter blocks &#125;

    for moveindex &#58;= 0 to gms.movecount - 1 do
      begin
        PosExecute&#40;pos, gms.moves&#91;moveindex&#93;);
        with blocks&#91;moveindex&#93; do
          begin
            threadid &#58;= TThreadID&#40;0&#41;;
            ifen &#58;= PosEncode&#40;pos&#41;;
            draft &#58;= limit - 1;
            prior &#58;= MoveEncode&#40;gms.moves&#91;moveindex&#93;);
            mpcount &#58;= 0;
            started &#58;= False;
            completed &#58;= False
          end;
        PosRetract&#40;pos&#41;
      end;

    &#123; Cycle &#125;

    nextmoveindex &#58;= 0;
    while completedcount < gms.movecount do
      if &#40;nextmoveindex < gms.movecount&#41; and &#40;activecount < corecount&#41; then
        begin
          Dispatch&#40;nextmoveindex&#41;;
          Inc&#40;nextmoveindex&#41;
        end
      else
        begin
          coreindex &#58;= FindFirstCompleted;
          if coreindex >= 0 then
            begin
              Reclaim&#40;coreindex&#41;;
              Inc&#40;completedcount&#41;
            end
          else
            CatNap
        end;

    &#123; Scan/print results &#125;

    for moveindex &#58;= 0 to gms.movecount - 1 do
      with blocks&#91;moveindex&#93; do
        begin
          if printflag then
            WriteStrNL&#40;Output, prior + ' ' + EncodeUi64Type&#40;mpcount&#41;);
          Inc&#40;myresult, mpcount&#41;
        end;

    &#123; Assign final result and exit &#125;

    PosPerftTaskDriver &#58;= myresult
  end; &#123; PosPerftTaskDriver &#125;

Post Reply