How to add LP to CorChess?

Ozymandias
Posts: 1535
Joined: Sun Oct 25, 2009 2:30 am

Post by Ozymandias »

I asked I. Ivec about it, but he's at a loss.
Eelco de Groot
Posts: 4567
Joined: Sun Mar 12, 2006 2:40 am

Re: How to add LP to CorChess?

Post by Eelco de Groot »

I have no experience with Large Pages. Is it difficult because of changes in CorChess relative to Stockfish? If that is not the problem, then maybe somebody on the Rybka forum could give some advice, because several people there have made versions of Stockfish with large pages. I'm not sure, but I thought the BYO (Build Your Own) version is also built with LP. There is also a thread from correspondence chess player Dragon Mist about compiling Stockfish; it is not totally up to date anymore, but he started from the beginning, so it can serve as a tutorial: Compiling Large Pages into abrok dev BMI2 SF
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
Ozymandias
Posts: 1535
Joined: Sun Oct 25, 2009 2:30 am

Re: How to add LP to CorChess?

Post by Ozymandias »

Thanks for the reply. I already contacted Marco (Amos @ Rybkaforum), and he tells me that changes will need to be made to the part that deals with adding LP.

BYO is outdated; the newest SF development versions started to give errors months ago. The same applies to CorChess, which obviously introduces its own changes.
tpoppins
Posts: 919
Joined: Tue Nov 24, 2015 9:11 pm
Location: upstate

Re: How to add LP to CorChess?

Post by tpoppins »

BYO from last November works fine with the current SF dev. If you can't get it to work, it's probably a case of PEBCAK. A new version was released a month ago, and that one was reported to have problems; I haven't tried it, though.

The main problem with BYO is that it tries to operate like a black box; the other is that it uses code from other ppl (most importantly MZ) without giving any credit. MZ's LP code is available on Github. It took me about 10 minutes to adapt it to the current SF dev and I'm not even a programmer. POPCNT is the best I can do ATM, but you can use the following patch to make your own LP+NUMA compile. There are a couple of extras from my end:

- suppress display of fail-highs/lows (Dann Corbit's idea, IIRC, on by default)
- a "Use Syzygy" checkbox à la Komodo (checked by default)
- Syzygy probe limit defaulted to 5
- "x64" in the name to distinguish it from Master

numa.cpp and numa.h should be available from the Brainfish source. If you cannot get NUMA to work for some reason, just delete the thread.cpp section from the patch and "numa.o" from the Makefile section. You can still expect a few warnings from the compiler; they can be ignored.

opt.cpp follows the main patch.

Code:

Index: main.cpp
===================================================================
--- main.cpp	(revision 4636)
+++ main.cpp	(working copy)
@@ -32,10 +32,15 @@
   void init();
 }
 
+void SETUP_PRIVILEGES();
+void FREE_MEM(void *);
+
 int main(int argc, char* argv[]) {
 
   std::cout << engine_info() << std::endl;
-
+  #ifndef BENCH
+    SETUP_PRIVILEGES();
+  #endif
   UCI::init(Options);
   PSQT::init();
   Bitboards::init();
@@ -49,6 +54,11 @@
 
   UCI::loop(argc, argv);
 
+  if (large_use) {
+    FREE_MEM(TT.mem);
+    TT.mem = nullptr;
+  }
+
   Threads.exit();
   return 0;
 }
Index: Makefile
===================================================================
--- Makefile	(revision 4636)
+++ Makefile	(working copy)
@@ -39,7 +39,7 @@
 
 ### Object files
 OBJS = benchmark.o bitbase.o bitboard.o endgame.o evaluate.o main.o \
-	material.o misc.o movegen.o movepick.o pawns.o position.o psqt.o \
+	material.o misc.o movegen.o movepick.o numa.o opt.o pawns.o position.o psqt.o \
 	search.o thread.o timeman.o tt.o uci.o ucioption.o syzygy/tbprobe.o
 
 ### ==========================================================================
@@ -140,6 +140,7 @@
 ### 3.1 Selecting compiler (default = gcc)
 
 CXXFLAGS += -Wall -Wcast-qual -fno-exceptions -fno-rtti -std=c++11 $(EXTRACXXFLAGS)
+CXXFLAGS += -march=native
 DEPENDFLAGS += -std=c++11
 LDFLAGS += $(EXTRALDFLAGS)
 
Index: misc.cpp
===================================================================
--- misc.cpp	(revision 4636)
+++ misc.cpp	(working copy)
@@ -130,7 +130,7 @@
       ss << setw(2) << day << setw(2) << (1 + months.find(month) / 4) << year.substr(2);
   }
 
-  ss << (Is64Bit ? " 64" : "")
+  ss << (Is64Bit ? " x64" : "")
      << (HasPext ? " BMI2" : (HasPopCnt ? " POPCNT" : ""))
      << (to_uci  ? "\nid author ": " by ")
      << "T. Romstad, M. Costalba, J. Kiiski, G. Linscott";
Index: search.cpp
===================================================================
--- search.cpp	(revision 4636)
+++ search.cpp	(working copy)
@@ -414,7 +414,8 @@
               if (   mainThread
                   && multiPV == 1
                   && (bestValue <= alpha || bestValue >= beta)
-                  && Time.elapsed() > 3000)
+                  && Time.elapsed() > 3000
+				  && Options["Show Fail-highs and Fail-lows"])
                   sync_cout << UCI::pv(rootPos, rootDepth, alpha, beta) << sync_endl;
 
               // In case of failing low/high increase aspiration window and
@@ -648,7 +649,7 @@
     }
 
     // Step 4a. Tablebase probe
-    if (!rootNode && TB::Cardinality)
+    if (!rootNode && TB::Cardinality && Options["Use Syzygy"])
     {
         int piecesCount = pos.count<ALL_PIECES>();
 
@@ -850,8 +851,8 @@
 
       ss->moveCount = ++moveCount;
 
-      if (rootNode && thisThread == Threads.main() && Time.elapsed() > 3000)
-          sync_cout << "info depth " << depth / ONE_PLY
+	  if (rootNode && thisThread == Threads.main() && Time.elapsed() > 3000)
+		            sync_cout << "info depth " << depth / ONE_PLY
                     << " currmove " << UCI::move(move, pos.is_chess960())
                     << " currmovenumber " << moveCount + thisThread->PVIdx << sync_endl;
 
Index: thread.cpp
===================================================================
--- thread.cpp	(revision 4636)
+++ thread.cpp	(working copy)
@@ -22,6 +22,7 @@
 #include <cassert>
 
 #include "movegen.h"
+#include "numa.h"
 #include "search.h"
 #include "thread.h"
 #include "uci.h"
@@ -94,6 +95,8 @@
 
 void Thread::idle_loop() {
 
+  Numa::instance().bindThisThread(idx);
+
   WinProcGroup::bindThisThread(idx);
 
   while (!exit)
Index: tt.cpp
===================================================================
--- tt.cpp	(revision 4636)
+++ tt.cpp	(working copy)
@@ -26,7 +26,10 @@
 
 TranspositionTable TT; // Our global transposition table
 
+void CREATE_MEM2(void **,uint64_t);
+void FREE_MEM(void *);
 
+
 /// TranspositionTable::resize() sets the size of the transposition table,
 /// measured in megabytes. Transposition table consists of a power of 2 number
 /// of clusters and each cluster consists of ClusterSize number of TTEntry.
@@ -40,8 +43,18 @@
 
   clusterCount = newClusterCount;
 
+  mem = nullptr;
+  FREE_MEM(mem);
+  CREATE_MEM2(&mem, clusterCount * sizeof(Cluster));
+  large_use = true;
+
+  if (!mem)
+  {
+
   free(mem);
   mem = calloc(clusterCount * sizeof(Cluster) + CacheLineSize - 1, 1);
+    large_use = false;
+  }
 
   if (!mem)
   {
Index: tt.h
===================================================================
--- tt.h	(revision 4636)
+++ tt.h	(working copy)
@@ -83,6 +83,9 @@
 /// cache lines. This ensures best cache performance, as the cacheline is
 /// prefetched, as soon as possible.
 
+extern int large_use;
+void FREE_MEM (void *);
+
 class TranspositionTable {
 
   static const int CacheLineSize = 64;
@@ -96,7 +99,8 @@
   static_assert(CacheLineSize % sizeof(Cluster) == 0, "Cluster size incorrect");
 
 public:
- ~TranspositionTable() { free(mem); }
+  void* mem;
+ ~TranspositionTable() { large_use ? FREE_MEM (mem) : free(mem); }
   void new_search() { generation8 += 4; } // Lower 2 bits are used by Bound
   uint8_t generation() const { return generation8; }
   TTEntry* probe(const Key key, bool& found) const;
@@ -112,7 +116,6 @@
 private:
   size_t clusterCount;
   Cluster* table;
-  void* mem;
   uint8_t generation8; // Size must be not bigger than TTEntry::genBound8
 };
 
Index: ucioption.cpp
===================================================================
--- ucioption.cpp	(revision 4636)
+++ ucioption.cpp	(working copy)
@@ -70,10 +70,12 @@
   o["Slow Mover"]            << Option(89, 10, 1000);
   o["nodestime"]             << Option(0, 0, 10000);
   o["UCI_Chess960"]          << Option(false);
+  o["Use Syzygy"]            << Option(true);
   o["SyzygyPath"]            << Option("<empty>", on_tb_path);
   o["SyzygyProbeDepth"]      << Option(1, 1, 100);
   o["Syzygy50MoveRule"]      << Option(true);
-  o["SyzygyProbeLimit"]      << Option(6, 0, 6);
+  o["SyzygyProbeLimit"]      << Option(5, 0, 6);
+  o["Show Fail-highs and Fail-lows"] << Option(false);
 }
 
 
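For anyone adapting the tt.cpp hunk to another fork, the flow is: try a large-page allocation first, fall back to ordinary memory if that fails, and remember which allocator produced the block so that the matching release call is used later. Below is a minimal sketch of that flow, not MZ's code. It assumes Linux and uses mmap with MAP_HUGETLB instead of the shmget/VirtualAlloc calls in opt.cpp; the names HashBlock, alloc_hash, free_hash, round_up and kHugePage are hypothetical, and the 2 MiB huge page size is an assumption (check Hugepagesize in /proc/meminfo).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#if defined(__linux__)
#include <sys/mman.h>
#endif

// Hypothetical record of one hash allocation: it remembers which allocator
// produced the block so the matching release call is used later.
struct HashBlock {
    void*       ptr   = nullptr;
    std::size_t bytes = 0;      // size actually obtained
    bool        large = false;  // true when backed by huge pages
};

// ASSUMPTION: 2 MiB huge pages (the usual x86-64 default).
constexpr std::size_t kHugePage = std::size_t(2) << 20;

// Round n up to the next multiple of align.
std::size_t round_up(std::size_t n, std::size_t align) {
    assert(align != 0);
    return (n + align - 1) / align * align;
}

HashBlock alloc_hash(std::size_t bytes) {
    HashBlock b;
#if defined(__linux__) && defined(MAP_HUGETLB)
    // munmap() on a huge-page mapping needs a length that is a multiple of
    // the huge page size, so round up before mapping.
    const std::size_t len = round_up(bytes, kHugePage);
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        b.ptr = p;
        b.bytes = len;
        b.large = true;
        return b;
    }
#endif
    // Fallback: ordinary zeroed memory, released with free().
    b.ptr = std::calloc(bytes, 1);
    b.bytes = b.ptr ? bytes : 0;
    return b;
}

void free_hash(HashBlock& b) {
#if defined(__linux__)
    if (b.large) munmap(b.ptr, b.bytes);
    else         std::free(b.ptr);
#else
    std::free(b.ptr);
#endif
    b = HashBlock{};  // reset so a second call is harmless
}
```

A resize would then be `free_hash(block); block = alloc_hash(newSize);`, releasing the old block before its pointer is overwritten. If no huge pages are reserved on the machine, the mmap call fails and the calloc path is taken.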

opt.cpp

Code:

/*
  Stockfish, a UCI chess playing engine derived from Glaurung 2.1
  Copyright(C)2004-2008 Tord Romstad (Glaurung author)
  Copyright(C)2008-2014 Marco Costalba, Joona Kiiski, Tord Romstad

  Stockfish is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation,either version 3 of the License, or
  (at your option) any later version.

  Stockfish is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program. If not, see <http://www.gnu.org/licenses/>.
*/

#include <stdio.h>

#include "thread.h"
#include "tt.h"

using namespace std;

#define TRUE 1
#define FALSE 0

#define MEMALIGN(a, b, c) a = _aligned_malloc (c, b)
#define ALIGNED_FREE(x) _aligned_free (x)

int large_use;

#ifndef _WIN32 // Linux 
#include <sys/ipc.h>
#include <sys/shm.h>

static int num;

void SETUP_PRIVILEGES(){}

void CREATE_MEM(void** A,int align,uint64_t size)
{
	large_use=FALSE;

	num=shmget(IPC_PRIVATE,size,IPC_CREAT|SHM_R|SHM_W|SHM_HUGETLB);
	if(num==-1)
	{
		printf("info string LargePages FAILED %llu Mb\n",size>>20);
		MEMALIGN((*A),align,size);
	}
	else
	{
		(*A)=shmat(num,NULL,0x0);
		large_use=TRUE;
		printf("info string LargePages OK %llu Mb\n",size>>20);
		std::cout<<"info string HUGELTB "<<(size>>20)<<std::endl;
	}
}

void CREATE_MEM2(void** A,uint64_t size)
{
	large_use=FALSE;
	num=shmget(IPC_PRIVATE,size,IPC_CREAT|SHM_R|SHM_W|SHM_HUGETLB);
	if(num!=-1)
	{
		(*A)=shmat(num,NULL,0x0);
		large_use=TRUE;
		printf("info string %llu Mb LargePages\n",size>>20);
	}
}

void FREE_MEM(void* A)
{
	if(!A)
		return;
	if(!large_use)
	{
		ALIGNED_FREE(A);
		return;
	}
	shmdt(A);
	shmctl(num,IPC_RMID,NULL);
	large_use=FALSE;
}

#else

void CREATE_MEM(void** A,int align,uint64_t size)
{
	large_use=FALSE;
	(*A)=VirtualAlloc(NULL,size,MEM_LARGE_PAGES|MEM_COMMIT|MEM_RESERVE,PAGE_READWRITE);
	if((*A))
	{
		large_use=TRUE;
		printf("info string %llu Mb LargePages\n",size>>20);
	}
	else
	{
		printf("info string %llu Mb(no LargePages)\n",size>>20);
		MEMALIGN((*A),align,size);
	}
}

void CREATE_MEM2(void** A,uint64_t size)
{
	large_use=FALSE;
	(*A)=VirtualAlloc(NULL,size,MEM_LARGE_PAGES|MEM_COMMIT|MEM_RESERVE,PAGE_READWRITE);
	if((*A))
	{
		large_use=TRUE;
		printf("Large Pages enabled. Hash %llu Mb\n",size>>20);
	}
}

void FREE_MEM(void* A)
{
	if(!A)
		return;
	if(!large_use)
		ALIGNED_FREE(A);
	else
	{
		VirtualFree(A,0,MEM_RELEASE);
		large_use=FALSE;
	}
}

void SETUP_PRIVILEGES()
{
	HANDLE TH,PROC;
	TOKEN_PRIVILEGES tp;

	PROC=GetCurrentProcess();
	OpenProcessToken(PROC,TOKEN_ADJUST_PRIVILEGES|TOKEN_QUERY,&TH);
	LookupPrivilegeValue(NULL,TEXT("SeLockMemoryPrivilege"),&tp.Privileges[0].Luid);
	tp.PrivilegeCount=1;
	tp.Privileges[0].Attributes=SE_PRIVILEGE_ENABLED;
	AdjustTokenPrivileges(TH,FALSE,&tp,0,NULL,0);
	CloseHandle(TH);
}
#endif
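
The Windows SETUP_PRIVILEGES() above ignores every return value, so if the account does not hold the "Lock pages in memory" right, nothing is reported and the later VirtualAlloc call with MEM_LARGE_PAGES just fails. Here is a minimal sketch of the same steps with the results checked. The function name enable_lock_memory_privilege is hypothetical and not part of MZ's code, and the non-Windows stub that returns false is an assumption made only so the snippet compiles everywhere.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#endif

// Returns true only if SeLockMemoryPrivilege was actually enabled for the
// current process token. Hypothetical helper, shown for illustration.
bool enable_lock_memory_privilege() {
#ifdef _WIN32
    HANDLE token = nullptr;
    if (!OpenProcessToken(GetCurrentProcess(),
                          TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY, &token))
        return false;
    assert(token != nullptr);

    TOKEN_PRIVILEGES tp{};
    tp.PrivilegeCount = 1;
    tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;

    // AdjustTokenPrivileges() can return nonzero even when the privilege was
    // not granted; in that case GetLastError() is ERROR_NOT_ALL_ASSIGNED.
    bool ok = LookupPrivilegeValue(nullptr, SE_LOCK_MEMORY_NAME,
                                   &tp.Privileges[0].Luid) != 0
           && AdjustTokenPrivileges(token, FALSE, &tp, 0, nullptr, nullptr) != 0
           && GetLastError() == ERROR_SUCCESS;

    CloseHandle(token);
    return ok;
#else
    return false;  // stub: there is no such privilege to enable here
#endif
}
```

A caller could print an "info string" line when this returns false, which makes it obvious why the engine fell back to normal pages.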
Ozymandias
Posts: 1535
Joined: Sun Oct 25, 2009 2:30 am

Re: How to add LP to CorChess?

Post by Ozymandias »

tpoppins wrote:MZ's LP code is available on Github.
I didn't see it on Github, but I did on his forum; it is indeed easy to use.
Thx anyway.