Speculative prefetch

petero2
Posts: 688
Joined: Mon Apr 19, 2010 7:07 pm
Location: Sweden
Full name: Peter Osterlund

Speculative prefetch

Post by petero2 »

A while ago I implemented speculative prefetch in texel. The idea is to issue a prefetch instruction for the next needed transposition table probe before the move legality test in the parent search node. The prefetch is speculative for two reasons. First, if the move legality test fails, the prefetch is useless. Second, the computation of the Zobrist hash key after the next move deliberately ignores special moves like castling, en passant and promotions, which means that the prefetch will load the wrong cache line for those moves.
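
To make the idea concrete, here is a minimal, self-contained sketch of the technique on a toy board model. It is not the texel or stockfish code: ToyPosition, pieceKey, sideKey, firstEntry, prefetchHint and speculativePrefetch are hypothetical names, the Zobrist numbers come from a splitmix64 mixer instead of a random table, and the prefetch hint assumes a GCC or Clang compiler (it compiles to nothing elsewhere).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model for illustration only; none of these names come from texel or stockfish.
using Key = std::uint64_t;
constexpr int kNoPiece = -1;  // hypothetical marker for an empty square

struct ToyPosition {
    int board[64];     // piece kind (0..11) on each square, or kNoPiece
    bool blackToMove;  // side to move
    Key key;           // Zobrist key of the current position
};

// splitmix64 mixer, a stand-in for a table of random Zobrist numbers.
inline Key mix(Key x) {
    x += 0x9E3779B97F4A7C15ULL;
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
    return x ^ (x >> 31);
}
inline Key pieceKey(int piece, int sq) { return mix(Key(piece) * 64 + Key(sq) + 1); }
inline Key sideKey() { return mix(0); }

// Reference key, recomputed from scratch (slow, used only for checking).
inline Key computeKey(const ToyPosition& pos) {
    Key k = pos.blackToMove ? sideKey() : 0;
    for (int sq = 0; sq < 64; ++sq)
        if (pos.board[sq] != kNoPiece)
            k ^= pieceKey(pos.board[sq], sq);
    return k;
}

// Cheap guess at the key after the move from -> to. Exact for quiet moves and
// plain captures; deliberately wrong for castling, en passant and promotions.
inline Key hashAfterMove(const ToyPosition& pos, int from, int to) {
    Key k = pos.key ^ sideKey();                 // flip side to move
    if (pos.board[to] != kNoPiece)
        k ^= pieceKey(pos.board[to], to);        // remove the captured piece
    k ^= pieceKey(pos.board[from], to);          // put the mover on its target square
    k ^= pieceKey(pos.board[from], from);        // take the mover off its origin square
    return k;
}

struct Entry { Key key; std::int32_t value; };

// Address of the table slot for a key; the table size must be a power of two.
inline const Entry* firstEntry(const std::vector<Entry>& tt, Key k) {
    assert(!tt.empty() && (tt.size() & (tt.size() - 1)) == 0);
    return &tt[static_cast<std::size_t>(k) & (tt.size() - 1)];
}

// A prefetch is only a hint: it never changes program results.
inline void prefetchHint(const void* addr) {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(addr);
#else
    (void)addr;  // no-op on compilers without this builtin
#endif
}

// Call this before the (comparatively slow) legality test of the move.
inline Key speculativePrefetch(const ToyPosition& pos, int from, int to,
                               const std::vector<Entry>& tt) {
    Key k = hashAfterMove(pos, from, to);
    prefetchHint(firstEntry(tt, k));
    return k;
}
```

In this sketch the guessed key equals the real key whenever the move is a quiet move or a plain capture, so by the time the child node probes the table the cache line has hopefully already arrived. For the special moves the hint simply points at the wrong slot, which costs a little memory bandwidth but cannot affect correctness.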

In texel I measured a +6 Elo increase at hyper bullet time control when using this optimization.

I decided to see if this optimization would work also in stockfish. Using the patch below, I measured a 2.8% speed increase for "stockfish bench". Before the change the average bench time was 3710ms +/- 5ms. After the change the average bench time was 3607ms +/- 7ms.
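
As a side note on the arithmetic, the small sketch below shows two common ways to turn a pair of bench times into a percentage. The function names are made up for illustration. With the times quoted above, the reduction in run time comes out at about 2.78%, and the equivalent gain in nodes per second (for a fixed node count) at about 2.86%, so the quoted 2.8% corresponds to the first definition.

```cpp
#include <cassert>
#include <cmath>

// Percent reduction in run time, relative to the baseline time.
inline double timeReductionPercent(double beforeMs, double afterMs) {
    assert(beforeMs > 0.0 && afterMs > 0.0);
    return 100.0 * (beforeMs - afterMs) / beforeMs;
}

// Percent gain in throughput (nodes per second), assuming both runs
// search exactly the same number of nodes.
inline double throughputGainPercent(double beforeMs, double afterMs) {
    assert(beforeMs > 0.0 && afterMs > 0.0);
    return 100.0 * (beforeMs / afterMs - 1.0);
}
```

For example, timeReductionPercent(3710, 3607) and throughputGainPercent(3710, 3607) give the two figures mentioned above. The two definitions differ only slightly for small speedups, but it helps to state which one is meant when comparing results from different machines.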

I have not played any games to verify this patch, but since it only affects speed it should also give an Elo increase.

My computer is an Intel Core i7 870 @ 2.93GHz and my compiler is gcc 4.7.2. It would be interesting to know if there is a speed increase also for other computer/compiler combinations.

Code: Select all

diff --git a/src/position.cpp b/src/position.cpp
index db4c857..ec59e8d 100644
--- a/src/position.cpp
+++ b/src/position.cpp
@@ -1267,3 +1267,16 @@ bool Position::pos_is_ok(int* step) const {
 
   return true;
 }
+
+Key Position::hash_after_move(Move m) const {
+  int from = from_sq(m);
+  int to = to_sq(m);
+  Piece p = board[from];
+  Piece capP = board[to];
+  Key ret = st->key ^ Zobrist::side;
+  if (capP != NO_PIECE)
+    ret ^= Zobrist::psq[color_of(capP)][type_of(capP)][to];
+  ret ^= Zobrist::psq[color_of(p)][type_of(p)][to];
+  ret ^= Zobrist::psq[color_of(p)][type_of(p)][from];
+  return ret;
+}
diff --git a/src/position.h b/src/position.h
index 8f5fb75..7565abb 100644
--- a/src/position.h
+++ b/src/position.h
@@ -139,6 +139,7 @@ public:
   void undo_move(Move m);
   void do_null_move(StateInfo& st);
   void undo_null_move();
+  Key hash_after_move(Move m) const;
 
   // Static exchange evaluation
   Value see(Move m) const;
diff --git a/src/search.cpp b/src/search.cpp
index 6215b08..7cabff6 100644
--- a/src/search.cpp
+++ b/src/search.cpp
@@ -796,6 +796,8 @@ moves_loop: // When in check and at SpNode search starts from here
           }
       }
 
+      prefetch((char*)TT.first_entry(pos.hash_after_move(move)));
+
       // Check for legality just before making the move
       if (!RootNode && !SpNode && !pos.legal(move, ci.pinned))
       {
@@ -1145,6 +1147,8 @@ moves_loop: // When in check and at SpNode search starts from here
           &&  pos.see_sign(move) < VALUE_ZERO)
           continue;
 
+      prefetch((char*)TT.first_entry(pos.hash_after_move(move)));
+
       // Check for legality just before making the move
       if (!pos.legal(move, ci.pinned))
           continue;
Joerg Oster
Posts: 937
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany

Re: Speculative prefetch

Post by Joerg Oster »

Hi Peter,

thank you for sharing.
On my system, AMD Bulldozer FX-8120, gcc 4.8.2, Linux Mint 17, I only get a very tiny speedup.

Code: Select all

sudo nice -n -20 taskset -c 0 ./stockfish bench 32 1 17 >/dev/null

SF default
===========================
Total time (ms) : 29620
Nodes searched  : 48273675
Nodes/second    : 1629766


===========================
Total time (ms) : 29655
Nodes searched  : 48273675
Nodes/second    : 1627842


===========================
Total time (ms) : 29527                   ø 29601
Nodes searched  : 48273675
Nodes/second    : 1634899



SF prefetch
===========================
Total time (ms) : 29413
Nodes searched  : 48273675
Nodes/second    : 1641236


===========================
Total time (ms) : 29419
Nodes searched  : 48273675
Nodes/second    : 1640901


===========================
Total time (ms) : 29428                   ø 29420    +0.6%
Nodes searched  : 48273675
Nodes/second    : 1640399
Jörg Oster
flok

Re: Speculative prefetch

Post by flok »

Hi,
Joerg Oster wrote:

Code: Select all

sudo nice -n -20 taskset -c 0
I see that you used "nice -n -20" for invoking stockfish. Have you measured how much that helps?
Joerg Oster
Posts: 937
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany

Re: Speculative prefetch

Post by Joerg Oster »

I have just pushed a test in fishtest to see how it does. I hope you don't mind. :)
Jörg Oster
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Speculative prefetch

Post by lucasart »

Joerg Oster wrote: I have just pushed a test in fishtest to see how it does. I hope you don't mind. :)
Shouldn't you remove the existing prefetch then? You're basically cumulating the prefetch before and after the legal move check. I wonder if there's any cost in redoing a prefetch.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
Joerg Oster
Posts: 937
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany

Re: Speculative prefetch

Post by Joerg Oster »

lucasart wrote:
Joerg Oster wrote: I have just pushed a test in fishtest to see how it does. I hope you don't mind. :)
Shouldn't you remove the existing prefetch then? You're basically cumulating the prefetch before and after the legal move check. I wonder if there's any cost in redoing a prefetch.
I understood this to be a speculative prefetch in addition to the existing one in pos.do_move ...
Peter may correct me if I'm wrong.
Jörg Oster
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Speculative prefetch

Post by bob »

flok wrote:Hi,
Joerg Oster wrote:

Code: Select all

sudo nice -n -20 taskset -c 0
I see that you used "nice -n -20" for invoking stockfish. Have you measured how much that helps?
Unless you are running other things, it helps zero. In fact, if maximum performance matters, it would be better to simply suspend whatever else is running by sending those processes a SIGSTOP for the duration of the test. nice -20 doesn't guarantee you 100% of the CPU. It just raises your quantum to something very large, so that you get a large percentage (maybe 95% or so at -20, 5% or so at +19).
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Speculative prefetch

Post by bob »

lucasart wrote:
Joerg Oster wrote: I have just pushed a test in fishtest to see how it does. I hope you don't mind. :)
Shouldn't you remove the existing prefetch then? You're basically cumulating the prefetch before and after the legal move check. I wonder if there's any cost in redoing a prefetch.
There is none if it is to the same address, with one exception. If you prefetch X, then access something else that evicts X from the cache, then prefetch X again, the first prefetch was wasted and only added overhead.
Joerg Oster
Posts: 937
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany

Re: Speculative prefetch

Post by Joerg Oster »

On my Intel i5-4570@3.20GHz I measure a speed improvement of about 2.2%. Nice.
Jörg Oster
Joerg Oster
Posts: 937
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany

Re: Speculative prefetch

Post by Joerg Oster »

flok wrote:Hi,
Joerg Oster wrote:

Code: Select all

sudo nice -n -20 taskset -c 0
I see that you used "nice -n -20" for invoking stockfish. Have you measured how much that helps?
No.
Jörg Oster