Question on standard implementation of PVS+NWS

Discussion of chess software programming and technical issues.

Moderators: hgm, Harvey Williamson, bob

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
rob
Posts: 26
Joined: Sun Jul 20, 2014 1:26 am

Question on standard implementation of PVS+NWS

Post by rob » Thu Mar 19, 2015 4:03 am

I have a question about implementing the "standard" PVS+NWS algorithm that appears in a variety of papers on chess ("Multi-Cut alphabeta-Pruning in Game-Tree Search", Bjornsson, Marsland, for example, where it appears as Algorithm 3). The algorithm is presented like this:

Code: Select all

function PVS(n,d,alpha,beta)
S = Successors(n)
if d <= 0 OR S == &#123;&#125; then
   return f&#40;n&#41;
best = -PVS&#40;n1 an element of S, d-1, -beta, -alpha&#41;
for ni an element of S such that i > 1 do
   if best >= beta then
      return best
   alpha = max&#40;alpha, best&#41;
   v = -NWS&#40;ni, d-1, -alpha&#41;
   if v > alpha AND v < beta then
      v = -PVS&#40;ni, d-1, -beta, -v&#41;
   if v > best then
      best = v
return best

function NWS&#40;n,d,beta&#41;
S = Successors&#40;n&#41;
if d <= 0 OR S = &#123;&#125; then
   return f&#40;n&#41;
best = -INFINITY
for all ni an element of S do
   v = -NWS&#40;ni, d-1, -beta + epsilon&#41;
   if v > best then
      best = v
      if best >= beta then
         return best
return best
In other words, get the minimax value for the first move, then try to prove that all other possible moves are inferior to the first by trying (initially) a null-window search. If the null window search returns that the searched move could be part of the PV (NWS returns a value larger than alpha but less than beta) then search again with a full window so we can get the true minimax value of the move.

I type in the above algorithm (well, actually, I use a slightly modified version, which appears here on the chess programming wiki) and run it a few thousand times to test it, and I see that it always returns the correct minimax score for any given position and collects either an identical PV to a standard AB implementation, or a different PV but with the exact same minimax score.

My concern is the time required to run the algorithm. I have a chess engine that I have written which contains many, many variants of each of the popular search algorithms. For example, I can run AB and then run AB + iterative deepening + insert the PV of the previous iteration as the first variation to try + killer moves + move sorting (winning captures first, etc).

When I run these on the opening chess position, I see this:

(My code for negamax AB, capture the PV, use iterative deepening, do move sorting, collect and use killer moves)

Code: Select all

best_move_nabcpv_id_ms_killer&#40;...) called with depth = 6
   About to start iterative deepening...
      Depth = 1, Number of evaluations&#58; 20
Depth = 1
   PV length = 1
   PV = 
	Move&#58; WKnight	b1 -> c3
      Depth = 2, Number of evaluations&#58; 48
Depth = 2
   PV length = 2
   PV = 
	Move&#58; WKnight	b1 -> c3
	Move&#58; BKnight	b8 -> c6
      Depth = 3, Number of evaluations&#58; 707
Depth = 3
   PV length = 3
   PV = 
	Move&#58; WKnight	b1 -> c3
	Move&#58; BKnight	b8 -> c6
	Move&#58; WKnight	g1 -> f3
      Depth = 4, Number of evaluations&#58; 1,902
Depth = 4
   PV length = 4
   PV = 
	Move&#58; WKnight	b1 -> c3
	Move&#58; BKnight	b8 -> c6
	Move&#58; WKnight	g1 -> f3
	Move&#58; BKnight	g8 -> f6
      Depth = 5, Number of evaluations&#58; 53,625
Depth = 5
   PV length = 5
   PV = 
	Move&#58; WPawn	e2 -> e4
	Move&#58; BKnight	b8 -> c6
	Move&#58; WBishop	f1 -> b5
	Move&#58; BKnight	c6 -> d4
	Move&#58; WBishop	b5 -> d7
      Depth = 6, Number of evaluations&#58; 677,091
Depth = 6
   PV length = 6
   PV = 
	Move&#58; WPawn	e2 -> e3
	Move&#58; BPawn	e7 -> e5
	Move&#58; WKnight	g1 -> e2
	Move&#58; BBishop	f8 -> b4
	Move&#58; WPawn	e3 -> e4
	Move&#58; BBishop	b4 -> d2
   Number of evaluations&#58; 733,393
   Best move&#58; WPawn	e2 -> e3
   Score&#58; -945.000
   Elapsed time&#58; 4.006 seconds
   Evaluations per second&#58; 183,066
Similarly, for PVS with null window, capturing the PV, using move sorting, and killer moves, I see (note: no iterative deepening):

Code: Select all

best_move_npvs_nws_cpv_ms_killer&#40;...) called with depth = 6
   About to start iterative deepening...
      Depth = 1, Number of evaluations&#58; 1,919,912
Depth = 1
   PV length = 6
   PV = 
	Move&#58; WPawn	e2 -> e3
	Move&#58; BPawn	e7 -> e5
	Move&#58; WKnight	g1 -> e2
	Move&#58; BBishop	f8 -> b4
	Move&#58; WPawn	e3 -> e4
	Move&#58; BBishop	b4 -> d2
   Number of evaluations&#58; 1,919,912
   Best move&#58; WPawn	e2 -> e3
   Score&#58; -945.000
   Elapsed time&#58; 10.406 seconds
   Evaluations per second&#58; 184,497
When I run my code for "standard" PVS+NWS, I see that it performs 3 times as many evaluations as AB:

Code: Select all

best_move_npvs_nws_cpv_id_ms_killer&#40;...) called with depth = 6
   About to start iterative deepening...
      Depth = 1, Number of evaluations&#58; 23
      Number of unique evaluations&#58; Searched 20 unique positions
      FWS evaluations&#58; 4
      ZWS evaluations&#58; 19
Depth = 1
   PV length = 1
   PV = 
	Move&#58; WKnight	b1 -> c3
      Depth = 2, Number of evaluations&#58; 97
      Number of unique evaluations&#58; Searched 94 unique positions
      FWS evaluations&#58; 4
      ZWS evaluations&#58; 93
Depth = 2
   PV length = 2
   PV = 
	Move&#58; WKnight	b1 -> c3
	Move&#58; BKnight	b8 -> c6
      Depth = 3, Number of evaluations&#58; 907
      Number of unique evaluations&#58; Searched 904 unique positions
      FWS evaluations&#58; 4
      ZWS evaluations&#58; 903
Depth = 3
   PV length = 3
   PV = 
	Move&#58; WKnight	b1 -> c3
	Move&#58; BKnight	b8 -> c6
	Move&#58; WKnight	g1 -> f3
      Depth = 4, Number of evaluations&#58; 4,954
      Number of unique evaluations&#58; Searched 4266 unique positions
      FWS evaluations&#58; 4
      ZWS evaluations&#58; 4950
Depth = 4
   PV length = 4
   PV = 
	Move&#58; WKnight	b1 -> c3
	Move&#58; BKnight	b8 -> c6
	Move&#58; WKnight	g1 -> f3
	Move&#58; BKnight	g8 -> f6
      Depth = 5, Number of evaluations&#58; 298,875
      Number of unique evaluations&#58; Searched 159,718 unique positions
      FWS evaluations&#58; 5254
      ZWS evaluations&#58; 293621
Depth = 5
   PV length = 5
   PV = 
	Move&#58; WPawn	e2 -> e4
	Move&#58; BKnight	b8 -> c6
	Move&#58; WBishop	f1 -> b5
	Move&#58; BKnight	c6 -> d4
	Move&#58; WBishop	b5 -> d7
      Depth = 6, Number of evaluations&#58; 2,528,551
      FWS evaluations&#58; 43153
      ZWS evaluations&#58; 2485398
Depth = 6
   PV length = 6
   PV = 
	Move&#58; WPawn	e2 -> e3
	Move&#58; BPawn	e7 -> e5
	Move&#58; WKnight	g1 -> e2
	Move&#58; BBishop	f8 -> b4
	Move&#58; WPawn	e3 -> e4
	Move&#58; BBishop	b4 -> d2
   Number of evaluations&#58; 2,833,407
   Best move&#58; WPawn	e2 -> e3
   Score&#58; -945.000
   Elapsed time&#58; 15.060 seconds
   Evaluations per second&#58; 188,141
The main speed problem, as can be seen, is that the NWS (ZWS) is performing a tremendous number of evaluations, so that this "canonical" implementation is actually much, much slower than alpha-beta. Note that there is not too much duplicate searching of the same position. For example, in the search to depth 5, the evaluation function is called 298,875 times and the number of unique positions passed into the evaluation function is 159,718 positions. The same position is searched about twice on average. Note that I collect unique positions on a different run of the search because collecting and comparing positions for uniqueness on 300,000 positions is time consuming.

My question is, is there anything I can do in the NWS to reduce the number of evaluations? I know I can add a null move heuristic, I can further add a forward pruning mechanism such as multi-cut, and I can further add transposition tables to speed up searches of positions that have been previously evaluated using the evaluation function.

But can I perform move ordering inside the NWS to any benefit? It seems that, by definition, I cannot use the PV of the previous iteration inside the NWS, since the PV is searched in the full-width recursive call to PVS. I can clearly do some sort of move sorting (winning captures first, for example). Perhaps I can also do killer moves (I haven't looked at doing killer moves inside the NWS carefully as yet, however). Perhaps I could also do some type of iterative deepening (is this what is meant by "internal" iterative deepening?).

Is there some other move ordering option or something else outside of null-move, multi-cut and/or transposition tables that could be added to the NWS that would improve execution time significantly? I'm also aware of Extended Futility Pruning but haven't even started reading that paper yet.

Thank you very kindly,

Rob

Rein Halbersma
Posts: 679
Joined: Tue May 22, 2007 9:13 am

Re: Question on standard implementation of PVS+NWS

Post by Rein Halbersma » Thu Mar 19, 2015 6:45 am

Add a transposition table, aka a hash table. Otherwise the re-searches of internal deepening, pvs, null move, lmr and other enhancements over plain alpha-beta will indeed kill performance.

rob
Posts: 26
Joined: Sun Jul 20, 2014 1:26 am

Re: Question on standard implementation of PVS+NWS

Post by rob » Thu Mar 19, 2015 7:33 am

As you can see from the data in my post, adding a transposition table would, on average, and very roughly, double the performance. But performance would need to quadruple to match what I see in my AB implementation.

This leads me to believe that move ordering is very important in the zero window search of standard PVS+ZWS. The chess programming literature mentions that a proper implementation of PVS is about 10% faster than its AB counterpart.

Rob

Sven
Posts: 3743
Joined: Thu May 15, 2008 7:57 pm
Location: Berlin, Germany
Full name: Sven Schüle
Contact:

Re: Question on standard implementation of PVS+NWS

Post by Sven » Thu Mar 19, 2015 8:15 am

rob wrote:As you can see from the data in my post, adding a transposition table would, on average, and very roughly, double the performance. But performance would need to quadruple to match what I see in my AB implementation.

This leads me to believe that move ordering is very important in the zero window search of standard PVS+ZWS. The chess programming literature mentions that a proper implementation of PVS is about 10% faster than its AB counterpart.

Rob
The improvement of PVS over plain AB will be less without iterative deepening, without TT, and of course without proper move ordering. Also you need move ordering in NWS (ZWS) as well.

kinderchocolate
Posts: 406
Joined: Mon Nov 01, 2010 5:55 am
Full name: Ted Wong
Contact:

Re: Question on standard implementation of PVS+NWS

Post by kinderchocolate » Thu Mar 19, 2015 9:04 am

In chess programming, the little components are not independent. In an academic paper, it's assumed that improvement can be achieved in a perfect environment, so you'll have to be a little careful.

I don't know what ratio of improvement that you should expect, and I can't remember how much I got when I did it. But as long as you get something like a two-fold, you should be happy because at least you're doing the right thing. You might need to check other areas such as move ordering.

brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 3:02 pm
Contact:

Re: Question on standard implementation of PVS+NWS

Post by brtzsnr » Thu Mar 19, 2015 11:20 am

rob wrote: This leads me to believe that move ordering is very important in the zero window search of standard PVS+ZWS. The chess programming literature mentions that a proper implementation of PVS is about 10% faster than its AB counterpart.
Transposition tables improve move ordering and for PVS/LMR they will help even more because same trees will be researched in case of a fail-high on the null-window.

rob
Posts: 26
Joined: Sun Jul 20, 2014 1:26 am

Re: Question on standard implementation of PVS+NWS

Post by rob » Thu Mar 19, 2015 9:04 pm

What move ordering is most effective in the ZWS?

Can iterative deepening be applied in the ZWS?

Rob

kbhearn
Posts: 411
Joined: Thu Dec 30, 2010 3:48 am

Re: Question on standard implementation of PVS+NWS

Post by kbhearn » Thu Mar 19, 2015 9:30 pm

When doing PVS you're looking at a tradeoff of saving a little bit of nodes through extra cutoffs if you're right that the current PV is better than the move you're scouting compared to doing strictly more work than plain alpha beta if you're wrong. Accordingly the important thing is your first PV move must be better than the vast majority of moves you scout. The most reliable way to reach that is to use a TT and the first move chosen is the best move from last iteration. Furthermore when you are wrong the bounds and cut moves obtained from the scout search will be stored in the TT and at least somewhat be able to be reused in the open window research.

rob
Posts: 26
Joined: Sun Jul 20, 2014 1:26 am

Re: Question on standard implementation of PVS+NWS

Post by rob » Thu Mar 19, 2015 11:40 pm

kbhearn wrote:Accordingly the important thing is your first PV move must be better than the vast majority of moves you scout.
This is certainly true of the PVS search. But what kind of move ordering can I do in the ZWS? The vast majority of board evaluations originate from the ZWS and, at the moment, I'm not doing any move ordering, killer moves, or iterative deepening in the ZWS.

Can anyone describe how they implemented move ordering in the ZWS? I simply don't know a priori what would be most effective. Another issue is can iterative deepening be used in the ZWS?

Thanks,

Rob

AlvaroBegue
Posts: 912
Joined: Tue Mar 09, 2010 2:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Question on standard implementation of PVS+NWS

Post by AlvaroBegue » Fri Mar 20, 2015 12:36 am

You can use the standard set of tricks:
* Hash move
* Non-losing captures (according to SEE), in MVV/LVA order.
* Killer moves
* Quiet moves in history-heuristic order.
* Losing captures.

Post Reply