QS code

Gerd Isenberg · Post by **Gerd Isenberg** » Mon Feb 18, 2008 10:49 am

Guetti wrote: Hm, but MVV/LVA is
score = value[captured_piece] - moving_piece;

PxQ > RxQ
meaning (980 - 1) > (980 - 4)
correct?

I see, if that is the "classical" MVV/LVA heuristic, it might be correct.
PxP = 99 > QxP = 96, yes better than nothing if pawn is defended (by other pawn).

hgm · Post by **hgm** » Mon Feb 18, 2008 11:00 am

Yes, this is exactly how uMax implements QS. Although it is a bit funny there, as the piece encodings are not in the usual order, and King comes between Knight and Bishop (to make the leapers contiguous in encoding space). So it tries capture with King a bit early. That doesn't really hurt, as capture with King is hardly ever possible. And when the piece was defended, the branch cuts off immediately: there never is a large subtree from such a move. (With a legal move generator only captures with King of undefended pieces would be generated. So a KxB always has a SEE value of +3, while a PxB might have a SEE value of +2...)

In uMax the MVV/LVA determination is implemented as the first IID iteration (QS being the second), where

score = pieceValue[victim] - movingPiece;

without search.

Guetti · Post by **Guetti** » Tue Feb 19, 2008 1:52 am

Tony wrote: This all should also give you a change to have alpha-beta "sink in". Once you understand it, it is difficult to understand that you didn't understood it, but till that time it's like hanging on to a train with your fingertips.

Tony

Alpha-beta still holds some mysteries to me. Here is a question for the cracks. I collect the PV like this in the lower part of the ab-search (sorry, no TT yet):

Code: Select all

                // Alpha-Beta 
		if (score > alpha)
		{
			if (score >= beta)
				return beta;
			alpha = score;
			foundPV = true;
	-->		pline.move[0] = c_move;
	-->		for (int j = 0; j < currentline.mvnum; j++)
	-->			pline.move[j+1] = currentline.move[j];
	-->		pline.mvnum = currentline.mvnum + 1;
				
		}

This works fine. However, when I search with an Aspiration-window and have to do a research with the full window I don't get a PV for the first search. In fact, the number of moves in the PV line is 0. I wonder why? Is it a bug in my PV collection or is there a reason for this?

Code: Select all

black(1): Searcher 0 is searching
Juhui, searching!
Trying 10s
 ply   time   score      nodes    pv
   1   0.00     -19        161    b7e4 c3e4 
   2   0.00     -19        804    b7e4 c3e4 h4h3 
   3   0.00     -39       4355    b7e4 c3e4 h4h3 e4d6 
   4   0.05     -39      32155    b7e4 c3e4 h4h3 e4d6 c8c6 
   5   0.16     -39     187368    b7e4 c3e4 h4h3 e4d6 c8c6 a1d1 
>> 6   0.25      11     287773 
   6   0.89      93    1478351    b7e4 c3e4 h4h2 e2h2 g4h2 f1b1 h2f3 e4d6 
   7   3.72      75    7491115    b7e4 c3e4 h4h2 e2h2 g4h2 h1h2 c8c2 h2h1 c2b2 e4d6 

PV: 10 b7e4 c3e4 h4h2 e2h2 g4h2 h1h2 c8c2 h2h1 c2b2 e4d6

gladius · Post by **gladius** » Tue Feb 19, 2008 6:20 am

Score is never going to be above alpha if your root node fails low, so you will never update the root PV. You can do something like this to fix that:

Code: Select all

int bestScore = minEval;

...

if (score > bestScore) {
 if (score >= beta) return beta;
 if (score > alpha) alpha = score;
 foundPV = true; 
 ...
}
return alpha;

This is a fail-hard implementation. Once you maintain a bestScore, you can also use fail-soft, where you return bestScore instead of alpha or beta.

Uri Blass · Post by **Uri Blass** » Tue Feb 19, 2008 7:51 am

Guetti wrote:
Tony wrote: This all should also give you a change to have alpha-beta "sink in". Once you understand it, it is difficult to understand that you didn't understood it, but till that time it's like hanging on to a train with your fingertips.

Tony
Alpha-beta still holds some mysteries to me. Here is a question for the cracks. I collect the PV like this in the lower part of the ab-search (sorry, no TT yet):
Code: Select all
                // Alpha-Beta 
		if (score > alpha)
		{
			if (score >= beta)
				return beta;
			alpha = score;
			foundPV = true;
	-->		pline.move[0] = c_move;
	-->		for (int j = 0; j < currentline.mvnum; j++)
	-->			pline.move[j+1] = currentline.move[j];
	-->		pline.mvnum = currentline.mvnum + 1;
				
		}
This works fine. However, when I search with an Aspiration-window and have to do a research with the full window I don't get a PV for the first search. In fact, the number of moves in the PV line is 0. I wonder why? Is it a bug in my PV collection or is there a reason for this?
Code: Select all
black(1): Searcher 0 is searching
Juhui, searching!
Trying 10s
 ply   time   score      nodes    pv
   1   0.00     -19        161    b7e4 c3e4 
   2   0.00     -19        804    b7e4 c3e4 h4h3 
   3   0.00     -39       4355    b7e4 c3e4 h4h3 e4d6 
   4   0.05     -39      32155    b7e4 c3e4 h4h3 e4d6 c8c6 
   5   0.16     -39     187368    b7e4 c3e4 h4h3 e4d6 c8c6 a1d1 
>> 6   0.25      11     287773 
   6   0.89      93    1478351    b7e4 c3e4 h4h2 e2h2 g4h2 f1b1 h2f3 e4d6 
   7   3.72      75    7491115    b7e4 c3e4 h4h2 e2h2 g4h2 h1h2 c8c2 h2h1 c2b2 e4d6 

PV: 10 b7e4 c3e4 h4h2 e2h2 g4h2 h1h2 c8c2 h2h1 c2b2 e4d6 

The reason is simple
The score is never bigger than alpha and not bigger than beta in the first search.

I start from pv[0][0]=root_moves->moves[0].move; so I always get a pv of at least one ply even if I fail high at root node.

If I fail high on non root move then I do a research and do not update the pv(it may be wrong fail high and I want to remember the old pv) but in this case movei print information that is not in the pv so the user can see the move that movei failed high on it.

Uri

Tony · Post by **Tony** » Tue Feb 19, 2008 8:10 am

Guetti wrote:
Tony wrote: This all should also give you a change to have alpha-beta "sink in". Once you understand it, it is difficult to understand that you didn't understood it, but till that time it's like hanging on to a train with your fingertips.

Tony
Alpha-beta still holds some mysteries to me. Here is a question for the cracks. I collect the PV like this in the lower part of the ab-search (sorry, no TT yet):
Code: Select all
                // Alpha-Beta 
		if (score > alpha)
		{
			if (score >= beta)
				return beta;
			alpha = score;
			foundPV = true;
	-->		pline.move[0] = c_move;
	-->		for (int j = 0; j < currentline.mvnum; j++)
	-->			pline.move[j+1] = currentline.move[j];
	-->		pline.mvnum = currentline.mvnum + 1;
				
		}
This works fine. However, when I search with an Aspiration-window and have to do a research with the full window I don't get a PV for the first search. In fact, the number of moves in the PV line is 0. I wonder why? Is it a bug in my PV collection or is there a reason for this?
Code: Select all
black(1): Searcher 0 is searching
Juhui, searching!
Trying 10s
 ply   time   score      nodes    pv
   1   0.00     -19        161    b7e4 c3e4 
   2   0.00     -19        804    b7e4 c3e4 h4h3 
   3   0.00     -39       4355    b7e4 c3e4 h4h3 e4d6 
   4   0.05     -39      32155    b7e4 c3e4 h4h3 e4d6 c8c6 
   5   0.16     -39     187368    b7e4 c3e4 h4h3 e4d6 c8c6 a1d1 
>> 6   0.25      11     287773 
   6   0.89      93    1478351    b7e4 c3e4 h4h2 e2h2 g4h2 f1b1 h2f3 e4d6 
   7   3.72      75    7491115    b7e4 c3e4 h4h2 e2h2 g4h2 h1h2 c8c2 h2h1 c2b2 e4d6 

PV: 10 b7e4 c3e4 h4h2 e2h2 g4h2 h1h2 c8c2 h2h1 c2b2 e4d6 

This is normal beheaviour. Add in Uri's tip and at least it will look OK.

The problem is that on a fail low (or fail high for that matter) you are running on the bounds (ie if I fail low (<=alpha) you fail high (>=beta))

To save time your code only collects a pv when alpha<score<beta so it will not do that in this case.

This is (partly) correct though, ie alpha-beta just knows all your moves suck (below expected score) so what would be the best moves ?

Well, the one that sucks the least, but it only knows that after a research.

Tony

Aleks Peshkov · Post by **Aleks Peshkov** » Tue Feb 19, 2008 3:24 pm

Sometime ago Fabien Letouzey posted a working way to collect PV from zero-window searches.

Guetti · Post by **Guetti** » Tue Feb 19, 2008 3:57 pm

Aleks Peshkov wrote:Sometime ago Fabien Letouzey posted a working way to collect PV from zero-window searches.

I guess Fabiens method doesn't work for me because I widen the aspiration window in the research to -INFINITY/+INFINITY.

Code: Select all

score = rootsearch(searcher, &rootmb, depth*DFRACT, pvline, alpha, beta);
		
// Aspriration search
#ifdef ASPWINDOW
if ((score <= alpha) || (score >= beta))
{
    // Show search properties of current depth
    printf(">>%2d %6.2f", depth, checktime());
    printf("%8d %10llu    ", score, sstats.nodes);
    for (int i = 0; i < pvline.mvnum; i++)
        movenot(pvline.move[i]);   // prints move
    printf("\n");				
    alpha = -MAXSCORE;
    beta = MAXSCORE;
    score = rootsearch(searcher, &rootmb, depth*DFRACT, pvline, alpha, beta);
}
alpha = score - ASPWINDOW;
beta = score + ASPWINDOW;
#endif

Harald · Post by **Harald** » Tue Feb 19, 2008 8:31 pm

hgm wrote:Yes, this is exactly how uMax implements QS. Although it is a bit funny there, as the piece encodings are not in the usual order, and King comes between Knight and Bishop (to make the leapers contiguous in encoding space). So it tries capture with King a bit early. That doesn't really hurt, as capture with King is hardly ever possible. And when the piece was defended, the branch cuts off immediately: there never is a large subtree from such a move. (With a legal move generator only captures with King of undefended pieces would be generated. So a KxB always has a SEE value of +3, while a PxB might have a SEE value of +2...)

In uMax the MVV/LVA determination is implemented as the first IID iteration (QS being the second), where

score = pieceValue[victim] - movingPiece;

without search.

In other engines with P = 100 and Q = 900 this could work:
score = pieceValue[victim] - pieceValue[movingPiece] / 10;
or
score = pieceValue[victim] - pieceValue[movingPiece] / 20;
This is independent from the order of the piece numbers.

Harald

Tony · Post by **Tony** » Wed Feb 20, 2008 8:06 am

Aleks Peshkov wrote:Sometime ago Fabien Letouzey posted a working way to collect PV from zero-window searches.

Doesn't work for alpha beta/pvs.

With MTD you do (at least) 2 searches. Search 1 would give you (at the last 2 iterations) fe the White moves (ie fail high moves). Search 2, with highered alpha-beta, would give you the Black moves (White fail lows,Blacks fail highs).

Various tricks (hashtable, killer moves etc) give a high(er) chance that the 2 lines searched are actually the same (because they could be different with the same score) most of the times.

But in non MTD you don't search for this turning point, so you don't want to do this second search with a zero-window.

Tony

PS Before the last 2 searches ( so during stabilization), you wouldn't get THE main line, but just A line that is responsible for the fail high/low. There is no garantue it's the best line. (for reasons mentioned before)

QS code

Re: QS code

Re: QS code

Re: QS code

Re: QS code

Re: QS code

Re: QS code

Re: QS code

Re: QS code

Re: QS code

Re: QS code