QS code

Discussion of chess software programming and technical issues.

Moderator: Ras

Gerd Isenberg
Posts: 2251
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: QS code

Post by Gerd Isenberg »

Guetti wrote: Hm, but MVV/LVA is
score = value[captured_piece] - moving_piece;

PxQ > RxQ
meaning (980 - 1) > (980 - 4)
correct?
I see, if that is the "classical" MVV/LVA heuristic, it might be correct.
PxP = 99 > QxP = 96, yes better than nothing if pawn is defended (by other pawn).
User avatar
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: QS code

Post by hgm »

Yes, this is exactly how uMax implements QS. Although it is a bit funny there, as the piece encodings are not in the usual order, and King comes between Knight and Bishop (to make the leapers contiguous in encoding space). So it tries capture with King a bit early. That doesn't really hurt, as capture with King is hardly ever possible. And when the piece was defended, the branch cuts off immediately: there never is a large subtree from such a move. (With a legal move generator only captures with King of undefended pieces would be generated. So a KxB always has a SEE value of +3, while a PxB might have a SEE value of +2...)

In uMax the MVV/LVA determination is implemented as the first IID iteration (QS being the second), where

score = pieceValue[victim] - movingPiece;

without search.
Guetti

Re: QS code

Post by Guetti »

Tony wrote: This all should also give you a change to have alpha-beta "sink in". Once you understand it, it is difficult to understand that you didn't understood it, but till that time it's like hanging on to a train with your fingertips.

Tony
Alpha-beta still holds some mysteries to me. Here is a question for the cracks. I collect the PV like this in the lower part of the ab-search (sorry, no TT yet):

Code: Select all

                // Alpha-Beta 
		if (score > alpha)
		{
			if (score >= beta)
				return beta;
			alpha = score;
			foundPV = true;
	-->		pline.move[0] = c_move;
	-->		for (int j = 0; j < currentline.mvnum; j++)
	-->			pline.move[j+1] = currentline.move[j];
	-->		pline.mvnum = currentline.mvnum + 1;
				
		}

This works fine. However, when I search with an Aspiration-window and have to do a research with the full window I don't get a PV for the first search. In fact, the number of moves in the PV line is 0. I wonder why? Is it a bug in my PV collection or is there a reason for this?


Code: Select all

black(1): Searcher 0 is searching
Juhui, searching!
Trying 10s
 ply   time   score      nodes    pv
   1   0.00     -19        161    b7e4 c3e4 
   2   0.00     -19        804    b7e4 c3e4 h4h3 
   3   0.00     -39       4355    b7e4 c3e4 h4h3 e4d6 
   4   0.05     -39      32155    b7e4 c3e4 h4h3 e4d6 c8c6 
   5   0.16     -39     187368    b7e4 c3e4 h4h3 e4d6 c8c6 a1d1 
>> 6   0.25      11     287773 
   6   0.89      93    1478351    b7e4 c3e4 h4h2 e2h2 g4h2 f1b1 h2f3 e4d6 
   7   3.72      75    7491115    b7e4 c3e4 h4h2 e2h2 g4h2 h1h2 c8c2 h2h1 c2b2 e4d6 

PV: 10 b7e4 c3e4 h4h2 e2h2 g4h2 h1h2 c8c2 h2h1 c2b2 e4d6 
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: QS code

Post by gladius »

Score is never going to be above alpha if your root node fails low, so you will never update the root PV. You can do something like this to fix that:

Code: Select all

int bestScore = minEval;

...

if (score > bestScore) {
 if (score >= beta) return beta;
 if (score > alpha) alpha = score;
 foundPV = true; 
 ...
}
return alpha;
This is a fail-hard implementation. Once you maintain a bestScore, you can also use fail-soft, where you return bestScore instead of alpha or beta.
Uri Blass
Posts: 10792
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: QS code

Post by Uri Blass »

Guetti wrote:
Tony wrote: This all should also give you a change to have alpha-beta "sink in". Once you understand it, it is difficult to understand that you didn't understood it, but till that time it's like hanging on to a train with your fingertips.

Tony
Alpha-beta still holds some mysteries to me. Here is a question for the cracks. I collect the PV like this in the lower part of the ab-search (sorry, no TT yet):

Code: Select all

                // Alpha-Beta 
		if (score > alpha)
		{
			if (score >= beta)
				return beta;
			alpha = score;
			foundPV = true;
	-->		pline.move[0] = c_move;
	-->		for (int j = 0; j < currentline.mvnum; j++)
	-->			pline.move[j+1] = currentline.move[j];
	-->		pline.mvnum = currentline.mvnum + 1;
				
		}

This works fine. However, when I search with an Aspiration-window and have to do a research with the full window I don't get a PV for the first search. In fact, the number of moves in the PV line is 0. I wonder why? Is it a bug in my PV collection or is there a reason for this?

Code: Select all

black(1): Searcher 0 is searching
Juhui, searching!
Trying 10s
 ply   time   score      nodes    pv
   1   0.00     -19        161    b7e4 c3e4 
   2   0.00     -19        804    b7e4 c3e4 h4h3 
   3   0.00     -39       4355    b7e4 c3e4 h4h3 e4d6 
   4   0.05     -39      32155    b7e4 c3e4 h4h3 e4d6 c8c6 
   5   0.16     -39     187368    b7e4 c3e4 h4h3 e4d6 c8c6 a1d1 
>> 6   0.25      11     287773 
   6   0.89      93    1478351    b7e4 c3e4 h4h2 e2h2 g4h2 f1b1 h2f3 e4d6 
   7   3.72      75    7491115    b7e4 c3e4 h4h2 e2h2 g4h2 h1h2 c8c2 h2h1 c2b2 e4d6 

PV: 10 b7e4 c3e4 h4h2 e2h2 g4h2 h1h2 c8c2 h2h1 c2b2 e4d6 
The reason is simple
The score is never bigger than alpha and not bigger than beta in the first search.

I start from pv[0][0]=root_moves->moves[0].move; so I always get a pv of at least one ply even if I fail high at root node.

If I fail high on non root move then I do a research and do not update the pv(it may be wrong fail high and I want to remember the old pv) but in this case movei print information that is not in the pv so the user can see the move that movei failed high on it.

Uri
Tony

Re: QS code

Post by Tony »

Guetti wrote:
Tony wrote: This all should also give you a change to have alpha-beta "sink in". Once you understand it, it is difficult to understand that you didn't understood it, but till that time it's like hanging on to a train with your fingertips.

Tony
Alpha-beta still holds some mysteries to me. Here is a question for the cracks. I collect the PV like this in the lower part of the ab-search (sorry, no TT yet):

Code: Select all

                // Alpha-Beta 
		if (score > alpha)
		{
			if (score >= beta)
				return beta;
			alpha = score;
			foundPV = true;
	-->		pline.move[0] = c_move;
	-->		for (int j = 0; j < currentline.mvnum; j++)
	-->			pline.move[j+1] = currentline.move[j];
	-->		pline.mvnum = currentline.mvnum + 1;
				
		}

This works fine. However, when I search with an Aspiration-window and have to do a research with the full window I don't get a PV for the first search. In fact, the number of moves in the PV line is 0. I wonder why? Is it a bug in my PV collection or is there a reason for this?


Code: Select all

black(1): Searcher 0 is searching
Juhui, searching!
Trying 10s
 ply   time   score      nodes    pv
   1   0.00     -19        161    b7e4 c3e4 
   2   0.00     -19        804    b7e4 c3e4 h4h3 
   3   0.00     -39       4355    b7e4 c3e4 h4h3 e4d6 
   4   0.05     -39      32155    b7e4 c3e4 h4h3 e4d6 c8c6 
   5   0.16     -39     187368    b7e4 c3e4 h4h3 e4d6 c8c6 a1d1 
>> 6   0.25      11     287773 
   6   0.89      93    1478351    b7e4 c3e4 h4h2 e2h2 g4h2 f1b1 h2f3 e4d6 
   7   3.72      75    7491115    b7e4 c3e4 h4h2 e2h2 g4h2 h1h2 c8c2 h2h1 c2b2 e4d6 

PV: 10 b7e4 c3e4 h4h2 e2h2 g4h2 h1h2 c8c2 h2h1 c2b2 e4d6 
This is normal beheaviour. Add in Uri's tip and at least it will look OK.

The problem is that on a fail low (or fail high for that matter) you are running on the bounds (ie if I fail low (<=alpha) you fail high (>=beta))

To save time your code only collects a pv when alpha<score<beta so it will not do that in this case.

This is (partly) correct though, ie alpha-beta just knows all your moves suck (below expected score) so what would be the best moves ?

Well, the one that sucks the least, but it only knows that after a research.

Tony
Aleks Peshkov
Posts: 892
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: QS code

Post by Aleks Peshkov »

Sometime ago Fabien Letouzey posted a working way to collect PV from zero-window searches.
Guetti

Re: QS code

Post by Guetti »

Aleks Peshkov wrote:Sometime ago Fabien Letouzey posted a working way to collect PV from zero-window searches.
I guess Fabiens method doesn't work for me because I widen the aspiration window in the research to -INFINITY/+INFINITY.

Code: Select all

score = rootsearch(searcher, &rootmb, depth*DFRACT, pvline, alpha, beta);
		
// Aspriration search
#ifdef ASPWINDOW
if ((score <= alpha) || (score >= beta))
{
    // Show search properties of current depth
    printf(">>%2d %6.2f", depth, checktime());
    printf("%8d %10llu    ", score, sstats.nodes);
    for (int i = 0; i < pvline.mvnum; i++)
        movenot(pvline.move[i]);   // prints move
    printf("\n");				
    alpha = -MAXSCORE;
    beta = MAXSCORE;
    score = rootsearch(searcher, &rootmb, depth*DFRACT, pvline, alpha, beta);
}
alpha = score - ASPWINDOW;
beta = score + ASPWINDOW;
#endif
Harald
Posts: 318
Joined: Thu Mar 09, 2006 1:07 am

Re: QS code

Post by Harald »

hgm wrote:Yes, this is exactly how uMax implements QS. Although it is a bit funny there, as the piece encodings are not in the usual order, and King comes between Knight and Bishop (to make the leapers contiguous in encoding space). So it tries capture with King a bit early. That doesn't really hurt, as capture with King is hardly ever possible. And when the piece was defended, the branch cuts off immediately: there never is a large subtree from such a move. (With a legal move generator only captures with King of undefended pieces would be generated. So a KxB always has a SEE value of +3, while a PxB might have a SEE value of +2...)

In uMax the MVV/LVA determination is implemented as the first IID iteration (QS being the second), where

score = pieceValue[victim] - movingPiece;

without search.
In other engines with P = 100 and Q = 900 this could work:
score = pieceValue[victim] - pieceValue[movingPiece] / 10;
or
score = pieceValue[victim] - pieceValue[movingPiece] / 20;
This is independent from the order of the piece numbers.

Harald
Tony

Re: QS code

Post by Tony »

Aleks Peshkov wrote:Sometime ago Fabien Letouzey posted a working way to collect PV from zero-window searches.
Doesn't work for alpha beta/pvs.

With MTD you do (at least) 2 searches. Search 1 would give you (at the last 2 iterations) fe the White moves (ie fail high moves). Search 2, with highered alpha-beta, would give you the Black moves (White fail lows,Blacks fail highs).

Various tricks (hashtable, killer moves etc) give a high(er) chance that the 2 lines searched are actually the same (because they could be different with the same score) most of the times.

But in non MTD you don't search for this turning point, so you don't want to do this second search with a zero-window.

Tony

PS Before the last 2 searches ( so during stabilization), you wouldn't get THE main line, but just A line that is responsible for the fail high/low. There is no garantue it's the best line. (for reasons mentioned before)