Triangular PV question

maksimKorzh · Post by **maksimKorzh** » Sat Sep 12, 2020 8:19 pm

Hi guys.

I was recording a video on collecting a principal variation based on implementation taken from TSCP and faced a thing that has confused me...
So here are the highlights of implementation:

Code: Select all

// GLOBALS

// PV length
int pv_length[64];

// PV table
int pv_table[64][64];

...

int quiescence(int alpha, int beta)
{
    pv_length[ply] = ply;

   ....

   if (score > alpha)
   {
            // store PV move into a PV table
            pv_table[ply][ply] = move_list->moves[count];
            
            // loop over the next ply
            for (int next_ply = ply + 1; next_ply < pv_length[ply + 1]; next_ply++)
                // copy move from next ply to the current line
                pv_table[ply][next_ply] = pv_table[next_ply][next_ply];
            
            // adjust PV length for the current ply
            pv_length[ply] = pv_length[ply + 1];
   }
}

int alphabeta(int alpha, int beta)
{
    pv_length[ply] = ply;

   ....

   if (score > alpha)
   {
            // store PV move into a PV table
            pv_table[ply][ply] = move_list->moves[count];
            
            // loop over the next ply
            for (int next_ply = ply + 1; next_ply < pv_length[ply + 1]; next_ply++)
                // copy move from next ply to the current line
                pv_table[ply][next_ply] = pv_table[next_ply][next_ply];
            
            // adjust PV length for the current ply
            pv_length[ply] = pv_length[ply + 1];
   }
}

Now assuming that pv_table is a two dimensional array I decided to print it just like a chess board:

Code: Select all

    
    for (int i = 0; i < 64; i++)
    {
        for (int y = 0; y < 64; y++)
        {
            printf("%d", pv_table[i][y] ? 1 : 0);
        }
        
        printf("\n");
    }

And what CONFUSES me is the following - I've tested 2 versions of my engine:
1. ID + PV/MVVLVA/killer/history move ordering
2. ID + PV/MVVLVA/killer/history move ordering + PVS + LMR + Null move pruning

now the move map (moves are marked as "1" empty squares by "0")

assuming search depth 5 first version (ID + PV/MVVLVA/killer/history move ordering + PVS + LMR + Null move pruning) gives:

Code: Select all

info score cp 30 depth 1 nodes 736 pv e2a6 b4c3 
info score cp 30 depth 2 nodes 2433 pv e2a6 b4c3 
info score cp 15 depth 3 nodes 9199 pv e2a6 b4c3 b2c3 e6d5 
info score cp 15 depth 4 nodes 47514 pv e2a6 b4c3 b2c3 e6d5 
info score cp 0 depth 5 nodes 132375 pv e2a6 e6d5 c3d5 b6d5 e4d5 e7e5 
bestmove e2a6
1111110000000000000000000000000000000000000000000000000000000000
0111110000000000000000000000000000000000000000000000000000000000
0011110000000000000000000000000000000000000000000000000000000000
0001110000000000000000000000000000000000000000000000000000000000
0000110000000000000000000000000000000000000000000000000000000000
0000010000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000





...

version ( ID + PV/MVVLVA/killer/history move ordering) gives something like:

Code: Select all

info score cp 30 depth 1 nodes 774 pv e2a6 b4c3 
info score cp 30 depth 2 nodes 2919 pv e2a6 b4c3 
info score cp 15 depth 3 nodes 13444 pv e2a6 b4c3 b2c3 e6d5 
info score cp 15 depth 4 nodes 88274 pv e2a6 b4c3 b2c3 e6d5 
info score cp 0 depth 5 nodes 372537 pv e2a6 e6d5 c3d5 b6d5 e4d5 e7e5 
bestmove e2a6
1111110000000000000000000000000000000000000000000000000000000000
0111111111000000000000000000000000000000000000000000000000000000
0011111111110000000000000000000000000000000000000000000000000000
0001111111111100000000000000000000000000000000000000000000000000
0000111111111100000000000000000000000000000000000000000000000000
0000011111111100000000000000000000000000000000000000000000000000
0000001111111111000000000000000000000000000000000000000000000000
0000000111111111000000000000000000000000000000000000000000000000
0000000011111111000000000000000000000000000000000000000000000000
0000000001111111000000000000000000000000000000000000000000000000
0000000000111111000000000000000000000000000000000000000000000000
0000000000011111000000000000000000000000000000000000000000000000
0000000000001111110000000000000000000000000000000000000000000000
0000000000000111110000000000000000000000000000000000000000000000
0000000000000011110000000000000000000000000000000000000000000000
0000000000000001110000000000000000000000000000000000000000000000
0000000000000000110000000000000000000000000000000000000000000000
0000000000000000010000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000




...

The point is I have MORE moves stored into PV than expected (while they are not printed)
I print moves like TSCP does:

Code: Select all

    // loop over the moves within PV
    for (int count = 0; count < pv_length[0]; count++)
    {
        // print PV move
        print_move(pv_table[0][count]);
        printf(" ");
    }

I don't understand the reasons behind this behavior.
Can someone please kindly explain me why is this happening?

Harald · Post by **Harald** » Sat Sep 12, 2020 9:25 pm

May be

Code: Select all

            // loop over the next ply
            for (int next_ply = ply + 1; next_ply < pv_length[ply + 1]; next_ply++)
                // copy move from next ply to the current line
                pv_table[ply][next_ply] = pv_table[ply + 1][next_ply];

maksimKorzh · Post by **maksimKorzh** » Sat Sep 12, 2020 9:37 pm

Harald wrote: ↑Sat Sep 12, 2020 9:25 pm May be

Code: Select all

            // loop over the next ply
            for (int next_ply = ply + 1; next_ply < pv_length[ply + 1]; next_ply++)
                // copy move from next ply to the current line
                pv_table[ply][next_ply] = pv_table[ply + 1][next_ply];

No. This code only makes copy of next ply move to the current ply line.

Moves are written here:
pv_table[ply][ply] = move

Now I can guess why there are zeros - probably due to the beta cutoffs but the matter of extra moves being added is the matter if writing pv moves at deeper plies, most likely within quiescence but why this doesn't happen in a stronger version is a mystery to me. The problem is I don't know how it should be.

Harald · Post by **Harald** » Sat Sep 12, 2020 10:21 pm

"This code only makes copy of next ply move to the current ply line."

My version copies the best PV from the deeper ply back to the current ply
and your version does not. Compare the versions, please.

maksimKorzh · Post by **maksimKorzh** » Sat Sep 12, 2020 10:28 pm

Harald wrote: ↑Sat Sep 12, 2020 10:21 pm "This code only makes copy of next ply move to the current ply line."

My version copies the best PV from the deeper ply back to the current ply
and your version does not. Compare the versions, please.

By saying "This code only makes copy of next ply move to the current ply line." I meant exactly "My version copies the best PV from the deeper ply back to the current ply" so why do you say "your version does not"? (it's not my version, it's TSCP's version implemented THE SAME way as in TSCP)

maksimKorzh · Post by **maksimKorzh** » Sat Sep 12, 2020 10:31 pm

maksimKorzh wrote: ↑Sat Sep 12, 2020 10:28 pm
Harald wrote: ↑Sat Sep 12, 2020 10:21 pm "This code only makes copy of next ply move to the current ply line."

My version copies the best PV from the deeper ply back to the current ply
and your version does not. Compare the versions, please.
By saying "This code only makes copy of next ply move to the current ply line." I meant exactly "My version copies the best PV from the deeper ply back to the current ply" so why do you say "your version does not"? (it's not my version, it's TSCP's version implemented THE SAME way as in TSCP)

PV collection works, it's not a problem. My initial question was WHY the map of PV table is different assuming 2 versions of my engine (stronger and weaker). I don't even know if it's a problem or not.

Harald · Post by **Harald** » Sat Sep 12, 2020 10:38 pm

May be I am just tired. Good night!

Harald · Post by **Harald** » Sun Sep 13, 2020 8:45 pm

Here is my next try. Just from memory. I hope that helps.

Code: Select all

This is how you could implement principle variation moves (pv moves).

you need a global 2-dimensional array (for a maximum search depth of 100)
int pv_moves[100][100];
and a global array of pv_length for each ply
int pv_length[100];
and / or a constant stop marker
const int no_move = 0;
The pv_moves will store the final pv of the search from the root position
in ply 0 (row 0) and temporary pvs starting from deeper plies during the search in other rows
each starting in the diagonal pv_moves[ply][ply].
We only use the upper right triangle of the pv_moves matrix.

Whenever we enter the search function for a ply in our recursive search set
pv_moves[ply][ply] = no_move;
pv_length[ply] = 0;

Whenever we found a new best valid move (score > alpha) in ply set
pv_moves[ply][ply] = best_move;
and copy back the deeper pv that answers that move
for (int m = 0; m < pv_length[ply + 1]; ++m)
    pv_moves[ply][ply + 1 + m] = pv_moves[ply + 1][ply + 1 + m];
and adjust the pv length
pv_length[ply] = 1 + pv_length[ply + 1];

After the search we find the final pv of the search from the root position
in pv_moves[0][0] to pv_moves[0][pv_length[0] - 1]



And now the same with the pv_move as main sorting method in an iterating deepening search
where you use the pv from the last iteration to try these moves first in the next iteration

Before the whole search set
pv_moves[0][0] = no_move;
pv_length[0] = 0;

You need more global variables.
Before each search iteration set
bool in_pv = true;
int last_pv_length = pv_length[0];

Whenever we enter the search function for a ply in our recursive search
set a local variable
int last_pv_move = no_move;
and try to use it from the last iteration
if (in_pv && ply < last_pv_length)
    last_pv_move = pv_moves[0][ply];

Then reset the pv for the current search
pv_moves[ply][ply] = no_move;
pv_length[ply] = 0;

Now we can start with a pv move if there is one
if (last_pv_move != no_move)
    Make, search and take back (*) that move and then handle
    best moves, alpha, beta and the pv_moves and pv_length arrays if this is the new best move
    just like above.
    (*) When you take back that move set
    in_pv = false;

Now generate the move list and go through that list as usual.
You can skip the last_pv_move.
    if (move == last_pv_move)
        continue;

    Make, search and take back (*) the current move and then handle
    best moves, alpha, beta and the pv_moves and pv_length arrays if this is the new best move
    just like above.
    (*) Whenever you take back a valid move set
    in_pv = false;

After each search iteration we find the final pv of that search iteration from the root position
in pv_moves[0][0] to pv_moves[0][pv_length[0] - 1]

maksimKorzh · Post by **maksimKorzh** » Sun Sep 13, 2020 9:56 pm

Harald wrote: ↑Sun Sep 13, 2020 8:45 pm Here is my next try. Just from memory. I hope that helps.

Code: Select all

This is how you could implement principle variation moves (pv moves).

you need a global 2-dimensional array (for a maximum search depth of 100)
int pv_moves[100][100];
and a global array of pv_length for each ply
int pv_length[100];
and / or a constant stop marker
const int no_move = 0;
The pv_moves will store the final pv of the search from the root position
in ply 0 (row 0) and temporary pvs starting from deeper plies during the search in other rows
each starting in the diagonal pv_moves[ply][ply].
We only use the upper right triangle of the pv_moves matrix.

Whenever we enter the search function for a ply in our recursive search set
pv_moves[ply][ply] = no_move;
pv_length[ply] = 0;

Whenever we found a new best valid move (score > alpha) in ply set
pv_moves[ply][ply] = best_move;
and copy back the deeper pv that answers that move
for (int m = 0; m < pv_length[ply + 1]; ++m)
    pv_moves[ply][ply + 1 + m] = pv_moves[ply + 1][ply + 1 + m];
and adjust the pv length
pv_length[ply] = 1 + pv_length[ply + 1];

After the search we find the final pv of the search from the root position
in pv_moves[0][0] to pv_moves[0][pv_length[0] - 1]



And now the same with the pv_move as main sorting method in an iterating deepening search
where you use the pv from the last iteration to try these moves first in the next iteration

Before the whole search set
pv_moves[0][0] = no_move;
pv_length[0] = 0;

You need more global variables.
Before each search iteration set
bool in_pv = true;
int last_pv_length = pv_length[0];

Whenever we enter the search function for a ply in our recursive search
set a local variable
int last_pv_move = no_move;
and try to use it from the last iteration
if (in_pv && ply < last_pv_length)
    last_pv_move = pv_moves[0][ply];

Then reset the pv for the current search
pv_moves[ply][ply] = no_move;
pv_length[ply] = 0;

Now we can start with a pv move if there is one
if (last_pv_move != no_move)
    Make, search and take back (*) that move and then handle
    best moves, alpha, beta and the pv_moves and pv_length arrays if this is the new best move
    just like above.
    (*) When you take back that move set
    in_pv = false;

Now generate the move list and go through that list as usual.
You can skip the last_pv_move.
    if (move == last_pv_move)
        continue;

    Make, search and take back (*) the current move and then handle
    best moves, alpha, beta and the pv_moves and pv_length arrays if this is the new best move
    just like above.
    (*) Whenever you take back a valid move set
    in_pv = false;

After each search iteration we find the final pv of that search iteration from the root position
in pv_moves[0][0] to pv_moves[0][pv_length[0] - 1]

re: first part
- it's exactly the way how it's implemented in TSCP, I've reused that in my engine. This is clear.

re: second part (sorting PV)
- Very interesting but I'm too dumb to get it (after several times rereading) could you please draft alphabeta function with sorting PV?
I can't exactly understand the reason behind

Code: Select all

if (in_pv && ply < last_pv_length)
    last_pv_move = pv_moves[0][ply];

TSCP does that a bit different:

Code: Select all

if (follow_pv)
  sort_pv();

void sort_pv()
{
        follow_pv = FALSE;
	for(i = first_move[ply]; i < first_move[ply + 1]; ++i)
		if (gen_dat[i].m.u == pv[0][ply].u) {
			follow_pv = TRUE;
			gen_dat[i].score += 10000000;
			return;
		}
}

and I still can't understand how switching in_pv (follow_pv) is switched, I mean I see all the places it has been changed in the search but can't understand WHY.
Learning via READING is so difficult for me. Why only dumbs like me making videos? Why guys like you don't have chess programming Youtube channels explaining this stuff interactively? (rhetoric questopn )

Harald · Post by **Harald** » Sun Sep 13, 2020 10:24 pm

I am not even sure if I did it right.

I can't exactly understand the reason behind
Code: Select all
if (in_pv && ply < last_pv_length)
    last_pv_move = pv_moves[0][ply];
and I still can't understand how switching in_pv (follow_pv) is switched,
I mean I see all the places it has been changed in the search but can't understand WHY.

My idea is, just reuse the ply = 0 row from the diagonal pv table from a previous iteration
while it is still valid at the very beginning of the current iteration. Another idea would be
to copy that pv line to an extra last_pv array. But I think between the begin of the current iteration
and the initialisation of the current pv_move[ply][ply] = no_move there is just a small gap in time and code
where we can pick the pv_move from the last iteration and store it in local last_pv_move
for later use in this search function body. We just must be sure that we are at the leftmost
branch of our search tree (in_pv == true) and that a long enough pv sequence from the last
iteration exists (ply < last_pv_length). After we exausted the last pv sequence we can
no longer access it and have to use other move order techniques. The easiest way to switch off
the pv method is to just switch it off (in_pv = false) each time we come back from
any search tree branch without any further test. Even if we switch it off millions of times.

And by the way, instead of making the pv move before the move generation you can also
use only your normal move list and just give the last_pv_move a better sort score (+100000).

Triangular PV question

Triangular PV question

Re: Triangular PV question

Re: Triangular PV question

Re: Triangular PV question

Re: Triangular PV question

Re: Triangular PV question

Re: Triangular PV question

Re: Triangular PV question

Re: Triangular PV question

Re: Triangular PV question