Triangular PV question

sje · Post by **sje** » Fri Aug 09, 2013 6:07 pm

Symbolic doesn't use a triangular PV array. It can't, as its search has no limit on depth and so has no limit on the length of a PV at any node. Further, there is no copying of a PV up one ply.

This is all accomplished by means of two-way linked lists of moves, built using a generic linked list C++ template. Once entered into a list as part of a move list node, a move is never moved although its links may change.

On entry into a tree node, the local PV is assigned an empty list. At the return from a immediate descendent position node, if a new best-so-far result is inside the A/B window, then the local PV is set to the tried move and has the descendant's local PV appended (not copied, just de-linked and then re-linked).

Each search thread has a move node factory/recycler. This is just another move list which stores recycled move nodes for later use. If there are no spare nodes when a new move node is requested, then the factory reluctantly calls the C++ runtime storage allocator (an uncommon event). The move list destructor cleans up when all is done.

The search result PV is just the most recent PV reported from the ply zero (root) position node processor. A report is not made if there was a search abort request as the result cannot be trusted in that case.

In addition to a PV at each position node in the current path from the root, each search thread has a CV (current variation) representing the path. This is just another move list. As each position node is entered, the transition move is appended. On position node exit, the node is recycled. The CV is used mostly for debugging.

The position instance itself has a CV from the beginning of the game to the current position node in the search thread. Each position instance has its own private move node factory/recycler for the path, plus a state factory/recycler. Thus enables calls to Execute()/Retract() to be very simple; the call to Execute() takes only one parameter (the move). Retract() takes no parameters. There's none of this foolishness of having data structures separate from the position for storing move/state information.

Just say NO to fixed length arrays for search.

abulmo · Post by **abulmo** » Sat Aug 10, 2013 9:58 am

just say NO to linked list:

http://www.youtube.com/watch?v=YQs6IC-vgmo

sje · Post by **sje** » Sat Aug 10, 2013 3:14 pm

abulmo wrote:just say NO to linked list:

http://www.youtube.com/watch?v=YQs6IC-vgmo

Just say NO to inapplicable lectures.

A big deal is made from the observation that accessing the Nth element from the start of the structure takes O(1) for a vector but O(N) for a list. The problem here is that his argument is totally irrelevant to chess variations and other ply indexed structures where there is never such access. The elements are all processed in order (or reverse order).

He talks about allocation/allocation overhead, but this is all solved with node recycling. A search tree might need 100 million nodes, but only a few hundred nodes maximum at any one time.

He talks about pointer storage overhead, but when a node contains a lot of state information, then any pointer overhead is negligible. Concerning memory availability, he seems stuck in the 1980s when a typical PC had only 640 KB or less.

He talks about how machines don't like dereferencing but appears to forget that every object on the stack, or in the heap, or is an instance variable needs dereferencing. Does he expect a program to access only static data of global variables?

He talks about the added complexity of list operations, but all of the code for this is contained in the templates and the details can be ignored by the template user.

He doesn't talk about the ease of prepending or appending a node to a two-way list. The prepend/append calls are needed just about all the time in a tree search and is expensive to do with a magically expanding/contracting vector. And a fixed length vector is risky because of possible overflow.

Now there are places for fixed length arrays. Symbolic uses a fixed length, 256 element array for move generation output. This works because there is no known chess position which can have more than 212 moves. More importantly, there is a need to sort this vector at times and that's one area where array storage is faster than a linked list.

Symbolic has a few other fixed length arrays, like the endgame classifier which uses a 292 element vector, the length fixed because the number of covered endgame classes is fixed.

And if there is ever a need to sort a linked list (haven't seen this yet in Symbolic), a temporary array of exactly the right size can be created, the list can be emptied into the array, and after the sort, the list is re-created from the vector contents.

Don · Post by **Don** » Sat Aug 10, 2013 6:34 pm

sje wrote:
abulmo wrote:just say NO to linked list:

http://www.youtube.com/watch?v=YQs6IC-vgmo
Just say NO to inapplicable lectures.

A big deal is made from the observation that accessing the Nth element from the start of the structure takes O(1) for a vector but O(N) for a list. The problem here is that his argument is totally irrelevant to chess variations and other ply indexed structures where there is never such access. The elements are all processed in order (or reverse order).

He talks about allocation/allocation overhead, but this is all solved with node recycling. A search tree might need 100 million nodes, but only a few hundred nodes maximum at any one time.

He talks about pointer storage overhead, but when a node contains a lot of state information, then any pointer overhead is negligible. Concerning memory availability, he seems stuck in the 1980s when a typical PC had only 640 KB or less.

He talks about how machines don't like dereferencing but appears to forget that every object on the stack, or in the heap, or is an instance variable needs dereferencing. Does he expect a program to access only static data of global variables?

He talks about the added complexity of list operations, but all of the code for this is contained in the templates and the details can be ignored by the template user.

He doesn't talk about the ease of prepending or appending a node to a two-way list. The prepend/append calls are needed just about all the time in a tree search and is expensive to do with a magically expanding/contracting vector. And a fixed length vector is risky because of possible overflow.

Now there are places for fixed length arrays. Symbolic uses a fixed length, 256 element array for move generation output. This works because there is no known chess position which can have more than 212 moves. More importantly, there is a need to sort this vector at times and that's one area where array storage is faster than a linked list.

Symbolic has a few other fixed length arrays, like the endgame classifier which uses a 292 element vector, the length fixed because the number of covered endgame classes is fixed.

And if there is ever a need to sort a linked list (haven't seen this yet in Symbolic), a temporary array of exactly the right size can be created, the list can be emptied into the array, and after the sort, the list is re-created from the vector contents.

But you made it sound like your way using linked lists was the only right way - I think that is what people are reacting to. Especially with your "just say no" comment.

sje · Post by **sje** » Sat Aug 10, 2013 8:20 pm

Don wrote:But you made it sound like your way using linked lists was the only right way - I think that is what people are reacting to. Especially with your "just say no" comment.

I wrote "Just say NO to fixed length arrays for search". Fixed length arrays are fine in other contexts, but not for objects like move lists or state preservation lists.

Perhaps I should have written "Just say NO to a MAX_PLY constant".

Don · Post by **Don** » Sat Aug 10, 2013 8:25 pm

sje wrote:
Don wrote:But you made it sound like your way using linked lists was the only right way - I think that is what people are reacting to. Especially with your "just say no" comment.
I wrote "Just say NO to fixed length arrays for search". Fixed length arrays are fine in other contexts, but not for objects like move lists or state preservation lists.

Perhaps I should have written "Just say NO to a MAX_PLY constant".

This makes your program ultimately more scalable than mine. Pure theory of course, but given enough CPU power, your program would beat mine as mine would stop thinking at depth 99.

That is my iteration max, not the maximum ply however. A 99 ply search would play strong of course but I still think it would still need another 99 to play almost perfect chess and it would still not be perfect - just good!

Don

Codesquid · Post by **Codesquid** » Sat Aug 10, 2013 8:51 pm

The actual problem with linked lists is its inherent cache-unfriendlyness. In the worst case, traversing a linked list with N elements will result in N different cachelines are being loaded from main memory, even if all N elements would fit into a single cacheline. Consider using a deque instead.

sje · Post by **sje** » Sat Aug 10, 2013 9:02 pm

Don wrote:
sje wrote:Perhaps I should have written "Just say NO to a MAX_PLY constant".
This makes your program ultimately more scalable than mine. Pure theory of course, but given enough CPU power, your program would beat mine as mine would stop thinking at depth 99.

A ply limit of 99 would be insufficient in some cases already seen: the expansion of a PV by means of tablebases. Some endings in KQPKQ can run to 250 ply and would be unnecessarily truncated by a MAX_PLY of anything less than 250.

--------

Think of all the PV management topics which have appeared here and on other sites. The fixed length array treatment has nearly always caused confusion among newcomers. Their initial attempts at implementing a PV in their code are surely filled with bugs, mostly related to subscripting problems and copying problems. The fact that so many have difficulties with this particular issue is a clue that maybe the conventional wisdom is mistaken and has been mistaken for some time.

Compare the many pages written about fixed limit PV operations with the simple three rules of move list PV operations:

1. At node entry. set the PV local variable to the empty list.

2. In the move scan, whenever a sub-search result score is inside the A/B window, replace the PV with a new PV consisting of the current move followed by the PV of the sub-search.

3. At node exit, return the local PV (and score) to the caller.

In Lisp, the second part takes only two lines of code:

Code: Select all

&#40;when &#40;window-inside window tryscore&#41;
    &#40;setf PV &#40;cons trymove SubPV&#41;)

And with the right list management templates, the second step in a C++ program also takes two lines of code.

mar · Post by **mar** » Sat Aug 10, 2013 9:22 pm

It depends on what you are trying to accomplish.
Linked lists (especially intrusive) are good in cases where you maintain one (or more) list of heavy objects, you don't want to put these in a vector or similar type of container.
Vectors are compact and are very good for lightweight elements. Linked lists are cache unfriendly but vectors OTOH can cause memory fragmentation.
I for example prefer having a vector of pointers instead of a list, because I can have random access that way (of course I'm talking about heavy objects here).

sje · Post by **sje** » Sat Aug 10, 2013 10:12 pm

Codesquid wrote:The actual problem with linked lists is its inherent cache-unfriendlyness. In the worst case, traversing a linked list with N elements will result in N different cachelines are being loaded from main memory, even if all N elements would fit into a single cacheline. Consider using a deque instead.

For chess search purposes, a deque is not as good as a linked list. See: http://www.cplusplus.com/reference/deque/deque/

A deque must be built and maintained in such a way that O(1) access time is needed to get to randomly selected element. That takes a lot of work because there must be an extra, hidden index vector which itself can expand and contract.

In practice, I have found the move list approach to be very fast. Example: In Symbolic, the CV is a move list from the start of the search to the current node. It gets a new/append and a detach/delete operation on every position node. Turning the CV on and off shows no difference in timing tests.

I have also tried having the factor/recycler allocate a few thousand move nodes prior to the search start. This would mean that the nodes were (likely) all close together in memory. This also had no affect on timing.

Triangular PV question

Just say NO

Re: Just say NO

Re: Just say NO

Re: Just say NO

Re: Just say NO

Re: Just say NO

Re: Just say NO

Re: Just say NO

Re: Just say NO

Re: Just say NO