LIFO stack based parallel processing?

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

smatovic
Posts: 2641
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: LIFO stack based parallel processing?

Post by smatovic »

...try and fail.

It is nearly impossible to implement a parallel LIFO stack on a GPU considering that threads inside a SIMD Unit are again divided into Warps/Wavefronts.

..fail and error.

fixed an bug and now i am able to distribute work by use of a LIFO Stack across threads within an SIMD Unit of a GPU.

But still don't have an idea how to handle AlphaBeta values, any suggestions?

Here is the LIFO scheme i use in pseudo code

Code: Select all


    // while there are boards in stack do 0-n in parallel
    while(board_stack_top > 0) {

        // every process gets an index for computation
        index = board_stack_top - process_id;

        movecounter[process_id] = 0;

        if (index > 0) {

            board = board_stack[index-1];

            moves[] = genMoves(board);

            moves.sort;
        
            movecounter[process_id] = moves.count;
    
            // decrease top
            atomic_decrease(board_stack_top);
        }


        // sync all processes in a SIMD Unit
        SYNC;

        // get index for stack copy
        moveindex = board_stack_top;

        for&#40;i=0;i<process_id;i++) &#123;
            moveindex+= movecoutner&#91;i&#93;;
        &#125;        

        // copy boards to shared stack
        moves.each do |move, i| &#123;

            domove&#40;board, move&#41;;

            board_stack&#91;moveindex+i&#93; = board;

            undomove&#40;board, move&#41;;

        &#125;

        // increase top
        atomic_add&#40;board_stack_top, moves.count&#41;;

        // sync all processes in a SIMD Unit
        SYNC;

    &#125;
            


--
Srdja
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: LIFO stack based parallel processing?

Post by bob »

Zach Wegner wrote:You could easily make a lockless queue and not have any spinlocks.

The problem is that with alpha-beta you need to delete elements from the middle of the queue. So I don't think it could be done in any efficient way...
Lockless programming is dangerous. Does your architecture of choice guarantee that memory writes from a single CPU are done "in order"? Not all do. From running on lots of different architectures over the years, and doing parallel programming on many of 'em, spinlocks are not nearly as bad as they are made out to be, and they solve a lot of issues that will cause lots of headaches if you decide to change platforms because something new and better comes out.

The old DEC alpha is a good example for doing out of order writes...
smatovic
Posts: 2641
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: LIFO stack based parallel processing?

Post by smatovic »

But still don't have an idea how to handle AlphaBeta values, any suggestions?
My quick n dirty solution is just to let the score-tree grow in memory....
Here is the LIFO scheme i use in pseudo code
i had to realize that without a linked list i have to visit every generated position to do the same beta cutoff...which slows things significant down :-(

--
Srdja