MCTS-like algorithm with alpha-beta rollouts

Discussion of chess software programming and technical issues.

Moderator: Ras

User avatar
RedBedHed
Posts: 84
Joined: Wed Aug 04, 2021 12:42 am
Full name: Ellie Moore

MCTS-like algorithm with alpha-beta rollouts

Post by RedBedHed »

Hey all!

I've been working on a new-ish algorithm based on the paper https://www.semanticscholar.org/paper/P ... 82d4740db9 and the engine "Scorpio MCTS." It uses alpha-beta rollouts in an iterative deepening fashion, taking advantage of the classical dynamic move-ordering scheme.

Unlike Scorpio, it uses regular backtracking alpha-beta for IID and searches with null windows, like the scout searches and null move searches. It also uses regular quiescence search. It doesn't reuse the tree, as reuse only seems to slow things down. Instead, it passes the tree from each iteration to a garbage collection thread running in the background.

It can play at about the same strength as the classical algo, but I think it has a unique advantage:

Printing a PV line from any node is as simple as following a chain of pointers through the part of the game tree stored in heap memory. And, for a 16-ply search, this tree can contain fewer than 500,000 nodes. I have yet to try any other method for multi-PV printing... is there a better method out there?

The last big thing I need to add to these algorithms is "draw by repetition" detection. However, when I try to handle this by iterating through the history and detecting the first identical position, the playing strength of my engine drops noticeably. Is there something that I am missing? (This code is a little ugly; please don't judge it too harshly, lol)

Code: Select all

    inline int repeating(Board* b, int ply) {
        int i = ply;
        State* state = b->getState();
        uint64_t key = state->key;
        State* s = state->prevState; 
        while(s && i-- > 0) {
            if(s->key == key)
                return i;
            s = s->prevState;
        }
        return 0;
    }
>> Move Generator: Charon
>> Engine: Homura
void life() { playCapitalism(); return; live(); }
User avatar
Ras
Posts: 2696
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: MCTS-like algorithm with alpha-beta rollouts

Post by Ras »

RedBedHed wrote: Mon Jan 23, 2023 6:38 amiterating through the history and detecting the first identical position
You can decrement i by 2 because an identical position means that the side to move must be the same, but you can end prematurely when you encounter a move such as a capture or pawn push. I'm not sure whether your code does that, but you may need to go back further than the root position, i.e. also include the positions of the game history as transmitted by the moves list in the UCI go command.
Rasmus Althoff
https://www.ct800.net
User avatar
RedBedHed
Posts: 84
Joined: Wed Aug 04, 2021 12:42 am
Full name: Ellie Moore

Re: MCTS-like algorithm with alpha-beta rollouts

Post by RedBedHed »

Ras wrote: Mon Jan 23, 2023 7:11 am
RedBedHed wrote: Mon Jan 23, 2023 6:38 amiterating through the history and detecting the first identical position
You can decrement i by 2 because an identical position means that the side to move must be the same, but you can end prematurely when you encounter a move such as a capture or pawn push. I'm not sure whether your code does that, but you may need to go back further than the root position, i.e. also include the positions of the game history as transmitted by the moves list in the UCI go command.
Hmmm that does make sense... I did initially try going back 2 states at a time... Is it normal for these methods to affect playing strength?
>> Move Generator: Charon
>> Engine: Homura
void life() { playCapitalism(); return; live(); }
User avatar
RedBedHed
Posts: 84
Joined: Wed Aug 04, 2021 12:42 am
Full name: Ellie Moore

Re: MCTS-like algorithm with alpha-beta rollouts

Post by RedBedHed »

RedBedHed wrote: Mon Jan 23, 2023 6:38 am Hey all!

I've been working on a new-ish algorithm based on the paper https://www.semanticscholar.org/paper/P ... 82d4740db9 and the engine "Scorpio MCTS." It uses alpha-beta rollouts in an iterative deepening fashion, taking advantage of the classical dynamic move-ordering scheme.

Unlike Scorpio, it uses regular backtracking alpha-beta for IID and searches with null windows, like the scout searches and null move searches. It also uses regular quiescence search. It doesn't reuse the tree, as reuse only seems to slow things down. Instead, it passes the tree from each iteration to a garbage collection thread running in the background.

It can play at about the same strength as the classical algo, but I think it has a unique advantage:

Printing a PV line from any node is as simple as following a chain of pointers through the part of the game tree stored in heap memory. And, for a 16-ply search, this tree can contain fewer than 500,000 nodes. I have yet to try any other method for multi-PV printing... is there a better method out there?

The last big thing I need to add to these algorithms is "draw by repetition" detection. However, when I try to handle this by iterating through the history and detecting the first identical position, the playing strength of my engine drops noticeably. Is there something that I am missing? (This code is a little ugly; please don't judge it too harshly, lol)

Code: Select all

    inline int repeating(Board* b, int ply) {
        int i = ply;
        State* state = b->getState();
        uint64_t key = state->key;
        State* s = state->prevState; 
        while(s && i-- > 0) {
            if(s->key == key)
                return i;
            s = s->prevState;
        }
        return 0;
    }

Edit: looks like I spoke a little too soon. The PVs are incomplete for most children of the root... And that makes sense because of the scout searches. :(
>> Move Generator: Charon
>> Engine: Homura
void life() { playCapitalism(); return; live(); }