Stockfish randomicity

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

amchess
Posts: 331
Joined: Tue Dec 05, 2017 2:42 pm

Stockfish randomicity

Post by amchess »

I have noticed that when running a game, even at LTC and starting from a specific position, Stockfish and its derivatives do not always play the same move. There is therefore randomness in the algorithm. Analyzing the code, the only point where it appears in the search is the following:

Code: Select all

// Choose best move. For each move score we add two terms, both dependent on
// weakness. One is deterministic and bigger for weaker levels, and one is
// random. Then we choose the move with the resulting highest score.
for (size_t i = 0; i < multiPV; ++i)
{
    // This is our magic formula
    int push = int((  weakness * int(topScore - rootMoves[i].score)
                    + delta * (rng.rand() % int(weakness))) / 128);

    if (rootMoves[i].score + push >= maxScore)
    {
        maxScore = rootMoves[i].score + push;
        best     = rootMoves[i].pv[0];
    }
}
Why was this randomness introduced into the selection of the "best move"? It is more pronounced in sharp positions, and if you want to test an engine you have to choose such positions, since NNUE has significantly raised the level of play...
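For concreteness, here is a small standalone sketch of what that formula does with made-up numbers (the weakness, delta and score values are purely illustrative, not taken from any actual Stockfish configuration):

Code: Select all

// Toy illustration of the skill-level "push" formula quoted above.
// All values (weakness, delta, the root scores) are invented for the example.
#include <cstdio>
#include <cstdlib>

int main()
{
    const int weakness = 100;                // illustrative: bigger = weaker play
    const int delta    = 150;                // illustrative spread between best and worst score
    const int scores[] = { 150, 140, 60 };   // hypothetical root move scores, best first
    const int topScore = scores[0];

    std::srand(42);                          // any seed; Stockfish uses its own PRNG

    int maxScore = -32000;
    int bestIdx  = 0;
    for (int i = 0; i < 3; ++i)
    {
        // Deterministic term grows with the distance from the top score,
        // random term is uniform in [0, weakness).
        int push = (weakness * (topScore - scores[i])
                    + delta * (std::rand() % weakness)) / 128;

        if (scores[i] + push >= maxScore)
        {
            maxScore = scores[i] + push;
            bestIdx  = i;
        }
    }
    std::printf("picked move index %d with adjusted score %d\n", bestIdx, maxScore);
    return 0;
}

Run it a few times with different seeds and the picked index changes, which is exactly the intended behaviour at reduced skill.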
Joerg Oster
Posts: 939
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany
Full name: Jörg Oster

Re: Stockfish randomicity

Post by Joerg Oster »

amchess wrote: Thu Sep 21, 2023 12:16 pm I have noticed that when running a game, even at LTC and starting from a specific position, Stockfish and its derivatives do not always play the same move. There is therefore randomness in the algorithm. Analyzing the code, the only point where it appears in the search is the following:

Code: Select all

// Choose best move. For each move score we add two terms, both dependent on
// weakness. One is deterministic and bigger for weaker levels, and one is
// random. Then we choose the move with the resulting highest score.
for (size_t i = 0; i < multiPV; ++i)
{
    // This is our magic formula
    int push = int((  weakness * int(topScore - rootMoves[i].score)
                    + delta * (rng.rand() % int(weakness))) / 128);

    if (rootMoves[i].score + push >= maxScore)
    {
        maxScore = rootMoves[i].score + push;
        best     = rootMoves[i].pv[0];
    }
}
Why was this randomness introduced into the selection of the "best move"? It is more pronounced in sharp positions, and if you want to test an engine you have to choose such positions, since NNUE has significantly raised the level of play...
This is only for playing with Skill Levels.
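Roughly speaking, that randomized pick is only ever reached when a reduced skill level is set; a minimal illustrative sketch of that kind of gating (none of these names are real Stockfish identifiers):

Code: Select all

// Illustrative gating only: at full strength the engine simply takes the
// first root move; the randomized pick is reserved for reduced skill.
#include <cstdio>
#include <string>

std::string pick_randomized_move() { return "e2e4"; } // stand-in for the quoted loop
std::string pick_top_root_move()   { return "d2d4"; } // stand-in for rootMoves[0].pv[0]

int main()
{
    const int  skillLevel   = 20;             // 20 = full strength, < 20 = limited
    const bool skillEnabled = skillLevel < 20;

    std::string best = skillEnabled ? pick_randomized_move()
                                    : pick_top_root_move();
    std::printf("best move: %s\n", best.c_str());
    return 0;
}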
Jörg Oster
amchess
Posts: 331
Joined: Tue Dec 05, 2017 2:42 pm

Re: Stockfish randomicity

Post by amchess »

Yes, my error, but the problem persists.
In fact, it is related to Lazy SMP and the OS's thread scheduling.
With NNUE, truly sharp positions must be chosen for testing.
By definition, these are positions for which the first and second choices are evaluated as roughly equivalent, but one leads to a draw and the other to a win.
So the sharper the positions, the noisier the results. How, then, can one be sure of the effectiveness of a patch (i.e. an Elo increase), especially at long time controls, where only a limited number of games can be run?
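As a toy illustration of why a shared-memory parallel search cannot be deterministic in practice (this is not Stockfish code, just threads racing on shared state, the way Lazy SMP workers race on the shared hash table):

Code: Select all

// Toy demo: several "search threads" race to report a result; whichever
// finishes first wins, and that depends on OS scheduling, so the output can
// change from run to run even though every thread does deterministic work.
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

int main()
{
    std::atomic<int> firstFinisher{ -1 };

    auto worker = [&](int id) {
        volatile long sink = 0;               // simulate a bit of search work
        for (long i = 0; i < 1000000 + id * 1000; ++i)
            sink += i;
        int expected = -1;
        firstFinisher.compare_exchange_strong(expected, id); // first one wins
    };

    std::vector<std::thread> threads;
    for (int id = 0; id < 4; ++id)
        threads.emplace_back(worker, id);
    for (auto& t : threads)
        t.join();

    std::printf("thread %d finished first this run\n", firstFinisher.load());
    return 0;
}

In a real Lazy SMP search the analogue is which thread writes a given hash entry first; that changes move ordering between runs and, occasionally, the move that is finally chosen.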
syzygy
Posts: 5569
Joined: Tue Feb 28, 2012 11:56 pm

Re: Stockfish randomicity

Post by syzygy »

amchess wrote: Thu Sep 21, 2023 8:27 pm Yes, my error, but the problem persists.
In fact, it is related to Lazy SMP and the OS's thread scheduling.
A deterministic yet efficient parallel search is not possible to achieve.

A single-threaded search is deterministic, but this is a kind of fake determinism. Change the hash table size, you get different moves. Change the Zobrist values, you get different moves. Change anything, you get different moves.
How, then, can one be sure of the effectiveness of a patch (i.e. an Elo increase), especially at long time controls, where only a limited number of games can be run?
The only way to know whether a patch improves play is to play thousands of different games. If anything, lack of determinism is an advantage, not a disadvantage.
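To put a number on "thousands of games", a quick back-of-the-envelope error-bar calculation (the win/draw/loss counts below are invented):

Code: Select all

// Elo estimate and ~95% error bar from a match result, using the usual
// score-based approximation: elo = -400 * log10(1/score - 1).
// The W/D/L counts are made up for illustration.
#include <cmath>
#include <cstdio>

int main()
{
    const double wins = 520, draws = 960, losses = 480;   // hypothetical 1960-game match
    const double n     = wins + draws + losses;
    const double score = (wins + 0.5 * draws) / n;         // mean score per game

    // Per-game variance of the score, then standard error of the mean.
    const double var = (  wins   * std::pow(1.0 - score, 2)
                        + draws  * std::pow(0.5 - score, 2)
                        + losses * std::pow(0.0 - score, 2)) / n;
    const double se  = std::sqrt(var / n);

    auto toElo = [](double s) { return -400.0 * std::log10(1.0 / s - 1.0); };

    std::printf("score %.4f -> %+.1f Elo, 95%% interval about [%+.1f, %+.1f]\n",
                score, toElo(score),
                toElo(score - 1.96 * se), toElo(score + 1.96 * se));
    return 0;
}

Even with roughly 2000 games the 95% interval spans about 20 Elo, which is why fishtest needs SPRT runs of tens of thousands of games to resolve patches worth only a couple of Elo.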
amchess
Posts: 331
Joined: Tue Dec 05, 2017 2:42 pm

Re: Stockfish randomicity

Post by amchess »

Of course.
The problem is that to play hundreds of games, you have to use very short time controls.
This favors patches with very aggressive pruning, which is not necessarily useful at longer time controls.
As a result, Stockfish performs worse at long time controls and at solving complicated positions.
The statistics of the various patches in the Stockfish framework also show this.
Red patches have been integrated based on very fast time controls, but not yet confirmed at LTC (which for them is 10s+1s!).
One idea could be a tournament where the engine plays not only against itself but also against engines of more or less similar strength, to see how it fares against them compared to the previous version.
I also think that the test positions cannot be random, because it is getting harder and harder to find ones whose outcomes are not essentially "decided". This is where chess knowledge might come into play, for an intelligent testing strategy.
In short, imho, there is food for thought, and a lot of it...
Ciekce
Posts: 125
Joined: Sun Oct 30, 2022 5:26 pm
Full name: Conor Anstey

Re: Stockfish randomicity

Post by Ciekce »

SF does test every patch at LTC.
amchess
Posts: 331
Joined: Tue Dec 05, 2017 2:42 pm

Re: Stockfish randomicity

Post by amchess »

LTC = 10s+0.1s
but very long time controls are a problem of time and resources.
pgg106
Posts: 25
Joined: Wed Mar 09, 2022 3:40 pm
Full name: . .

Re: Stockfish randomicity

Post by pgg106 »

LTC = 60s + 0.6s,
and that's not counting all the tests that are also run at VLTC or with more than one core. With how many SF LTCs you've merged, you should know by now what the time control is.
To anyone reading this post in the future: don't ask for help on talkchess. It's a dead site where you'll only get led astray; the few people talking sense here come from the Stockfish Discord server. Just join it and actual devs will help you.
connor_mcmonigle
Posts: 533
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: Stockfish randomicity

Post by connor_mcmonigle »

amchess wrote: Sat Sep 23, 2023 10:27 am ...
chess knowledge might come into play, for an intelligent testing strategy.
...
You've not proposed any alternative, nor offered any explanation regarding your claim that the chaotic behavior (in the sense of high sensitivity to initial conditions) of Stockfish's search makes testing difficult. Not much food for thought as I see it, and "chess knowledge" seems entirely irrelevant.

Playing chess well is about playing "good moves" (moves which preserve the game-theoretic value of a position) with high probability and "bad moves" (moves which change the game-theoretic value of a position) with low probability. If you slightly tweak SF's initial search conditions, you can, in a sense, sample from SF's latent policy over the action space (and uncover that latent policy by performing millions of searches and collecting statistics). Good patches are those which yield latent policies assigning higher probability to "good moves" relative to the previous version (over the stationary distribution of positions determined by that policy).
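A sketch of the "sample the latent policy" idea: repeat the search under slightly perturbed conditions (different hash size, thread count, random seed, ...) and tally which move comes out on top. search_once() below is a hypothetical placeholder, not a real engine API; it just draws from a fixed distribution so the example stays self-contained:

Code: Select all

// Hypothetical sketch: estimate an engine's empirical policy for one position
// by repeating perturbed searches and counting the returned best moves.
#include <cstdio>
#include <map>
#include <random>
#include <string>

// Placeholder for "run one search with perturbed initial conditions".
std::string search_once(std::mt19937& rng)
{
    static const std::string moves[] = { "e2e4", "d2d4", "c2c4" };
    std::discrete_distribution<int> d({ 70.0, 25.0, 5.0 });   // stand-in for the search's bias
    return moves[d(rng)];
}

int main()
{
    std::mt19937 rng(12345);
    std::map<std::string, int> counts;
    const int samples = 10000;

    for (int i = 0; i < samples; ++i)
        ++counts[search_once(rng)];          // one perturbed search = one sample

    for (const auto& [move, n] : counts)     // the empirical policy for this position
        std::printf("%s : %.3f\n", move.c_str(), double(n) / samples);
    return 0;
}

Comparing two such empirical distributions (old binary vs. new binary) over many positions is, in spirit, what the games-based testing already does, with the positions drawn from actual play.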
syzygy
Posts: 5569
Joined: Tue Feb 28, 2012 11:56 pm

Re: Stockfish randomicity

Post by syzygy »

amchess wrote: Sat Sep 23, 2023 10:27 am Of course.
The problem is that to play hundreds of games, you have to use very short time controls.
This favors patches with very aggressive pruning, which is not necessarily useful at longer time controls.
As a result, Stockfish performs worse at long time controls and at solving complicated positions.
Totally different topic. You just seem to be looking for something to complain about.

The pros and cons of testing at ultrabullet time controls have been discussed to death already. Reality shows that it works.