Elo expected from LMR


algerbrex
Posts: 596
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Elo expected from LMR

Post by algerbrex »

Hi,

I've been messing around with a very simple form of late-move reductions, and so far I've gotten 32 Elo from them, which is pretty nice. From my testing, LMR seems to put the dev version of Blunder at ~2250, judging by a gauntlet against some other engines around that level on the CCRL:

Code:

Rank Name                          Elo     +/-   Games   Score    Draw 
   0 Blunder 7.3.0                  48      11    3200   56.8%   16.3% 
   1 Stash v17.0                    -3      23     800   49.6%   11.1% 
   2 Napoleon                      -39      21     800   44.4%   27.0% 
   3 SpaceDog                      -70      23     800   40.0%   13.8% 
   4 CeeChess_v1.3                 -80      23     800   38.8%   13.5% 

The reduction code I came up with is as follows:

Code:

score := int16(0)
historyScore := search.history[search.Pos.SideToMove][move.FromSq()][move.ToSq()]

// Only reduce moves that are late in the move ordering, when there's enough
// depth left for a reduction to pay off, when the move has a poor history
// score, and when we're not in check.
if legalMoves > 4 && depth > 3 && historyScore < 50 && !inCheck {
    // Reduce by two plies, and by one more for very late moves.
    newDepth := depth - 2
    if legalMoves > 6 {
        newDepth--
    }

    // Search at the reduced depth; if the move unexpectedly raises alpha,
    // re-search it at full depth.
    score = -search.negamax(newDepth, ply+1, -beta, -alpha, &childPVLine, true)
    if score > alpha {
        score = -search.negamax(depth-1, ply+1, -beta, -alpha, &childPVLine, true)
    }
} else {
    score = -search.negamax(depth-1, ply+1, -beta, -alpha, &childPVLine, true)
}
I had some more complicated logic than this, but testing showed it was no stronger than the version above, and since this code is simpler, it's what I settled on.

From the reading I've done, LMR can be a fickle beast and is one of those features where it can work very well for some programs, but be a loss for others. So I'm curious what others have gained from LMR so I can get a ballpark idea of how much more Elo I can squeeze out of it, or if I'm making a silly mistake somewhere.

Also, note that the 32 Elo doesn't include the history heuristic itself; LMR + history together gained ~60-70 Elo for Blunder. I also realize that LMR + PVS is common, but see here for why I'm not using PVS right now: forum3/viewtopic.php?f=7&t=78183&sid=22 ... f177c93bf3
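
For context, the historyScore above comes from the standard history heuristic: whenever a quiet move causes a beta cutoff, its table entry gets a depth-based bonus, so moves that keep refuting positions sort earlier and get reduced less. A minimal sketch of that update, reusing the field names from the snippet above (move.IsCapture() is an assumed helper, and the integer conversions are elided):

Code:

// Sketch of the standard history-heuristic update. On a beta cutoff
// caused by a quiet move, reward the move so it sorts earlier and is
// reduced less in later searches.
if score >= beta && !move.IsCapture() { // IsCapture is an assumed helper
    search.history[search.Pos.SideToMove][move.FromSq()][move.ToSq()] += depth * depth
}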
Joost Buijs
Posts: 1563
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: Elo expected from LMR

Post by Joost Buijs »

My experience is that the quality of the evaluation has a very big influence on the gain you can get from LMR.
With a better evaluation function you can use more aggressive LMR without making big mistakes.
The quality of the evaluation, history, and LMR are all interconnected, so it's very difficult to predict what the gain should be.
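
To make "more aggressive" concrete: a common scheme is to precompute the reduction from a log-based table indexed by depth and move number, something like the sketch below (not my exact code, just the usual pattern; it assumes import "math"). Lowering the divisor reduces more moves more deeply, which only pays off when the evaluation and move ordering are good enough to absorb the extra risk.

Code:

// Sketch of a common log-based LMR table (assumes: import "math").
// lmr[depth][moveNumber] grows with both depth and move number; a
// smaller divisor makes the reductions more aggressive.
var lmr [64][64]int

func initLMR(divisor float64) {
    for d := 1; d < 64; d++ {
        for m := 1; m < 64; m++ {
            lmr[d][m] = int(math.Log(float64(d)) * math.Log(float64(m)) / divisor)
        }
    }
}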
algerbrex
Posts: 596
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Elo expected from LMR

Post by algerbrex »

Joost Buijs wrote: Mon Oct 11, 2021 7:21 pm My experience is that the quality of the evaluation has a very big influence on the gain you can get from LMR.
With a better evaluation function you can use more aggressive LMR without making big mistakes.
The quality of the evaluation, history, and LMR are all interconnected, so it's very difficult to predict what the gain should be.
Hmm, thanks, that makes sense. I've recently finished writing a Texel tuner for Blunder and used it to add mobility to my evaluation. So I suppose I'll keep improving the evaluation for now and see how that pans out with LMR.
algerbrex
Posts: 596
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Elo expected from LMR

Post by algerbrex »

algerbrex wrote: Mon Oct 11, 2021 7:51 pm
Joost Buijs wrote: Mon Oct 11, 2021 7:21 pm My experience is that the quality of the evaluation has a very big influence on the gain you can get from LMR.
With a better evaluation function you can use more aggressive LMR without making big mistakes.
The quality of the evaluation, history, and LMR are all interconnected, so it's very difficult to predict what the gain should be.
Hmm, thanks, that makes sense. I've recently finished writing a Texel tuner for Blunder and used it to add mobility to my evaluation. So I suppose I'll keep improving the evaluation for now and see how that pans out with LMR.
From observing more games played, it looks like LMR is stronger at longer time controls, which makes sense. The 32 Elo figure came from SPRT testing at a time control of 3s+0.08s, and the gauntlet I ran to test Blunder's current strength used the same time control. I'm now running another gauntlet at 15s+0.1s with the same pool of engines, and the preliminary results look pretty promising.
Karlo Bala
Posts: 373
Joined: Wed Mar 22, 2006 10:17 am
Location: Novi Sad, Serbia
Full name: Karlo Balla

Re: Elo expected from LMR

Post by Karlo Bala »

algerbrex wrote: Mon Oct 11, 2021 7:51 pm
Joost Buijs wrote: Mon Oct 11, 2021 7:21 pm My experience is that the quality of the evaluation has a very big influence on the gain you can get from LMR.
With a better evaluation function you can use more aggressive LMR without making big mistakes.
The quality of the evaluation, history, and LMR are all interconnected, so it's very difficult to predict what the gain should be.
Hmm, thanks, that makes sense. I've recently finished writing a Texel tuner for Blunder and used it to add mobility to my evaluation. So I suppose I'll keep improving the evaluation for now and see how that pans out with LMR.
Just to add that Stockfish NNUE reaches depth 20+ within a few million nodes. Any beancounter à la Fritz 6 with such heavy pruning/reduction would be blind as a bat.
Best Regards,
Karlo Balla Jr.
lithander
Posts: 881
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Elo expected from LMR

Post by lithander »

Code:

expandedNodes++;
bool interesting = expandedNodes == 1 || isChecked || child.IsChecked(child.SideToMove);

//moves after the PV node are unlikely to raise alpha. 
//avoid a full evaluation by searching with a null-sized window around alpha first
//...we expect it to fail low but if it does not we have to research it!
if (depth >= 2 && expandedNodes > 1)
{
	//non-tactical late moves are searched at a reduced depth to make this test even faster!
	int R = (interesting || expandedNodes < 4) ? 0 : 2;
	(int score, _) = EvalPositionTT(child, ply + 1, depth - R - 1, window.GetLowerBound(color));
	if (window.FailLow(score, color))
		continue;
}

//this move is expected to raise alpha so we search it at full depth!
var eval = EvalPositionTT(child, ply + 1, depth - 1, window);
I use LMR in a very simple way in MinimalChess. Unless the move is the first (PV) move, gives check, or evades check, it's not 'interesting'. Beginning with the 4th move, any move that isn't interesting in the above sense may be reduced by 2 in the PVS-style part of the search, where a minimal window around alpha decides whether to search a node fully or not. If this reduced search turns out to beat alpha, then we re-search the move at full depth.

It took a while to get this to work. The trick, if I remember correctly, was to also introduce history sorting of quiet moves; before I sorted quiet moves I couldn't really get LMR to work for my engine. Also, all my reductions are always multiples of 2 (LMR included), and that kind of symmetry, where the eval is always called on positions with the same color to move, seemed to help ensure that the evaluations are actually comparable.
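
Expressed in code, the parity trick just means forcing the reduction to an even number of plies; a sketch of the idea in Go (with assumed names, not my actual C#; computeReduction stands in for however you pick the base reduction):

Code:

// Clearing the low bit rounds the reduction down to an even number of
// plies, so the leaves of the reduced search have the same side to move
// as the leaves of the full-depth search, keeping the static evals
// comparable.
reduction := computeReduction(depth, moveNumber) // hypothetical helper
reduction &^= 1                                  // &^ is Go's AND NOT: force even
score := -search.negamax(depth-1-reduction, ply+1, -beta, -alpha, &childPVLine, true)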
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
algerbrex
Posts: 596
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Elo expected from LMR

Post by algerbrex »

lithander wrote: Wed Oct 13, 2021 1:55 pm It took a while to get this to work. The trick, if I remember correctly, was to also introduce history sorting of quiet moves; before I sorted quiet moves I couldn't really get LMR to work for my engine. Also, all my reductions are always multiples of 2 (LMR included), and that kind of symmetry, where the eval is always called on positions with the same color to move, seemed to help ensure that the evaluations are actually comparable.
Right, that's what I noticed too. LMR didn't become an Elo gain for me until I introduced the history heuristic. And your idea of only reducing by multiples of 2 makes sense to me, I believe; I just need to think about it a bit.

If you tested this, do you remember how much Elo LMR was able to give you?
lithander
Posts: 881
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Elo expected from LMR

Post by lithander »

algerbrex wrote: Wed Oct 13, 2021 2:36 pm If you tested this, do you remember how much Elo LMR was able to give you?
I just tested it with 2000 games at 5s + 0.5s increment in self-play:

Code:

   # PLAYER                       :  RATING  POINTS  PLAYED   (%)
   1 MinimalChess 0.6.1           :  2345.7  1244.5    1983    63
   2 MinimalChess 0.6.1 No LMR    :  2254.3   738.5    1983    37
The only change between the two versions is:

Code:

int R = 0;//int R = (interesting || expandedNodes < 4) ? 0 : 2;
So LMR is worth about 90 Elo in self-play. (The gained depth is really significant!)
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
algerbrex
Posts: 596
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Elo expected from LMR

Post by algerbrex »

lithander wrote: Wed Oct 13, 2021 11:06 pm
algerbrex wrote: Wed Oct 13, 2021 2:36 pm If you tested this, do you remember how much Elo LMR was able to give you?
I just tested it with 2000 games at 5s + 0.5s increment in self-play:

Code:

   # PLAYER                       :  RATING  POINTS  PLAYED   (%)
   1 MinimalChess 0.6.1           :  2345.7  1244.5    1983    63
   2 MinimalChess 0.6.1 No LMR    :  2254.3   738.5    1983    37
The only change between the two versions is:

Code:

int R = 0;//int R = (interesting || expandedNodes < 4) ? 0 : 2;
So LMR is worth about 90 Elo in self-play. (The gained depth is really significant!)
Nice! :D

So far I've recorded ~32 Elo, but I just gained another good chunk from adding PVS, which brings the total to around 47 Elo. That isn't bad, but I expected numbers more like yours, so I'm going to keep playing around with it. Thanks!
amanjpro
Posts: 883
Joined: Sat Mar 13, 2021 1:47 am
Full name: Amanj Sherwany

Re: Elo expected from LMR

Post by amanjpro »

algerbrex wrote: Mon Oct 11, 2021 9:17 pm
algerbrex wrote: Mon Oct 11, 2021 7:51 pm
Joost Buijs wrote: Mon Oct 11, 2021 7:21 pm My experience is that the quality of the evaluation has a very big influence on the gain you can get from LMR.
With a better evaluation function you can use more aggressive LMR without making big mistakes.
The quality of the evaluation, history, and LMR are all interconnected, so it's very difficult to predict what the gain should be.
Hmm, thanks, that makes sense. I've recently finished writing a Texel tuner for Blunder and used it to add mobility to my evaluation. So I suppose I'll keep improving the evaluation for now and see how that pans out with LMR.
From observing more games played, it looks like LMR is stronger at longer time controls, which makes sense. The 32 Elo figure came from SPRT testing at a time control of 3s+0.08s, and the gauntlet I ran to test Blunder's current strength used the same time control. I'm now running another gauntlet at 15s+0.1s with the same pool of engines, and the preliminary results look pretty promising.
3s+0.08s is really too short to be meaningful; even SF's STC is longer than that... I would suggest using at least 8s+0.8s or something along those lines. After all, you want to exercise the search prunings well before merging in a change.