LMR in micro-Max

hgm · Post by **hgm** » Sat Apr 07, 2007 8:44 pm

Thanks to Jonatan Pettersson, for pointing out that very simple LMR schemes can already result in a large improvement, I decided to try LMR in micro-Max. That this would work was far from obvious, as uMax does not have any move ordering! Except for the hash move, all moves are simply searched in the order they are generated (i.e. per piece, in board order). So how do you define a 'late move' in that case?

What I tried is simply reduce all non-captures, except the hash move. The idea was that the first 4 or 5 moves that people usually exempt from reduction would be good or equal captures anyway. (Killers is an unknown concept to uMax, as these can only be effective if they are ordered properly.)

This amazingly simple scheme seems to work! In a first self-play test the version of uMax 4.4 with LMR added scored 55.8% (67+ 89= 44-) against that without LMR. A less draw-prone version (larger Pawn-push bonus in the end game) even scores 56.5% (80+ 66= 54-). In these tests, play with LMR was 10% faster (time is controlled by searching a fixed number of nodes, and then finishing the iteration). This is all in very fast games, (30 sec per game), the advantage of LMR might go up at larger search depth.

I am still going to experiment also exempting all Pawn moves from reduction too. uMax unfortunately does not know about passers, so reducing Pawn moves might severely hurt its end-game ability. In the middle game the numbr of Pawn moves is probably insignificant enough for this to be not very costly. Currently the reduction is applied to any search depth >= 4 ply (i.e. the normal reply depth of 3 is then reduced to 2). I have not optimized this parameter either, and I can imagine that the optimum for this parameter depends on the time control too.

So there seems to be a significant improvement even from this simple form of LMR. The next question, of course, is if the implementation of it can be shrunk to a size smaller than the ELO gain...

Zlaire · Post by **Zlaire** » Sat Apr 07, 2007 9:31 pm

Mediocre still uses a very simple form of LMR and there is no question the gains are noticeable.

I'm sure a more sophisticated algorithm for choosing what moves to reduce will perform even better, however the 5-6 lines of code to implement the simplest form should be worth it for just about any engine.

hgm · Post by **hgm** » Sat Apr 07, 2007 9:54 pm

Yes, I think I will also try this in Joker. Of course in uMax 6 lines is already much, it must be more something like 30 characters...

But I have good hopes that with this simple scheme, that might be possible.

A next refinement, which might also be affordable in uMax, would be to undo the reduction if in the following node the null move is refuted by a move with that same piece. I will measure the effect of that too. I have not even tried the not-in-check condition either...

hgm · Post by **hgm** » Sun Apr 08, 2007 6:37 pm

I tried exempting Pawn moves from reduction (next to captures), and it did not make much difference scorewise (still 56% over 120 games). Only the 10% speed advantage disappeared. (It is much easier to do this in uMax, though, as Pawn double moves cannot be tried first even if they are hash moves. They are then tried in second place, which might lead to reduction of the hash move if you do not ad a lot of code. If Pawn moves are never reduced, this problem disappears automatically.)

To my surprise, at 10 times longer time-control, (5 min per game) the result stays about the same: 56.2% (47+ 41= 32-), Pawn moves + captures exempted, Pawn pushing version. I would have expected the advantage to grow with search depth, as LMR is a branching-ratio changing pruning. Apparently there is no extra advantage in thinking further ahead around the PV than in the exhaustive full-width search part beyond a certain number of moves. So I am increasing the minimum remaining depth at which LMR is applied from d=4 to d=5, which decreases the difference n depth between deepest and worst branches.

Uri Blass · Post by **Uri Blass** » Sun Apr 08, 2007 7:10 pm

hgm wrote:I tried exempting Pawn moves from reduction (next to captures), and it did not make much difference scorewise (still 56% over 120 games). Only the 10% speed advantage disappeared. (It is much easier to do this in uMax, though, as Pawn double moves cannot be tried first even if they are hash moves. They are then tried in second place, which might lead to reduction of the hash move if you do not ad a lot of code. If Pawn moves are never reduced, this problem disappears automatically.)

To my surprise, at 10 times longer time-control, (5 min per game) the result stays about the same: 56.2% (47+ 41= 32-), Pawn moves + captures exempted, Pawn pushing version. I would have expected the advantage to grow with search depth, as LMR is a branching-ratio changing pruning. Apparently there is no extra advantage in thinking further ahead around the PV than in the exhaustive full-width search part beyond a certain number of moves. So I am increasing the minimum remaining depth at which LMR is applied from d=4 to d=5, which decreases the difference n depth between deepest and worst branches.

As far as I know there is no proof that LMR helps more at longer time control.

I know that zappa1.1 and the baron that do not use LMR are 2 of the program that perform better at long time control.

Another point is that you did not play enough games.

I played 100 games match between 2 version of movei with fixed number of nodes and in one match(100000 nodes per move) the result was 60-40 for version A and in another match(200000 nodes per move with the same positions) the result was a win for version B 54-46.

version B also won a match with 400000 nodes per move 53.5-46.5 but lost by a bigger margin with 1,000,000 nodes per move(I do not remember the result and the match was in a different computer but it was better than 55-45 for version A).

Uri

hgm · Post by **hgm** » Sun Apr 08, 2007 7:44 pm

Yes, the statistics is a bit meager, but for 200 games the standard error should be about 2.8%. But I had expected the advatage to increase to 60+%, and this seems to be ruled out with high confidence even with the currnt number of games. But I certainly don't have enough sttistics to decide which of the methods is better. Problem is that the 5 min/game tests take a lot of time. So I am using kind of marginal runs to gat a vague idea of what might work better, and then will concentate on getting better statistics for those. That gives better chances for having something that works available quickly.

hgm · Post by **hgm** » Mon Apr 09, 2007 9:47 am

The remaining depth beyond which LMR is applied does not seem to be a very critical parameter either. I upped it from 4 to 5 (meaning that for good moves, Pawn moves and captures the reply is seached to 4 ply, and for other moves to 3 ply), and the result was 55% (33+ 44=, 23-).

So the light advantage stays, but there is not enough statistic to determie if it changes, and if so, in which direction.

Dann Corbit · Post by **Dann Corbit** » Mon Apr 09, 2007 9:28 pm

IID can be as simple as 3-4 lines of code and should give you good move ordering.

hgm · Post by **hgm** » Mon Apr 09, 2007 10:17 pm

Well, uMax does do IID, but the problem is more that it has no move list to store and sort the moves between generation and execution. The main reason it can be so small is that it immediately performs the moves as the move generator produces them, and all info about the move (from, to, capture square, piece, victim, ep+castling flags) is still available in simple variable. You would either have to store all that in the move list (and access to an array variable A[k] or through a pointer p->A takes 4 characters as opposed to 1 for a simple variable). Or you would have to regenrate most of that info from the stored (from,to) pair only, but that would duplicate a large part of the move generator. And the move geerator is about 60% of uMax already...

So the only thing uMax learns from the IID is the best move, which then become hash move in the next iteration.

So the only thing I have to go on in making the LMR decision is the properties of the moves themselves. The order means nothing. But I don't necessarily consider that a bad thing.

bob · Post by **bob** » Mon Apr 09, 2007 10:38 pm

I suggest you re-run your test several times. You might be amazed at the built-in randomness caused by simply using time as the stopping criterion.

LMR in micro-Max

LMR in micro-Max

Re: LMR in micro-Max

Re: LMR in micro-Max

Re: LMR in micro-Max

Re: LMR in micro-Max

Re: LMR in micro-Max

Re: LMR in micro-Max

Re: LMR in micro-Max

Re: LMR in micro-Max

Re: LMR in micro-Max