Lazy eval

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Lazy eval - test results

Post by bob »

lkaufman wrote:
bob wrote:
lkaufman wrote:I posted the speedup numbers above (first post with - test results in subject). Roughly 33% faster in middlegame, less in opening, much less in a king and pawn only ending...

Our cutoff bound is dynamic, but is typically between a minor piece and a rook, 300 - 500, for the first cutoff which is right at the top of evaluate. If that doesn't work, we hit the pawn evaluation (and passed pawn evaluation) and then try another lazy eval cutoff. The second cutoff uses a dynamic value, but it is roughly 1.5 pawns...
Thanks for the data. However, it would be much more informative to run the searches to a fixed depth rather than for 30 seconds. The point is that lazy eval seems to expand the tree, so although you may get 33% more NPS (which is very good), much or even all of this could be wasted if you need more nodes for a given depth, as everyone seems to be reporting. Today we did get a decent NPS speedup (nothing like yours though), but it mostly went away when we looked at the nodes needed to complete N plies.
Here goes. (I had already looked at the output to verify that the tree shape was not changing significantly.) This pretty well confirms it:

starting position:
log.001: time=42.77 mat=0 n=65951949 fh=91% nps=1.5M
log.002: time=37.17 mat=0 n=65945250 fh=91% nps=1.8M

MG:
log.001: time=52.16 mat=0 n=121221442 fh=95% nps=2.3M
log.002: time=40.20 mat=0 n=121147943 fh=95% nps=3.0M

EG:
log.001: time=11.36 mat=1 n=42658591 fh=91% nps=3.8M
log.002: time=10.33 mat=1 n=42613588 fh=90% nps=4.1M
Thanks. So at least based on these positions, it seems that your implementation of lazy eval is a virtually free, significant speedup. What margins was this based on, and which speedups are you currently using on the last four plies? I'm sure you do futility; do you also just look at the best N moves on these plies, and do you do "static null move", where you just take the cutoff if you are above beta by more than N centipawns on those plies? All of the top programs do the above. If you also do them, it becomes difficult to understand why you get so much out of lazy eval while Stockfish (and Komodo) get nothing.

The margins I had given previously. But remember that we have more than one lazy exit, with bigger margins up front, lower margins later...

We do classic futility pruning in last 4 plies, plus the usual reductions for those that are not pruned...

Our margins are pretty safe, if you notice how little the node counts change...
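For illustration, a staged lazy eval of the kind bob describes (two exits, a wide margin before the pawn terms and a tighter one before the heaviest terms) could look roughly like the sketch below. The margins and helper functions are invented for illustration; this is not Crafty's actual code.

Code: Select all

/* Minimal sketch of a staged lazy evaluation: two early exits, a wide
 * margin before the pawn terms and a tighter one before the heaviest
 * terms.  Helpers and margin values are invented, not Crafty's code. */
#define LAZY_MARGIN_1 400   /* between a minor piece and a rook */
#define LAZY_MARGIN_2 150   /* roughly 1.5 pawns */

int Evaluate(const Position *pos, int alpha, int beta) {
  int score = MaterialBalance(pos) + PieceSquareSum(pos);

  /* first lazy exit: even the remaining positional terms could not
   * bring the score back inside the (alpha, beta) window */
  if (score - LAZY_MARGIN_1 >= beta || score + LAZY_MARGIN_1 <= alpha)
    return score;

  score += PawnStructure(pos) + PassedPawns(pos);

  /* second lazy exit, tighter margin, before king safety and mobility */
  if (score - LAZY_MARGIN_2 >= beta || score + LAZY_MARGIN_2 <= alpha)
    return score;

  score += KingSafety(pos) + Mobility(pos);
  return score;
}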
lkaufman
Posts: 5981
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Lazy eval - test results

Post by lkaufman »

bob wrote: We do classic futility pruning in last 4 plies, plus the usual reductions for those that are not pruned...

Our margins are pretty safe, if you notice how little the node counts change...
You don't specifically state so, but am I correct in assuming from your answer that:

1. Crafty looks at all legal moves on the last four plies if above the futility margin, even if some are reduced? In other words, you do NOT use move-count pruning (prune after N moves), correct?
2. Crafty does not do static null move, meaning no matter how far above beta you are on the last few plies, you don't allow a stand-pat cutoff if null move does not apply? Is that correct?

If you don't do these things, it suggests that one or both might explain why Stockfish and Komodo don't benefit from lazy eval while you did. However, it would not explain why Critter and Ippos do benefit, as they also do these things, though with some differences.
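For readers following along, the two techniques asked about here look roughly like this in a generic alpha-beta search. This is a sketch only; the margins, move limits and helpers such as IsQuiet and GenerateMoves are invented, not taken from Crafty, Stockfish, Komodo or any other engine.

Code: Select all

/* Generic sketch of "static null move" (take the cutoff when the static
 * eval is above beta by a margin) and move-count pruning (skip late
 * quiet moves) at shallow depths.  All names and constants are
 * placeholders. */
int Search(Position *pos, int alpha, int beta, int depth) {
  if (depth <= 0)
    return Quiesce(pos, alpha, beta);

  int eval = Evaluate(pos, alpha, beta);

  /* static null move: at shallow depth, stand pat above beta even
   * without trying a null move */
  if (depth <= 3 && !InCheck(pos) && eval - 120 * depth >= beta)
    return eval;

  Move moves[256];
  int n = GenerateMoves(pos, moves);
  int best = -32000;   /* "minus infinity" */
  int tried = 0;

  for (int i = 0; i < n; i++) {
    /* move-count pruning: at shallow depth, skip late quiet moves */
    if (depth <= 3 && tried >= 4 + depth * depth && IsQuiet(pos, &moves[i]))
      continue;

    MakeMove(pos, &moves[i]);
    int score = -Search(pos, -beta, -alpha, depth - 1);
    UnmakeMove(pos, &moves[i]);
    tried++;

    if (score > best) best = score;
    if (score > alpha) alpha = score;
    if (alpha >= beta) break;   /* beta cutoff */
  }
  return best;
}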
lkaufman
Posts: 5981
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Lazy eval

Post by lkaufman »

Rebel wrote:
lkaufman wrote: Actually, that description seems to match what we tried very closely! Ed reports a 3.2 speedup (i.e. 220%) while we get barely over 2%! So it works 100 times better for him than for us, even though we both have complex king safety with large possible scores. Very, very strange!!
If you have ProDeo change the LE parameter:

[Chess Knowledge = 100] to 500

LE is more or less deactivated then.

Then check the differences between 100 and 500.

I quickly ran 2 positions:

1) 100: 22M nodes vs 500: 47M nodes
2) 100: 21M nodes vs 500: 55M nodes
What version of ProDeo is this? I put version 1.73 on Fritz interface, but I don't see an option that is called "Chess Knowledge" or any that have numerical values at all.

In any case, I'll take your word for it that you get a huge node reduction. Your program must be quite different than all the others talked about here, because everyone else gets either no significant change in nodes or a clear increase, but instead everyone gets improved nodes-per-second. Perhaps you just count nodes differently? For example only counting nodes fully evaluated? Are these node figures accurate reflections of time saved?
Rebel
Posts: 7032
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Lazy eval

Post by Rebel »

lkaufman wrote:
What version of ProDeo is this? I put version 1.73 on Fritz interface, but I don't see an option that is called "Chess Knowledge" or any that have numerical values at all.

In any case, I'll take your word for it that you get a huge node reduction. Your program must be quite different than all the others talked about here, because everyone else gets either no significant change in nodes or a clear increase, but instead everyone gets improved nodes-per-second. Perhaps you just count nodes differently? For example only counting nodes fully evaluated? Are these node figures accurate reflections of time saved?
1. You need to look at the loaded personality -> ProDeo.eng in the PERSONAL folder to find and modify the [Chess Knowledge = 100] parameter.

2. Remember the trick is to have the eval score of RD=1 (horizon depth - 1) available. The LE/QS margin becomes eval_score_depth-1 + 0.50.

3. Yes, the NPS numbers are real.
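One possible reading of point 2 is sketched below. The names and the exact use of the margin are guesses for illustration, not ProDeo's actual code: keep the full evaluation from one ply before the horizon and use it, plus a 0.50 pawn margin, to decide whether the heavy evaluation can be skipped.

Code: Select all

/* Sketch of one reading of the margin described above: the full eval
 * from one ply before the horizon is kept, and the lazy / stand-pat
 * decision uses that score plus a 0.50 pawn margin instead of calling
 * the heavy evaluation again.  Names are invented. */
#define LE_MARGIN 50    /* 0.50 pawns, in centipawns */

int LazyEval(const Position *pos, int parent_eval, int alpha, int beta) {
  /* if the previous ply's eval, even shifted by half a pawn, is still
   * outside the window, skip the expensive terms entirely */
  if (parent_eval - LE_MARGIN >= beta || parent_eval + LE_MARGIN <= alpha)
    return parent_eval;

  return FullEvaluate(pos);   /* otherwise pay for the real evaluation */
}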
lkaufman
Posts: 5981
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Lazy eval

Post by lkaufman »

Rebel wrote:
1. You need to look at the loaded personality -> ProDeo.eng in the PERSONAL folder to find and modify the [Chess Knowledge = 100] parameter.

2. Remember the trick is to have the eval score of RD=1 (horizon depth - 1) available. The LE/QS margin becomes eval_score_depth-1 + 0.50.

3. Yes, the NPS numbers are real.
Oh, in your examples above where you wrote "nodes", did you mean "nodes per second"? Then it makes sense. I thought you were doing a fixed depth search and were claiming that the number of nodes visited was cut in half.
One other question: Do you use the so-called "static null move", where on the last few plies you just take a cutoff if you are above beta by some margin? I suspect that this is somewhat incompatible with lazy eval.
petero2
Posts: 697
Joined: Mon Apr 19, 2010 7:07 pm
Location: Sweden
Full name: Peter Osterlund

Re: Lazy eval

Post by petero2 »

zamar wrote:
lkaufman wrote: OK, this makes sense, since your comments about king safety in SF apply also to Komodo. But we should at least get a sizable speedup with a reasonable margin, even if it was not a favorable trade-off. But we don't. Did SF show the same behavior?
It has been such a long time since we tested this that I can't remember.

But "speed-up" is a bit of an ambiguous concept. I think you should definitely see a higher nodes/second ratio. However, my experience is that a worse eval often leads to bigger search trees, so "go depth x" might require more nodes.

P.S. Of course, if you flood-fill your eval with "if" statements to check lazy-eval everywhere, you might increase branching, which could cause a slowdown, but I don't think that this could realistically be an issue in any sensible implementation.
Are there engines that combine lazy eval with storing of eval scores in the transposition table? If there are, what do they store in the transposition table when lazy eval is used? I suppose they either have to not store lazy eval results or also store evaluation bounds.
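For the second option, one conceivable scheme (field and function names invented, not taken from any particular engine) is to tag the stored eval with a bound type and only reuse it when that bound is still decisive:

Code: Select all

#include <stdint.h>

/* Sketch of storing lazy-eval results as bounds in the hash table.
 * Field and function names are invented for illustration. */
enum EvalBound { EVAL_EXACT, EVAL_LOWER, EVAL_UPPER };

typedef struct {
  int16_t static_eval;   /* full or lazy evaluation score */
  uint8_t eval_bound;    /* EVAL_EXACT if the full eval was computed,
                            EVAL_LOWER / EVAL_UPPER if a lazy exit fired */
  /* ... plus the usual key, depth, score and move fields ... */
} TTEntry;

/* Reuse a stored eval only when it is exact, or when the bound is
 * already decisive for the current window. */
static int UsableEval(const TTEntry *e, int alpha, int beta, int *eval) {
  if (e->eval_bound == EVAL_EXACT ||
      (e->eval_bound == EVAL_LOWER && e->static_eval >= beta) ||
      (e->eval_bound == EVAL_UPPER && e->static_eval <= alpha)) {
    *eval = e->static_eval;
    return 1;
  }
  return 0;
}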
Rebel
Posts: 7032
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Lazy eval

Post by Rebel »

lkaufman wrote: Oh, in your examples above where you wrote "nodes", did you mean "nodes per second"? Then it makes sense. I thought you were doing a fixed depth search and were claiming that the number of nodes visited was cut in half.
Solly, NPS was a mistake :wink: Nodes of course. But here is an example, depth=15

[d]3r2k1/p2r1ppp/1pQ3n1/3P2q1/P3B3/6P1/5P1P/2R1R1K1 w - -

Code: Select all

LE=0.50   0:25  61M nodes
LE=1.00   0:30  70M
LE=1.50   0:33  75M
LE=2.00   0:29  67M   -> surprise here
LE=OFF    1:49 196M
lkaufman wrote: One other question: Do you use the so-called "static null move", where on the last few plies you just take a cutoff if you are above beta by some margin? I suspect that this is somewhat incompatible with lazy eval.
I am using classic Futility pruning. And of course I am using the same LE margin of 0.50 here.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Lazy eval - test results

Post by bob »

lkaufman wrote:
You don't specifically state so, but am I correct in assuming from your answer that:

1. Crafty looks at all legal moves on the last four plies if above the futility margin, even if some are reduced? In other words, you do NOT use move-count pruning (prune after N moves), correct?
2. Crafty does not do static null move, meaning no matter how far above beta you are on the last few plies, you don't allow a stand-pat cutoff if null move does not apply? Is that correct?

If you don't do these things, it suggests that one or both might explain why Stockfish and Komodo don't benefit from lazy eval while you did. However, it would not explain why Critter and Ippos do benefit, as they also do these things, though with some differences.
No, I do not throw out moves based just on move count...

I am not sure about your last question. It would seem that if you are way above beta, you are going to get a cutoff quickly, no matter what happens, even if it is two plies down in the tree where a real stand-pat can happen... But I do not do the classic static null-move idea. Tried it a long time back (several years) but have not tested it in the last couple of years at least...
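For reference, the "real stand-pat" mentioned here is the usual quiescence-search cutoff, roughly as in the generic sketch below (not Crafty's code; helper names are assumed):

Code: Select all

/* Generic quiescence stand-pat: if the static eval already refutes the
 * position, the side to move stands pat and fails high without
 * searching any captures. */
int Quiesce(Position *pos, int alpha, int beta) {
  int stand_pat = Evaluate(pos, alpha, beta);
  if (stand_pat >= beta)
    return stand_pat;       /* fail high without trying any captures */
  if (stand_pat > alpha)
    alpha = stand_pat;
  /* ... then search captures (and perhaps checks) as usual ... */
  return alpha;
}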
mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: Lazy eval - test results

Post by mjlef »

bob wrote:
bob wrote:
lkaufman wrote:
bob wrote: Question is, why doesn't it work? The idea is speed. If you do it well, you should be able to see a speed gain that more than offsets the minor accuracy loss. I've been using it in every program I've written since the 70's. The "heavier" the eval, the greater the potential gain... One can even do a "staged" lazy eval with multiple exits, rather than just one... Each exit placed before the next "heavier" piece of the evaluation...
I take it that Crafty does get a decent speedup for lazy eval. Do you get about the same as the 10% figure I quote for Critter, or perhaps more? We can't get more than 2% with any reasonable margin, clearly not enough to use it. The question is whether this is due to redundancy with other pruning techniques, or to some implementation detail? Of course you won't know as you don't work on Komodo, but perhaps you can suggest a way to tell. Sure we do a lot of heavy pruning near the leaf nodes, but so does Critter and it gets a very nice speedup from lazy eval.
I'll try to test later tonight and report the speedup... I'll have to go back to look at exactly what we are doing... we have changed this stuff several times, from one lazy exit, to multiple, and such...
I have two lazy eval exits in my evaluate.c code. I disabled both and ran three test positions for 30 seconds each: the starting position, a middlegame position, and Fine #70, which is probably a worst-case comparison since there is nothing but kings and pawns...

The first output in each pair is the no-lazy-eval code, the second is with the normal lazy eval:

log.001: time=30.08 mat=0 n=46031404 fh=91% nps=1.5M
log.002: time=30.06 mat=0 n=53408994 fh=91% nps=1.8M
improvement: 7.4M additional nodes searched, 16% faster.

log.001: time=30.05 mat=0 n=68478030 fh=95% nps=2.3M
log.002: time=30.11 mat=0 n=91888687 fh=95% nps=3.1M
improvement: 23.4M additional nodes searched, 34% faster

log.001: time=30.10 mat=1 n=129214534 fh=93% nps=4.3M
log.002: time=30.01 mat=1 n=121312909 fh=87% nps=4.0M
difference: 7.9M fewer nodes searched, about 6% slower.

All searches were for 30 seconds, so more nodes searched is better.

In the middlegame, it is a pretty big savings.

I am confused. If "time=30.10" is the time taken to search what I assume is a fixed depth, then there is almost no saving in the time to complete a search. It seems your LE implementation on these 3 positions does search faster (well, in 2 out of 3), but it also takes more nodes for the same depth. Correct me if I am missing something.

I would love to see the results of an LE vs no-LE match for a few thousand games and see who wins.

I do get an Elo improvement in my program with LE on, but it is very small.
mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: Lazy eval - test results

Post by mjlef »

Don wrote:
mcostalba wrote:
lkaufman wrote: I note that SF rejected lazy eval. So at least one other top program failed to demonstrate a benefit from it. Why didn't it help Stockfish? I'm sure they would not have rejected a 10% nearly-free speedup.
For a very selective search you may want to have accurate eval info to push pruning to the limits. In the case of Crafty, where I guess the pruning is far less aggressive given the 300 Elo gap or so, my guess is that the speedup matters more than an accurate eval. Another point is the spikes in the king safety evaluation, present in SF and, as you mentioned, also in Komodo; these are filtered out by lazy eval but play an important role in pruning decisions.

I still should have the SF lazy eval patch somewhere, I can post it if someone is interested in experimenting with this.
Komodo pushes the selectivity pretty hard, and we DO get a decent nodes-per-second increase with lazy evaluation, but the problem is that we get a big increase in nodes. That happens because we assume we will not beat the scout (zero-window) bound if the guesstimate is too low, so we miss some of the beta cutoffs; you cannot have your cake and eat it too. Komodo's positional component can vary enormously, so we take a lot of damage positionally when we use lazy margins.

If I do it the safer way, we do not get much of a speedup at all but it doesn't hurt the strength much either. The safer way is to take the cutoff if the pessimistic estimate is still above beta, otherwise call the evaluation and try to get the cutoff.

I don't think it is going to work for us, because we push futility, margins, and other tricks too hard, and our pawn structure and king safety terms are aggressive. If we had done lazy evaluation in the Doch days, it probably would have worked well, and our program would have evolved differently.

I don't see anything here that cannot be explained for us. It's pretty easy to see what is happening: lower margins, more nodes searched because of more missed cutoffs, and a compromised evaluation function that takes away some Elo. Not a good tradeoff.

I think it's partly because Komodo is a lean program. After we saw how slow Komodo's evaluation was, we went through a phase where we tried hard to avoid moving to the next stack frame (where evaluation would be called as one of the first things) by doing lazy-like things.
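Don's "safer way" can be pictured roughly as follows (a sketch with invented names and margin, not Komodo's code): the early cutoff is taken only when even the pessimistic estimate clears beta, and otherwise the full evaluation is computed and given its chance at the cutoff.

Code: Select all

/* Sketch of the "safer" lazy-eval variant described above.  Names and
 * the margin are illustrative only. */
#define MAX_POSITIONAL 300   /* assumed worst-case swing of the skipped terms */

int EvaluateLazy(const Position *pos, int beta) {
  int estimate = MaterialBalance(pos) + PieceSquareSum(pos);

  /* pessimistic estimate: even if every skipped term goes against us,
   * we are still above beta, so the cutoff is safe */
  if (estimate - MAX_POSITIONAL >= beta)
    return estimate;

  return FullEvaluate(pos);   /* otherwise compute everything */
}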
For me, king safety and passed pawns are the biggest terms that hurt LE. I now always do a full passed pawn eval, which hardly affects node rates, since the basic pawn eval is hashed and I save a byte with a bit set for each passed pawn for each side, so I can quickly calculate the things involving king position and such. I do my LE after passed pawns and trapped pieces, but before king safety and mobility. Also, I always do an eval at each internal node and save it, so the estimate is based on that value plus piece-square tables, pawn eval, passed pawns, trapped pieces, and some drawing rules. If you have not tried always doing a passed pawn eval, it might be worth a try.
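The bookkeeping described here might look roughly like the sketch below (names and layout are illustrative, not mjlef's actual code): the pawn-hash entry keeps, besides the hashed pawn score, one byte per side with a bit set for each file holding a passed pawn, so the passed-pawn terms that depend on king position stay cheap to recompute before the lazy exit.

Code: Select all

#include <stdint.h>

/* Illustrative pawn-hash entry that also remembers the passed pawns,
 * so passed-pawn terms (which depend on king position and cannot be
 * hashed with the pawns alone) stay cheap to recompute at every node. */
typedef struct {
  uint64_t key;          /* pawn-structure hash key */
  int16_t  pawn_score;   /* hashed basic pawn-structure score */
  uint8_t  passed[2];    /* one bit per file with a passed pawn, per side */
} PawnHashEntry;

With that available, the lazy exit can sit after the passed-pawn and trapped-piece terms but before king safety and mobility, as described above.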