Lazy eval

bob · Post by **bob** » Wed Nov 30, 2011 6:56 pm

tpetzke wrote:
All searches were for 30 seconds, so more nodes is better
Is it really possible to state that? Shouldn't the quality of the result also influence the statement about the final outcome?

Even if both searches return the same move after 30 sec, maybe the slower search also required fewer nodes to find this move.

Thomas...

I compared depths, nodes and times. I can post the complete logs if you want to see. LE should not affect the QUALITY one iota, if it is done correctly. It only affects the speed. If you use too large a margin, it is ineffective as nothing will take the lazy exit. If you use too small a margin, you get errors by exiting when you should not. My two thresholds were tuned carefully, meaning that there is no significant "loss of accuracy" while returning a significant "gain of speed."
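The margin trade-off described here can be sketched roughly as follows. This is only an illustration, not Crafty's code: `lazy_evaluate`, `full_eval`, and all the numbers are hypothetical, and scores are assumed to be centipawns from the side to move's point of view.

```python
def lazy_evaluate(material, alpha, beta, margin, full_eval):
    """Return (score, took_lazy_exit).

    material  -- cheap material-only score in centipawns
    margin    -- assumed bound on how far the positional terms can
                 move the final score away from `material`
    full_eval -- callable returning the expensive full evaluation
    """
    # Even the most optimistic positional bonus cannot lift the score
    # above alpha, so the full eval would fail low anyway.
    if material + margin <= alpha:
        return material, True
    # Symmetric case: even the worst positional penalty stays >= beta.
    if material - margin >= beta:
        return material, True
    return full_eval(), False


# Window (alpha, beta) = (0, 50); hypothetical positions and scores.
print(lazy_evaluate(-300, 0, 50, 500, lambda: -180))  # margin too large: never exits
print(lazy_evaluate(-300, 0, 50, 200, lambda: -180))  # safe lazy exit
print(lazy_evaluate(-50, 0, 50, 40, lambda: 30))      # margin too small: wrong exit
```

In the last call the full score (30) lies inside the window, but the lazy exit reports -50, which is the kind of error a too-small margin produces.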

Engin · Post by **Engin** » Thu Dec 01, 2011 1:41 am

I have experimented many times with and without lazy eval in the past, and I went back and forth on whether to use LE or not.

With LE you of course get a speedup, but on the other hand, if you use very aggressive pruning and LMR in the search, it doesn't help you reduce the tree size in the end. So my view is that you have to search more nodes with LE, whereas without it you can prune more moves in the search because your eval is more accurate. The other danger arises if you use a complex king safety eval and return a margin for pruning decisions: that margin cannot be returned, because you leave the eval before king safety is done.

So I have now decided not to use lazy eval!

lkaufman · Post by **lkaufman** » Thu Dec 01, 2011 2:55 am

bob wrote: I compared depths, nodes and times. I can post the complete logs if you want to see. LE should not affect the QUALITY one iota, if it is done correctly. It only affects the speed. If you use too large a margin, it is ineffective as nothing will take the lazy exit. If you use too small a margin, you get errors by exiting when you should not. My two thresholds were tuned carefully, meaning that there is no significant "loss of accuracy" while returning a significant "gain of speed."

If we use margins big enough to avoid any quality loss, we get no speedup at all. Don't you use king safety scores that can run up to several pawns at times? We do, and so lazy eval will always cost us something with any reasonable margin. But with margins of around half a pawn, we only get about a 2% speedup (roughly 4% more nps but a 2% increase in nodes) and our quality clearly suffers too much for that. Can you check what speedup you get, and with what margins?
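The "4% more nps but 2% more nodes" arithmetic can be checked with a one-line model. It assumes time to reach a given depth is simply nodes divided by nps; `net_speedup` is a name made up for this sketch.

```python
def net_speedup(nps_gain, node_increase):
    """Time-to-depth speedup factor, given relative nps and node changes."""
    return (1.0 + nps_gain) / (1.0 + node_increase)


# 4% more nps, 2% more nodes
print(f"{net_speedup(0.04, 0.02):.4f}")  # 1.0196, i.e. roughly a 2% speedup
```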

Milos · Post by **Milos** » Thu Dec 01, 2011 3:28 am

lkaufman wrote: But with margins of around half a pawn, we only get about a 2% speedup (roughly 4% more nps but a 2% increase in nodes) and our quality clearly suffers too much for that. Can you check what speedup you get, and with what margins?

This is clearly impossible. You must have an error in implementation.
The usual speed saving is 10% for 2-3 pawns margin. For under 100cp margin you should get well over 15%.

lkaufman · Post by **lkaufman** » Thu Dec 01, 2011 3:44 am

Milos wrote:
lkaufman wrote: But with margins of around half a pawn, we only get about a 2% speedup (roughly 4% more nps but a 2% increase in nodes) and our quality clearly suffers too much for that. Can you check what speedup you get, and with what margins?
This is clearly impossible. You must have an error in implementation.
The usual speed saving is 10% for 2-3 pawns margin. For under 100cp margin you should get well over 15%.

Well, Critter gets about 10% for a 60 cp margin, so not all programs get as much as you say. But our figures are indeed suspiciously low. Still, it's possible that other things we do are so similar to lazy eval that this could be the explanation. We'll have to investigate further.

Milos · Post by **Milos** » Thu Dec 01, 2011 4:01 am

lkaufman wrote: Well, Critter gets about 10% for a 60 cp margin, so not all programs get as much as you say. But our figures are indeed suspiciously low. Still, it's possible that other things we do are so similar to lazy eval that this could be the explanation. We'll have to investigate further.

If you measure the speedup as nps/total_num_nodes then Critter numbers are spot on. The numbers I gave in the previous comment are just pure nps speedup.
The best way to check frequency of lazy eval is to track the average time your engine spends doing the evaluation. Just use some profiler for this.
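A profiler is the tool suggested here. As a cruder complement, an engine can also count lazy exits and time its own eval calls directly. The sketch below assumes an `evaluate` callable that returns `(score, took_lazy_exit)`; `EvalStats` and `timed_eval` are hypothetical names, not part of any engine mentioned in this thread.

```python
import time


class EvalStats:
    """Accumulates eval call counts, lazy exits, and wall-clock time."""

    def __init__(self):
        self.calls = 0
        self.lazy_exits = 0
        self.seconds = 0.0

    def record(self, took_lazy_exit, elapsed):
        self.calls += 1
        self.lazy_exits += int(took_lazy_exit)
        self.seconds += elapsed

    def lazy_rate(self):
        # Fraction of eval calls that took a lazy exit.
        return self.lazy_exits / self.calls if self.calls else 0.0

    def mean_seconds(self):
        # Average wall-clock time per eval call.
        return self.seconds / self.calls if self.calls else 0.0


def timed_eval(stats, evaluate, *args):
    """Call evaluate(*args) -> (score, took_lazy_exit) and record it."""
    start = time.perf_counter()
    score, took_lazy_exit = evaluate(*args)
    stats.record(took_lazy_exit, time.perf_counter() - start)
    return score
```

Comparing `lazy_rate()` and `mean_seconds()` with lazy eval on and off shows whether the exit is actually being taken often enough to matter.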

bob · Post by **bob** » Thu Dec 01, 2011 4:04 am

lkaufman wrote:
bob wrote: I compared depths, nodes and times. I can post the complete logs if you want to see. LE should not affect the QUALITY one iota, if it is done correctly. It only affects the speed. If you use too large a margin, it is ineffective as nothing will take the lazy exit. If you use too small a margin, you get errors by exiting when you should not. My two thresholds were tuned carefully, meaning that there is no significant "loss of accuracy" while returning a significant "gain of speed."
If we use margins big enough to avoid any quality loss, we get no speedup at all. Don't you use king safety scores that can run up to several pawns at times? We do, and so lazy eval will always cost us something with any reasonable margin. But with margins of around half a pawn, we only get about a 2% speedup (roughly 4% more nps but a 2% increase in nodes) and our quality clearly suffers too much for that. Can you check what speedup you get, and with what margins?

I posted the speedup numbers above (first post with - test results in subject). Roughly 33% faster in middlegame, less in opening, much less in a king and pawn only ending...

Our cutoff bound is dynamic, but is typically between a minor piece and a rook, 300 - 500, for the first cutoff which is right at the top of evaluate. If that doesn't work, we hit the pawn evaluation (and passed pawn evaluation) and then try another lazy eval cutoff. The second cutoff uses a dynamic value, but it is roughly 1.5 pawns...
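The two cutoff points can be sketched like this. It is a simplified illustration and not Crafty's actual code: the real bounds are dynamic, while the fixed 400 and 150 below are placeholders picked from the ranges given above, and `pawn_eval` / `rest_eval` are hypothetical callables standing in for the pawn terms and everything evaluated after them.

```python
def evaluate_two_stage(material, alpha, beta, pawn_eval, rest_eval,
                       margin1=400, margin2=150):
    """Return (score, stage), where stage is 1, 2, or "full"."""

    def outside(score, margin):
        # True if no swing of +/- margin can bring score inside (alpha, beta).
        return score + margin <= alpha or score - margin >= beta

    # First cutoff: right at the top, material only, wide margin.
    score = material
    if outside(score, margin1):
        return score, 1

    # Pawn and passed pawn terms, then the second, tighter cutoff.
    score += pawn_eval()
    if outside(score, margin2):
        return score, 2

    # No lazy exit: add the remaining (expensive) terms.
    score += rest_eval()
    return score, "full"


# Window (0, 50), hypothetical term values.
print(evaluate_two_stage(-600, 0, 50, lambda: 20, lambda: 30))  # (-600, 1)
print(evaluate_two_stage(-200, 0, 50, lambda: 20, lambda: 30))  # (-180, 2)
print(evaluate_two_stage(-100, 0, 50, lambda: 20, lambda: 30))  # (-50, 'full')
```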

bob · Post by **bob** » Thu Dec 01, 2011 4:07 am

lkaufman wrote:
Milos wrote:
lkaufman wrote: But with margins of around half a pawn, we only get about a 2% speedup (roughly 4% more nps but a 2% increase in nodes) and our quality clearly suffers too much for that. Can you check what speedup you get, and with what margins?
This is clearly impossible. You must have an error in implementation.
The usual speed saving is 10% for 2-3 pawns margin. For under 100cp margin you should get well over 15%.
Well, Critter gets about 10% for a 60 cp margin, so not all programs get as much as you say. But our figures are indeed suspiciously low. Still, it's possible that other things we do are so similar to lazy eval that this could be the explanation. We'll have to investigate further.

When you profile, what percent of the time do you spend in evaluate? For Crafty, it is about 50%. If I did nothing but material-only, the nps would approximately double. In the middlegame, LE gives me a 33% speed increase, if you look at the numbers I posted. Or, another way, 133% faster vs the potential 200% faster. That would suggest something like maybe 2/3 of the evals do not take a lazy exit, although this is a gross guess since I have two different LE exit points...
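One way to make this rough guess concrete is a simple time-fraction model. It assumes that a lazy exit costs nothing and that the rest of the search is unaffected, neither of which is exactly true, so the output is only a ballpark figure. The function names are made up for this sketch.

```python
def time_model_speedup(eval_fraction, work_avoided):
    """Speedup when `work_avoided` (0..1) of the eval time is skipped."""
    remaining = (1.0 - eval_fraction) + eval_fraction * (1.0 - work_avoided)
    return 1.0 / remaining


def eval_work_avoided(eval_fraction, speedup):
    """Invert the model: share of eval time skipped for a given speedup."""
    return (1.0 - 1.0 / speedup) / eval_fraction


# Eval is about 50% of the run time; skipping all of it doubles the speed.
print(time_model_speedup(0.5, 1.0))               # 2.0
# A 33% speedup then corresponds to skipping about half of the eval time.
print(f"{eval_work_avoided(0.5, 1.33):.2f}")      # 0.50
```

Under this model roughly half of the eval work is avoided, while reading 33% linearly against the potential 100% gives about one third of evals taking a lazy exit. Both are coarse estimates, and with two exit points of different cost neither maps directly onto an exit count.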

lkaufman · Post by **lkaufman** » Thu Dec 01, 2011 4:34 am

Milos wrote:
lkaufman wrote: Well, Critter gets about 10% for a 60 cp margin, so not all programs get as much as you say. But our figures are indeed suspiciously low. Still, it's possible that other things we do are so similar to lazy eval that this could be the explanation. We'll have to investigate further.
If you measure the speedup as nps/total_num_nodes then Critter numbers are spot on. The numbers I gave in the previous comment are just pure nps speedup.
The best way to check frequency of lazy eval is to track the average time your engine spends doing the evaluation. Just use some profiler for this.

Our program spends more time on eval than most; I think it's over half the time. So lazy eval should be a big win. But it doesn't seem to kick in nearly as often as it must in other programs. The puzzle is whether that's due to other things we do or to a wrong implementation of the idea.

I note that SF rejected lazy eval. So at least one other top program failed to demonstrate a benefit from it. Why didn't it help Stockfish? I'm sure they would not have rejected a 10% nearly-free speedup.

mcostalba · Post by **mcostalba** » Thu Dec 01, 2011 7:38 am

lkaufman wrote: I note that SF rejected lazy eval. So at least one other top program failed to demonstrate a benefit from it. Why didn't it help Stockfish? I'm sure they would not have rejected a 10% nearly-free speedup.

For a very selective search you may want accurate eval info to push pruning to the limits. In the case of Crafty, where the pruning is far less aggressive (I guess, given the 300 Elo gap or so), my guess is that the speedup matters more than an accurate eval. Another point is the spikes in king evaluation, present in SF and, as you mentioned, also in Komodo: these are filtered out by lazy eval, but they play an important role in pruning decisions.

I should still have the SF lazy eval patch somewhere; I can post it if someone is interested in experimenting with it.
