Lazy eval

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: Lazy eval - test results

Post by mjlef »

Houdini wrote:
Milos wrote:
Houdini wrote:You may or may not be right, but that wasn't my point.
It's just funny to see how much of Houdini ideas have already made it into other engines...
This particular idea is something that is below 5Elo and practically unmeasurable.
I guess some ppl copy ideas mechanically following the principle if it doesn't hurt it helps. Others however, require to see measurable benefit.
Different ppl, different approaches ;).

P.S. I wouldn't be too worried. Only 2 ppl capable and willing to RE for ideas (and with their own strong engines) have done it so far ;). Others just go around and ask questions :D.
P.P.S You also made it too easy since everyone has an exact source of Houdini beta ;).
I'm not worried. Like I said before, imitation is the sincerest form of flattery.

I just notice that Houdini's specific implementation of an original idea is being presented by an acclaimed author, without giving any credit.
Unless this has been in Critter from before version 1.0 (Richard RE'd Houdini 1.5 between Critter 0.90 and 1.0), and we've just ended up by a great coincidence using exactly the same code?!

Robert
Sorry, but using the magnitude of the positional part of the eval to disallow LE is a very old idea. I had it in NOW back in 1990 and I see versions of Glaurung used it (in Tord's case, specifically checking if things like king safety eval components were high in the previous ply)
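For readers who have not seen the idea in code, here is a minimal sketch of "using the magnitude of the positional part of the eval to disallow LE" along the lines Mark and Tord describe. Every name, margin and the overall structure below are invented for illustration; this is not the code of NOW, Glaurung, Houdini or Critter.

Code: Select all

/* Minimal sketch: a lazy exit that is disallowed when the positional
   component seen at the previous ply was large.  All names and
   margins are invented for illustration. */
#include <stdlib.h>

typedef struct Position Position;
extern int material_balance(const Position *pos);   /* cheap terms     */
extern int positional_score(const Position *pos);   /* expensive terms */

#define LAZY_MARGIN      150   /* cp outside the window before the lazy exit  */
#define POSITIONAL_GUARD 200   /* positional score that forbids the lazy exit */

int evaluate(const Position *pos, int alpha, int beta, int prev_positional)
{
    int score = material_balance(pos);

    /* Only take the lazy exit when the cheap estimate is far outside
       the window AND the positional part was small at the previous
       ply, so a full eval is unlikely to pull the score back inside. */
    if (abs(prev_positional) < POSITIONAL_GUARD) {
        if (score - LAZY_MARGIN >= beta || score + LAZY_MARGIN <= alpha)
            return score;
    }
    return score + positional_score(pos);
}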
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Lazy eval - test results

Post by bob »

mjlef wrote:
Houdini wrote:
Milos wrote:
Houdini wrote:You may or may not be right, but that wasn't my point.
It's just funny to see how much of Houdini ideas have already made it into other engines...
This particular idea is something that is below 5Elo and practically unmeasurable.
I guess some ppl copy ideas mechanically following the principle if it doesn't hurt it helps. Others however, require to see measurable benefit.
Different ppl, different approaches ;).

P.S. I wouldn't be too worried. Only 2 ppl capable and willing to RE for ideas (and with their own strong engines) have done it so far ;). Others just go around and ask questions :D.
P.P.S You also made it too easy since everyone has an exact source of Houdini beta ;).
I'm not worried. Like I said before, imitation is the sincerest form of flattery.

I just notice that Houdini's specific implementation of an original idea is being presented by an acclaimed author, without giving any credit.
Unless this has been in Critter from before version 1.0 (Richard RE'd Houdini 1.5 between Critter 0.90 and 1.0), and we've just ended up by a great coincidence using exactly the same code?!

Robert
Sorry, but using the magnitude of the positional part of the eval to disallow LE is a very old idea. I had it in NOW back in 1990 and I see versions of Glaurung used it (in Tord's case, specifically checking if things like king safety eval components were high in the previous ply)
You can sort of recognize who is doing what, because if someone has not read a lot and participated in discussions over the past N years, EVERY idea sounds new and original. For those of us that have listened, and read, and tested, and discussed, we realize that most ideas are not new. In the current case, 150 might be somewhat "original" but the concept is anything but...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Lazy eval - test results

Post by bob »

mjlef wrote:
bob wrote:
bob wrote:
lkaufman wrote:
bob wrote: Question is, why doesn't it work? The idea is speed. If you do it well, you should be able to see a speed gain that more than offsets the minor accuracy loss. I've been using it in every program I've written since the 70's. The "heavier" the eval, the greater the potential gain... One can even do a "staged" lazy eval with multiple exits, rather than just one... Each exit placed before the next "heavier" piece of the evaluation...
I take it that Crafty does get a decent speedup for lazy eval. Do you get about the same as the 10% figure I quote for Critter, or perhaps more? We can't get more than 2% with any reasonable margin, clearly not enough to use it. The question is whether this is due to redundancy with other pruning techniques, or to some implementation detail? Of course you won't know as you don't work on Komodo, but perhaps you can suggest a way to tell. Sure we do a lot of heavy pruning near the leaf nodes, but so does Critter and it gets a very nice speedup from lazy eval.
I'll try to test later tonight and report the speedup... I'll have to go back to look at exactly what we are doing... we have changed this stuff several times, from one lazy exit, to multiple, and such...
I have two lazy eval exits in my evaluate.c code. I disabled both, and ran three test positions for 30 seconds each: the starting position, a middlegame position, and Fine 70, which is probably a worst-case comparison since there is nothing on the board but kings and pawns...

The first output for each pair is the no-lazy-eval code, the second with normal lazy eval:

log.001: time=30.08 mat=0 n=46031404 fh=91% nps=1.5M
log.002: time=30.06 mat=0 n=53408994 fh=91% nps=1.8M
improvement: 7.4M additional nodes searched, 16% faster.

log.001: time=30.05 mat=0 n=68478030 fh=95% nps=2.3M
log.002: time=30.11 mat=0 n=91888687 fh=95% nps=3.1M
improvement: 23.4M additional nodes searched, 34% faster

log.001: time=30.10 mat=1 n=129214534 fh=93% nps=4.3M
log.002: time=30.01 mat=1 n=121312909 fh=87% nps=4.0M
difference: 8M fewer nodes searched, about 6% slower.

All searches were for 30 seconds, so more nodes is better

In the middlegame, it is a pretty big savings.
I am confused. If "time=30.10" is the time taken to search what I assume is a fixed depth, then there is almost no time saved in completing a search. It seems your LE implementation on these 3 positions does search faster (well, in 2 out of 3), but it also takes more nodes for the same depth. Correct me if I am missing something.

I would love to see the results of a LE vs no-LE match for a few thousand games and see who wins.

I do get a very small elo improvement in my program with LE on, but it is very small.

My current testing might show that, once it is finished. I am tweaking the margins again since the eval has changed somewhat, and I always try to "overtweak" in both directions to make sure I get a good inflection point. A really big lazy eval margin effectively turns it off and I have a "2000" test value which should give a good clue here...
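The "staged" lazy eval with multiple exits quoted above (one exit placed before each heavier block of eval terms) might look roughly like the following. The two margins and the helper names are placeholders, not Crafty's actual code; disabling both exits, as in the test above, amounts to deleting the two if statements.

Code: Select all

/* Sketch of a staged lazy eval with two exits, each placed before a
   progressively heavier block of terms.  Margins and helper names
   are placeholders. */
typedef struct Position Position;
extern int material_and_pst(const Position *pos);
extern int pawns_and_mobility(const Position *pos);
extern int king_safety_and_rest(const Position *pos);

#define LAZY1 300   /* margin before the medium-cost terms     */
#define LAZY2 150   /* margin before the most expensive terms  */

int evaluate(const Position *pos, int alpha, int beta)
{
    int score = material_and_pst(pos);                 /* cheapest terms   */

    if (score - LAZY1 >= beta || score + LAZY1 <= alpha)
        return score;                                  /* first lazy exit  */

    score += pawns_and_mobility(pos);                  /* heavier terms    */

    if (score - LAZY2 >= beta || score + LAZY2 <= alpha)
        return score;                                  /* second lazy exit */

    return score + king_safety_and_rest(pos);          /* heaviest terms   */
}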
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Lazy eval

Post by lkaufman »

Rebel wrote:
lkaufman wrote: Oh, in your examples above where you wrote "nodes", did you mean "nodes per second"? Then it makes sense. I thought you were doing a fixed depth search and were claiming that the number of nodes visited was cut in half.
Solly, NPS was a mistake :wink: Nodes of course. But here is an example, depth=15

[d]3r2k1/p2r1ppp/1pQ3n1/3P2q1/P3B3/6P1/5P1P/2R1R1K1 w - -

Code: Select all

LE=0.50   0:25  61M nodes
LE=1.00   0:30  70M
LE=1.50   0:33  75M
LE=2.00   0:29  67M   -> surprise here
LE=OFF    1:49 196M
One other question: Do you use the so-called "static null move", where on the last few plies you just take a cutoff if you are above beta by some margin? I suspect that this is somewhat incompatible with lazy eval.
I am using classic Futility pruning. And of course I am using the same LE margin of 0.50 here.
Okay, so in your program lazy eval drastically reduces nodes searched. In everyone else's program, it improves nodes per second, but never reduces nodes searched. You must be counting nodes differently than everyone else. Am I correct in guessing that you only count nodes fully evaluated? Otherwise why would your nodes searched be reduced so much?
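For reference, the "static null move" mentioned in the quoted exchange above is usually implemented something like the sketch below; the depth limit, the 90-per-ply margin and the helper names are illustrative only. The suspected clash with lazy eval is that this cutoff wants a reasonably accurate static score, so here the eval is called with a full-width window.

Code: Select all

/* Illustrative "static null move" (reverse futility) pruning: near
   the leaves, fail high at once when the static eval is above beta
   by a depth-dependent margin.  Depth limit and margin are made up. */
typedef struct Position Position;
extern int in_check(const Position *pos);
extern int evaluate(const Position *pos, int alpha, int beta);
extern int search_moves(Position *pos, int alpha, int beta, int depth);

int search(Position *pos, int alpha, int beta, int depth)
{
    if (depth <= 3 && !in_check(pos)) {
        int eval = evaluate(pos, -32000, 32000);   /* want a full eval here   */
        if (eval - 90 * depth >= beta)
            return eval;                           /* static null move cutoff */
    }
    return search_moves(pos, alpha, beta, depth);  /* normal move loop        */
}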
mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: Lazy eval - test results

Post by mjlef »

bob wrote:
mjlef wrote:
Houdini wrote:
Milos wrote:
Houdini wrote:You may or may not be right, but that wasn't my point.
It's just funny to see how much of Houdini ideas have already made it into other engines...
This particular idea is something that is below 5Elo and practically unmeasurable.
I guess some ppl copy ideas mechanically following the principle if it doesn't hurt it helps. Others however, require to see measurable benefit.
Different ppl, different approaches ;).

P.S. I wouldn't be too worried. Only 2 ppl capable and willing to RE for ideas (and with their own strong engines) have done it so far ;). Others just go around and ask questions :D.
P.P.S You also made it too easy since everyone has an exact source of Houdini beta ;).
I'm not worried. Like I said before, imitation is the sincerest form of flattery.

I just notice that Houdini's specific implementation of an original idea is being presented by an acclaimed author, without giving any credit.
Unless this has been in Critter from before version 1.0 (Richard RE'd Houdini 1.5 between Critter 0.90 and 1.0), and we've just ended up by a great coincidence using exactly the same code?!

Robert
Sorry, but using the magnitude of the positional part of the eval to disallow LE is a very old idea. I had it in NOW back in 1990 and I see versions of Glaurung used it (in Tord's case, specifically checking if things like king safety eval components were high in the previous ply)
You can sort of recognize who is doing what, because if someone has not read a lot and participated in discussions over the past N years, EVERY idea sounds new and original. For those of us that have listened, and read, and tested, and discussed, we realize that most ideas are not new. In the current case, 150 might be somewhat "original" but the concept is anything but...
You are right. My margin was not exactly 150, so that is "original".

People discover a lot of the same ideas, especially when they bother to track when something fails in the program. I try not to claim too many ideas as original, since I only have access to the few open source programs I have examined.

I put this in after trying something simple: marking a position as futility-cut but doing a full eval anyway, printing out the positions where it failed, and noting how much the final score changed from the estimate. You can generate statistics showing how risky it is to take a cut at various margins. It changed based on depth (it is riskier to do these things near the root, so I used a higher margin there and took more risk closer to and in the qsearch). Along the same lines I experimented with a self-correcting model. Near the root and in PV nodes I would always do a full eval, so I was able to calculate the average and worst-case margins found so far in the search and then dynamically adjust the margin. I would only adjust between iterations. It seemed OK, although I suppose the search was probably unstable, so I took it out.
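A rough sketch of the self-correcting idea described above: record, at nodes where the full eval is computed anyway, how far it lands from the cheap estimate, and reset the margin only between iterations. Everything below (the names, the 20% headroom, the clamps) is invented for illustration and is not the original code.

Code: Select all

/* Hypothetical self-adjusting lazy margin: track the worst deviation
   between the cheap estimate and the full eval at nodes where both
   are computed (e.g. PV nodes); reset the margin between iterations.
   All numbers are illustrative. */
#include <stdlib.h>

static int lazy_margin = 150;   /* current margin in centipawns        */
static int worst_error = 0;     /* largest |full - cheap| seen so far  */

void record_eval_error(int cheap_score, int full_score)
{
    int err = abs(full_score - cheap_score);
    if (err > worst_error)
        worst_error = err;
}

void update_lazy_margin(void)   /* call between iterations only */
{
    int m = worst_error + worst_error / 5;   /* 20% headroom           */
    if (m < 100) m = 100;                    /* clamp to a sane range  */
    if (m > 500) m = 500;
    lazy_margin = m;
}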

I still need some better flags for moves that might affect king safety a lot, but the simple things I have tried so far do not seem to matter very much.
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Lazy eval - test results

Post by Don »

mjlef wrote:
Houdini wrote:
Milos wrote:
Houdini wrote:You may or may not be right, but that wasn't my point.
It's just funny to see how much of Houdini ideas have already made it into other engines...
This particular idea is something that is below 5Elo and practically unmeasurable.
I guess some ppl copy ideas mechanically following the principle if it doesn't hurt it helps. Others however, require to see measurable benefit.
Different ppl, different approaches ;).

P.S. I wouldn't be too worried. Only 2 ppl capable and willing to RE for ideas (and with their own strong engines) have done it so far ;). Others just go around and ask questions :D.
P.P.S You also made it too easy since everyone has an exact source of Houdini beta ;).
I'm not worried. Like I said before, imitation is the sincerest form of flattery.

I just notice that Houdini's specific implementation of an original idea is being presented by an acclaimed author, without giving any credit.
Unless this has been in Critter from before version 1.0 (Richard RE'd Houdini 1.5 between Critter 0.90 and 1.0), and we've just ended up by a great coincidence using exactly the same code?!

Robert
Sorry, but using the magnitude of the positional part of the eval to disallow LE is a very old idea. I had it in NOW back in 1990 and I see versions of Glaurung used it (in Tord's case, specifically checking if things like king safety eval components were high in the previous ply)
This is not exactly a surprise to anyone that Houdart would try to claim credit for someone else's idea. Maybe it's the 150 margin that he thinks is particularly brilliant.
User avatar
Rebel
Posts: 6995
Joined: Thu Aug 18, 2011 12:04 pm

Re: Lazy eval

Post by Rebel »

lkaufman wrote:
Rebel wrote:
lkaufman wrote: Oh, in your examples above where you wrote "nodes", did you mean "nodes per second"? Then it makes sense. I thought you were doing a fixed depth search and were claiming that the number of nodes visited was cut in half.
Solly, NPS was a mistake :wink: Nodes of course. But here is an example, depth=15

[d]3r2k1/p2r1ppp/1pQ3n1/3P2q1/P3B3/6P1/5P1P/2R1R1K1 w - -

Code: Select all

LE=0.50   0:25  61M nodes
LE=1.00   0:30  70M
LE=1.50   0:33  75M
LE=2.00   0:29  67M   -> surprise here
LE=OFF    1:49 196M
One other question: Do you use the so-called "static null move", where on the last few plies you just take a cutoff if you are above beta by some margin? I suspect that this is somewhat incompatible with lazy eval.
I am using classic Futility pruning. And of course I am using the same LE margin of 0.50 here.
Okay, so in your program lazy eval drastically reduces nodes searched. In everyone else's program, it improves nodes per second, but never reduces nodes searched. You must be counting nodes differently than everyone else. Am I correct in guessing that you only count nodes fully evaluated? Otherwise why would your nodes searched be reduced so much?
Look at the timings I included; the time/nodes ratio is in harmony. And yes, an LE is counted as a node; NPS is maintained in the search, not in the eval.
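The counting convention described here, as I read it, is that the node counter is incremented in the search, so a position whose eval takes the lazy exit still counts as a node. In sketch form (names are placeholders):

Code: Select all

/* Sketch of the counting convention described above: the node counter
   is bumped in the search, so a position that takes the lazy exit in
   eval is still counted.  Names are placeholders. */
typedef struct Position Position;
extern int evaluate(const Position *pos, int alpha, int beta);  /* may exit lazily */

static long nodes;

int qsearch(Position *pos, int alpha, int beta)
{
    nodes++;                                   /* counted here, in the search */
    int stand_pat = evaluate(pos, alpha, beta);
    /* ... captures and further recursion would follow ... */
    return stand_pat;
}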
kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: Lazy eval - test results

Post by kranium »

rvida wrote: Critter uses 150 cp lazy margin in qsearch and 300 cp margin at interior nodes.
Houdini wrote: I just notice that Houdini's specific implementation of an original idea is being presented by an acclaimed author, without giving any credit.
Unless this has been in Critter from before version 1.0 (Richard RE'd Houdini 1.5 between Critter 0.90 and 1.0), and we've just ended up by a great coincidence using exactly the same code?!
Robert
mjlef wrote: Sorry, but using the magnitude of the positional part of the eval to disallow LE is a very old idea. I had it in NOW back in 1990 and I see versions of Glaurung used it (in Tord's case, specifically checking if things like king safety eval components were high in the previous ply)
Don wrote: This is not exactly a surprise to anyone that Houdart would try to claim credit for someone else's idea. Maybe it's the 150 margin that he thinks is particularly brilliant.
150 is certainly not an 'original' or 'brilliant' idea,
Ippolit, Robbolito, Igorrit, and IvanHoe all use 150, exactly in the manner Richard described above for Critter...

from common.c:

Code: Select all

#define LazyValue 150
#define LazyValue2 300
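As Richard described it for Critter (150 in qsearch, 300 at interior nodes), the two constants are typically applied roughly as below. This is a paraphrase for illustration, not the actual Ippolit/IvanHoe or Critter source; the helper names are placeholders.

Code: Select all

/* Paraphrase of the two-margin scheme: the tight margin in qsearch,
   the wide one at interior nodes.  Helper names are placeholders. */
typedef struct Position Position;
extern int lazy_estimate(const Position *pos);    /* e.g. material + PST */
extern int remaining_terms(const Position *pos);  /* the expensive rest  */

#define LazyValue  150   /* qsearch margin       */
#define LazyValue2 300   /* interior-node margin */

int evaluate(const Position *pos, int alpha, int beta, int in_qsearch)
{
    int margin = in_qsearch ? LazyValue : LazyValue2;
    int score  = lazy_estimate(pos);

    if (score - margin >= beta || score + margin <= alpha)
        return score;                     /* lazy exit       */

    return score + remaining_terms(pos);  /* full evaluation */
}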
Sune_F

Re: Lazy eval - test results

Post by Sune_F »

Don wrote:
mcostalba wrote:
lkaufman wrote: I note that SF rejected lazy eval. So at least one other top program failed to demonstrate a benefit from it. Why didn't it help Stockfish? I'm sure they would not have rejected a 10% nearly-free speedup.
For a very selective search you may want accurate eval info to push pruning to the limits. In the case of Crafty, where the pruning is far lower (I guess, given the 300 Elo gap or so), my guess is that the speedup plays a bigger role than eval accuracy. Another point is the spikes in king evaluation, present in SF and, as you mentioned, also in Komodo; these are filtered out by lazy eval but play an important role in pruning decisions.

I still should have the SF lazy eval patch somewhere, I can post it if someone is interested in experimenting with this.
Komodo pushes the selectivity pretty hard and we DO get a decent nodes-per-second increase with lazy evaluation, but the problem is that we get a big increase in nodes. It is this way because we assume we will not make the scout bound if the guesstimate is too low, so we miss some of the beta cutoffs; you cannot have your cake and eat it too. Komodo's positional component can vary enormously, so we take a lot of positional damage when we use lazy margins.

If I do it the safer way, we do not get much of a speedup at all but it doesn't hurt the strength much either. The safer way is to take the cutoff if the pessimistic estimate is still above beta, otherwise call the evaluation and try to get the cutoff.

I don't think it is going to work for us because we push the futility and margins and tricks too hard and our pawn structure and king safety are aggressive. If we had done lazy evaluation in the Doch days, it probably would have worked well and our program would have evolved differently.
The way I solved this was to pass alpha and beta to the evaluation. Occasionally I need a very accurate eval for pruning or whatever; in such cases I just call eval(-mate, mate), which turns off lazy eval (or rather prevents it).
You can have your cake and eat it too, just half and half.
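The arrangement described here (pass the window into the evaluation, and pass a full-width window whenever an accurate score is needed) might look like the outline below. MATE, the 300 margin and the helper names are placeholders, not the actual code; the lazy exit is written in the "safer" form mentioned above, standing on the cheap score only when even the bound is already outside the window.

Code: Select all

/* Sketch of passing (alpha, beta) into eval so the caller controls
   whether a lazy exit is allowed.  MATE, the 300 margin and the
   helper names are placeholders. */
typedef struct Position Position;
extern int cheap_eval(const Position *pos);
extern int expensive_eval(const Position *pos);

#define MATE 32000

int eval(const Position *pos, int alpha, int beta)
{
    int score = cheap_eval(pos);

    /* "Safer" lazy exit: stand on the cheap score only when even the
       pessimistic/optimistic bound is already outside the window. */
    if (score - 300 >= beta || score + 300 <= alpha)
        return score;

    return score + expensive_eval(pos);
}

int accurate_eval(const Position *pos)
{
    /* Full-width window: neither lazy condition can trigger, so the
       full evaluation is always computed. */
    return eval(pos, -MATE, MATE);
}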
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Lazy eval

Post by lkaufman »

Rebel wrote:
lkaufman wrote:
Rebel wrote:
lkaufman wrote: Oh, in your examples above where you wrote "nodes", did you mean "nodes per second"? Then it makes sense. I thought you were doing a fixed depth search and were claiming that the number of nodes visited was cut in half.
Solly, NPS was a mistake :wink: Nodes of course. But here is an example, depth=15

[d]3r2k1/p2r1ppp/1pQ3n1/3P2q1/P3B3/6P1/5P1P/2R1R1K1 w - -

Code: Select all

LE=0.50   0:25  61M nodes
LE=1.00   0:30  70M
LE=1.50   0:33  75M
LE=2.00   0:29  67M   -> surprise here
LE=OFF    1:49 196M
One other question: Do you use the so-called "static null move", where on the last few plies you just take a cutoff if you are above beta by some margin? I suspect that this is somewhat incompatible with lazy eval.
I am using classic Futility pruning. And of course I am using the same LE margin of 0.50 here.
Okay, so in your program lazy eval drastically reduces nodes searched. In everyone else's program, it improves nodes per second, but never reduces nodes searched. You must be counting nodes differently than everyone else. Am I correct in guessing that you only count nodes fully evaluated? Otherwise why would your nodes searched be reduced so much?
Look at the timings I included, the ratio time/nodes is in harmony. And yes, an LE is counted as a node, NPS is maintained in search not in eval.
Yes, I see that NPS is fairly constant while nodes searched are heavily reduced. The question is WHY? The normal behavior of lazy eval is to improve NPS by reducing eval time, not to reduce nodes searched. Just what nodes are cut out by your lazy eval? Why does your program act differently in this matter compared to all others?