selective depth definition

Discussion of chess software programming and technical issues.

Moderator: Ras

Uri Blass
Posts: 10787
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: selective depth definition

Post by Uri Blass »

syzygy wrote:
hgm wrote:
syzygy wrote:It makes sense to only measure selective search depth in PV nodes. It saves some cpu cycles and reporting selective search depth is not much more than a gimmick anyway.
This is weird logic. Basically you are saying that instead of printing X, it makes sense to print another quantity Y because you can calculate that faster, and X was not very important anyway. Would it make sense to print 3.1415927 instead of the node count, just because it saves cycles?

Printing seldepth is not an obligation. If you don't want to spend the cycles it would take to calculate it, just don't print it. That seems to make more sense to me than starting to lie about it.
It's not a lie, it's just a particular definition of selective search depth. It still gives some indication of the longest lines that are being considered. And in a way only the PV lines are really lines that are being considered, the rest consists of lines that are being refuted.

Of course the reported value is not very useful for debugging purposes and/or knowing how big internal arrays should be (which seems to be what Uri wants). But there are debugging modes for that.
In other words, you are saying that a line that is refuted is not a line that is being considered.

I do not agree.
If I play chess, calculate some interesting sacrifice, and decide that it is not good for me because the opponent has a defence, then I clearly considered the sacrifice in my calculations.
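
To make the two definitions being argued about concrete, here is a minimal sketch (hypothetical names, not Stockfish's actual code) of updating a selective-depth counter at every node versus only at PV nodes:

Code: Select all

#include <algorithm>

// Hypothetical sketch of the two conventions under discussion.
static int selDepthAllNodes = 0;   // deepest ply reached by ANY node
static int selDepthPvOnly   = 0;   // deepest ply reached by a PV node

void update_seldepth(bool pvNode, int ply)
{
    // Definition 1: every node counts, including lines that are only
    // being refuted (the definition Uri argues for).
    selDepthAllNodes = std::max(selDepthAllNodes, ply);

    // Definition 2: only PV nodes count, i.e. only lines that are really
    // being "considered" in syzygy's sense.
    if (pvNode)
        selDepthPvOnly = std::max(selDepthPvOnly, ply);
}

Both counters would be reset at the start of each iteration; which one an engine reports is exactly the definitional question of this thread.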
syzygy
Posts: 5678
Joined: Tue Feb 28, 2012 11:56 pm

Re: selective depth definition

Post by syzygy »

AlvaroBegue wrote:
bob wrote:
AlvaroBegue wrote:
syzygy wrote:It makes sense to only measure selective search depth in PV nodes. It saves some cpu cycles and reporting selective search depth is not much more than a gimmick anyway.
It saves cycles? What evidence do you have that it saves cycles?
That's too easy. EVERY group of compare, jump-less-than, and mov instructions takes cycles. If you don't execute those instructions, you definitely save the cycles they would have burned. Might not be a lot, but it is certainly greater than zero.
No, not really. With instruction-level parallelism it is sometimes possible to execute some instructions for free, if they can be decoded and executed at the same time as some other instruction.

Actually, if PvNode weren't a compile-time constant (which it is, because it can be deduced from template parameters), the proposed code would likely be faster, because the check for `PvNode' would involve a hard-to-predict branch.

So it's possibly true that the original code is faster, but it is not obvious to me. Given how complex modern processors and compilers are, the only way to tell if something is slower is to measure it.
For a particular compiled executable on a particular processor there is a small chance that indeed it does not cost any cycles. But compile again, with another compiler version or slightly modified source code, or run it on a different processor, and it will save cycles.

Removing instructions is a pretty safe route to saving cpu cycles. It's not something that you need to test. In fact, it is BETTER not to test it, because a very sensible modification like this (from the point of view of saving cpu cycles) could even turn out to result in slower code in a particular instance: it could be, for example, that in that instance the extra code causes some completely unrelated code to be better aligned.

The point is, you cannot and should not count on such compiler (bad) luck. The rule of thumb is that reducing the number of instructions to be executed saves processor cycles. Applying this rule of thumb, this rule of common sense, will lead to more efficient code in the long run.

Obviously my answer took into account that PvNode is a compile-time constant.
Uri Blass
Posts: 10787
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: selective depth definition

Post by Uri Blass »

syzygy wrote:
AlvaroBegue wrote:
bob wrote:
AlvaroBegue wrote:
syzygy wrote:It makes sense to only measure selective search depth in PV nodes. It saves some cpu cycles and reporting selective search depth is not much more than a gimmick anyway.
It saves cycles? What evidence do you have that it saves cycles?
That's too easy. EVERY group of compare, jump-less-than, and mov instructions takes cycles. If you don't execute those instructions, you definitely save the cycles they would have burned. Might not be a lot, but it is certainly greater than zero.
No, not really. With instruction-level parallelism it is sometimes possible to execute some instructions for free, if they can be decoded and executed at the same time as some other instruction.

Actually, if PvNode weren't a compile-time constant (which it is, because it can be deduced from template parameters), the proposed code would likely be faster, because the check for `PvNode' would involve a hard-to-predict branch.

So it's possibly true that the original code is faster, but it is not obvious to me. Given how complex modern processors and compilers are, the only way to tell if something is slower is to measure it.
For a particular compiled executable on a particular processor there is a small chance that indeed it does not cost any cycles. But compile again, with another compiler version or slightly modified source code, or run it on a different processor, and it will save cycles.

Removing instructions is a pretty safe route to saving cpu cycles. It's not something that you need to test. In fact, it is BETTER not to test it, because a very sensible modification like this (from the point of view of saving cpu cycles) could even turn out to result in slower code in a particular instance: it could be, for example, that in that instance the extra code causes some completely unrelated code to be better aligned.

The point is, you cannot and should not count on such compiler (bad) luck. The rule of thumb is that reducing the number of instructions to be executed saves processor cycles. Applying this rule of thumb, this rule of common sense, will lead to more efficient code in the long run.

Obviously my answer took into account that PvNode is a compile-time constant.
I am no expert on speed, so I will say nothing about that, but I wonder what the difference is between this and endgame knowledge, and why you think that endgame knowledge inside the evaluation, like KBN vs K, has zero cost
when the program needs to check whether a position is KBN vs K during the search.
syzygy
Posts: 5678
Joined: Tue Feb 28, 2012 11:56 pm

Re: selective depth definition

Post by syzygy »

Uri Blass wrote:I am no expert on speed, so I will say nothing about that, but I wonder what the difference is between this and endgame knowledge, and why you think that endgame knowledge inside the evaluation, like KBN vs K, has zero cost
when the program needs to check whether a position is KBN vs K during the search.
The answer is in the code:

Code: Select all

  // If we have a specialized evaluation function for the current material
  // configuration, call it and return.
  if (ei.mi->specialized_eval_exists())
      return ei.mi->evaluate(pos);
The condition simply checks whether the pointer "ei.mi->evaluate" equals NULL. If it does, no specialised eval exists and the normal evaluation is applied. If it is non-null, the specialised eval function pointed to is invoked instead.

This simple statement deals with all specialised evals in one go. For positions that do not have a specialised eval, the cost of this statement is only the check for null. If you remove KBNK, this check must stay. The cost of this check is independent of the number of specialised evals.

The only way to save cpu cycles here for positions without specialised eval is by removing ALL specialised evaluations. Only then can you remove the if statement. You would have to remove all of endgame.cpp, not just KBNK but also KPK etc. That would lose a lot of knowledge and it would gain only a single null pointer check in SF's eval. It seems unlikely to me that that is worth it.
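
A reduced sketch of the dispatch pattern described above (the types and names are placeholders, not Stockfish's actual declarations):

Code: Select all

#include <cstddef>

struct Position;                        // opaque for this sketch
typedef int (*EvalFn)(const Position&);

struct MaterialEntry {
    EvalFn evaluationFunction;          // NULL if no specialised eval applies
    bool specialized_eval_exists() const { return evaluationFunction != NULL; }
};

int evaluate(const Position& pos, const MaterialEntry* mi)
{
    // One pointer test covers every specialised endgame at once; adding or
    // removing KBNK, KPK, ... does not change the cost of this check.
    if (mi->specialized_eval_exists())
        return mi->evaluationFunction(pos);

    // ... normal evaluation for all other material configurations ...
    return 0;
}

The per-position cost for "normal" positions is the single NULL test, independent of how many specialised evaluations are registered.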
Uri Blass
Posts: 10787
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: selective depth definition

Post by Uri Blass »

syzygy wrote:
Uri Blass wrote:I am no expert on speed, so I will say nothing about that, but I wonder what the difference is between this and endgame knowledge, and why you think that endgame knowledge inside the evaluation, like KBN vs K, has zero cost
when the program needs to check whether a position is KBN vs K during the search.
The answer is in the code:

Code: Select all

  // If we have a specialized evaluation function for the current material
  // configuration, call it and return.
  if (ei.mi->specialized_eval_exists())
      return ei.mi->evaluate(pos);
The condition simply checks whether the pointer "ei.mi->evaluate" equals NULL. If it does, no specialised eval exists and the normal evaluation is applied. If it is non-null, the specialised eval function pointed to is invoked instead.

This simple statement deals with all specialised evals in one go. For positions that do not have a specialised eval, the cost of this statement is only the check for null. If you remove KBNK, this check must stay. The cost of this check is independent of the number of specialised evals.

The only way to save cpu cycles here for positions without specialised eval is by removing ALL specialised evaluations. Only then can you remove the if statement. You would have to remove all of endgame.cpp, not just KBNK but also KPK etc. That would lose a lot of knowledge and it would gain only a single null pointer check in SF's eval. It seems unlikely to me that that is worth it.
The question is whether the price of computing the pointer "ei.mi->evaluate", which you compute earlier, is the same regardless of the number of different cases.

If it is the same, I do not understand how that is possible, because I expect the price to be higher with a bigger number of cases.

From material.cpp

Code: Select all

  // Let's look if we have a specialized evaluation function for this particular
  // material configuration. Firstly we look for a fixed configuration one, then
  // for a generic one if the previous search failed.
  if (endgames.probe(key, e->evaluationFunction))
      return e;
The question is whether calculating endgames.probe takes the same time even if you have more cases.

I understand that Stockfish needs to decide, based on the key, which is a 64-bit number, whether there is a special evaluation function.

"No price" means that checking whether the key is equal to one of n values has the same price as checking whether the key is equal to one of n+1 values.
syzygy
Posts: 5678
Joined: Tue Feb 28, 2012 11:56 pm

Re: selective depth definition

Post by syzygy »

Uri Blass wrote:The question is whether calculating endgames.probe takes the same time even if you have more cases.
As I wrote, the condition is merely checking whether a pointer is non-null.

See material.h:

Code: Select all

bool specialized_eval_exists() const { return evaluationFunction != NULL; }
This is completely independent of the number of specialised evals.
I understand that Stockfish needs to decide, based on the key, which is a 64-bit number, whether there is a special evaluation function.

"No price" means that checking whether the key is equal to one of n values has the same price as checking whether the key is equal to one of n+1 values.
If you mean the code that builds the relevant entry of the material hash table, yes, that code loops through the specialised endgames. But the hit rate of the material hash table is probably 99.99%, so those few cycles are irrelevant.
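
To illustrate where that per-endgame work actually sits, here is a simplified sketch of a probe keyed by the 64-bit material key (it assumes a plain std::map; Stockfish's real Endgames class is more involved than this):

Code: Select all

#include <cstdint>
#include <map>

typedef uint64_t Key;
struct Position;
typedef int (*EvalFn)(const Position&);

struct Endgames {
    std::map<Key, EvalFn> fns;   // one entry per registered endgame (KBNK, KPK, ...)

    bool probe(Key key, EvalFn& fn) const {
        std::map<Key, EvalFn>::const_iterator it = fns.find(key);
        if (it == fns.end())
            return false;        // no specialised eval for this material
        fn = it->second;
        return true;
    }
};

This probe (and any further per-endgame checks) runs only when a material hash table entry is being built, i.e. on the rare material-hash miss mentioned above; once the entry is cached, the per-node cost in the evaluation is just the NULL-pointer check shown earlier.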
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: selective depth definition

Post by bob »

Uri Blass wrote:
bob wrote:
AlvaroBegue wrote:
syzygy wrote:It makes sense to only measure selective search depth in PV nodes. It saves some cpu cycles and reporting selective search depth is not much more than a gimmick anyway.
It saves cycles? What evidence do you have that it saves cycles?
That's too easy. EVERY group of compare, jump-less-than, and mov instructions takes cycles. If you don't execute those instructions, you definitely save the cycles they would have burned. Might not be a lot, but it is certainly greater than zero.

I think the maxply value is something useful for debugging, such as when you go Fruit-like and grossly overextend and blow arrays. I've seen Fruit go beyond 200 plies where the basic iteration depth is just 7 plies. That is certainly not reasonable, and displaying the number lets the author know that he has broken something and is inviting a non-terminating search. Otherwise, it is useless...
I think that it is useful to get some bound on the ply depth that you need to search in games (of course, only if you define it correctly and not with the dubious definition that is in the Stockfish code), and if the target is to save cycles then it is better to get rid of selective depth.

If the program plays thousands of games and in no game does it get a selective depth bigger than 60 plies, then it means that 60 plies are enough for it (the opposite is not correct, and of course if it gets 200 plies it does not mean that the last 140 plies are useful for being stronger; testing is needed for that).
Here's the key question: "What does that depth mean?" It could be following a totally futile series of checks. There was a position posted a while back that caused Fruit to crash and Rybka to misbehave. I debugged the Fruit crash. At iteration (depth) = 7, it was WAY out beyond 200 plies, which is complete nonsense. In that position, what would "7/257" mean to you? I see zero useful content myself.

That doesn't tell you anything at all about what was searched or how deeply. It could be a hopelessly lost move that just leads to a lot of unnecessary extensions. I have used maxdepth to do just what you mentioned, but that was an author debugging issue (how deep do the search arrays need to be?) as opposed to a user issue that provides some useful info.

BTW, just because you play thousands of games and 60 was enough in those games does NOT guarantee that 60 is always enough.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: selective depth definition

Post by bob »

AlvaroBegue wrote:
bob wrote:
AlvaroBegue wrote:
syzygy wrote:It makes sense to only measure selective search depth in PV nodes. It saves some cpu cycles and reporting selective search depth is not much more than a gimmick anyway.
It saves cycles? What evidence do you have that it saves cycles?
That's too easy. EVERY group of compare, jump-less-than, and mov instructions takes cycles. If you don't execute those instructions, you definitely save the cycles they would have burned. Might not be a lot, but it is certainly greater than zero.
No, not really. With instruction-level parallelism it is sometimes possible to execute some instructions for free, if they can be decoded and executed at the same time as some other instruction.

Actually, if PvNode weren't a compile-time constant (which it is, because it can be deduced from template parameters), the proposed code would likely be faster, because the check for `PvNode' would involve a hard-to-predict branch.

So it's possibly true that the original code is faster, but it is not obvious to me. Given how complex modern processors and compilers are, the only way to tell if something is slower is to measure it.
Note: the PV branch will be predicted almost 100% correctly. Hardly ANY nodes in the tree are PV, so it will always say "non-PV".
syzygy
Posts: 5678
Joined: Tue Feb 28, 2012 11:56 pm

Re: selective depth definition

Post by syzygy »

bob wrote:Note: the PV branch will be predicted almost 100% correctly. Hardly ANY nodes in the tree are PV, so it will always say "non-PV".
With SF's templates it is evaluated only at compile-time. The non-PV search function will not have the test at all.
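
A minimal illustration of that template mechanism (NodeType and PV follow Stockfish's naming; the body is illustrative only):

Code: Select all

enum NodeType { NonPV, PV };

static int selDepth = 0;

template<NodeType NT>
int search(int ply, int alpha, int beta, int depth)
{
    const bool PvNode = (NT == PV);   // known at compile time

    // In the search<NonPV> instantiation this condition is a compile-time
    // false, so no compare and no branch is emitted for it at all.
    if (PvNode && ply > selDepth)
        selDepth = ply;

    // ... search body omitted ...
    return alpha;
}

The compiler generates two separate functions, search<PV> and search<NonPV>; only the former contains the selDepth update, so there is no branch left to predict in non-PV nodes.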
lucasart
Posts: 3241
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: selective depth definition

Post by lucasart »

syzygy wrote:It makes sense to only measure selective search depth in PV nodes. It saves some cpu cycles and reporting selective search depth is not much more than a gimmick anyway.
Exactly! "Gimmick" is the word I was looking for. I don't think the selective depth makes any sense nowadays. Regardless of how you define it, it's rather meaningless in engines with massive reductions. Perhaps in the old days, when you had a plain and simple alpha-beta with search extensions (and no reductions), printing the selective depth made some sense. But today...?

My advice is that we should simply remove this useless feature. Many engines do not print out a selective depth.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.