back to the Komodo SMP issue

syzygy · Post by **syzygy** » Fri Jul 05, 2013 6:12 pm

duncan wrote:
syzygy wrote:Bob's argument is logically flawed in more than one way.
to avoid what bob will call handwaving perhaps you could spell it out or reference it if you have already done so, so bob can respond.

It has been spelled out quite a few times already and there is no use in Bob responding to it once more (although I know he will do it).

But since you are asking, I will respond for you (not for Bob).

The usual single-threaded search algorithms are inherently serial, which means it cannot be the same as the parallel search algorithm. The parallel search algorithm is inherently less efficient: an N-threaded search on N cores is not as effective as an 1-threaded search on an N times faster single core.

This inefficiency (and I am talking about common sense inefficiency and not about Bob's out-of-context textbook (in)efficiency), although unavoidable, is not fixed. It of course depends on the SMP implementation. It may also depend on the underlying single-threaded search.

Mimicking the single-threaded search as much as possible (so purely going for added depth) may, depending on these factors, be problematic.

It may be difficult or even impossible to take the same reduction/extension decisions as the single search would, so that it may turn out to be cheaper or even necessary to accept a wider search (at the cost of depth, but in return for higher quality at the same depth). If this is the case, then Bob's argument does not come into view: we would have loved to have the same selectivity as the default single threaded search, but it was simply not possible / too hard / too expensive to get right.

It may also be the case that the "SMP inefficiency" just happens to be significantly greater for the serial reduction/extension scheme of a particular engine than it is for a less agressive reduction scheme. Bob has stated multiple times that the speed up values from old papers don't anymore apply to today's highly selective searches. If a slightly less selective search has better SMP efficiency (in Bob's textbook sense) when going parallel than the default single-threaded selective search, then the gain in SMP efficiency may outweigh the loss due to lower selectivity. If this is the case, then Bob's argument clearly does not hold: going wider on a single core would only incur the loss of lower selectivity and not profit from better SMP efficiency, simply because there is no SMP on a single core.

What also must be kept in mind is that the development of a chess engine is full of trade-offs. It is safe to assume that Komodo's current SMP implementation is the best that Don could achieve in the time he had available for it. It might be that further improvements will allow him to go purely for depth (at the cost of an investment in development time). We have no way of knowing this, but one thing should be clear: nobody can ever be criticised for not yet achieving the optimal implementation.

bob · Post by **bob** » Fri Jul 05, 2013 7:05 pm

syzygy wrote:
duncan wrote:
syzygy wrote:Bob's argument is logically flawed in more than one way.
to avoid what bob will call handwaving perhaps you could spell it out or reference it if you have already done so, so bob can respond.
It has been spelled out quite a few times already and there is no use in Bob responding to it once more (although I know he will do it).

But since you are asking, I will respond for you (not for Bob).

The usual single-threaded search algorithms are inherently serial, which means it cannot be the same as the parallel search algorithm. The parallel search algorithm is inherently less efficient: an N-threaded search on N cores is not as effective as an 1-threaded search on an N times faster single core.

This inefficiency (and I am talking about common sense inefficiency and not about Bob's out-of-context textbook (in)efficiency), although unavoidable, is not fixed. It of course depends on the SMP implementation. It may also depend on the underlying single-threaded search.

You obviously don't understand parallel search. There is no "out-of-context" inefficiency here. There is only ONE place in a modern engine where the parallel search loses efficiency. Only ONE. Extra nodes searched due to inaccurate split point detection. Nothing else. And that overhead (extra nodes) is the ONLY thing that prevents 4x speedup using 4 cores. There's no waiting or synchronizing overhead to speak of, certainly nothing rising to even 1% of total overhead.

So how about sticking to factual information for a change?

Mimicking the single-threaded search as much as possible (so purely going for added depth) may, depending on these factors, be problematic.

It may be difficult or even impossible to take the same reduction/extension decisions as the single search would, so that it may turn out to be cheaper or even necessary to accept a wider search (at the cost of depth, but in return for higher quality at the same depth). If this is the case, then Bob's argument does not come into view: we would have loved to have the same selectivity as the default single threaded search, but it was simply not possible / too hard / too expensive to get right.

It may also be the case that the "SMP inefficiency" just happens to be significantly greater for the serial reduction/extension scheme of a particular engine than it is for a less agressive reduction scheme. Bob has stated multiple times that the speed up values from old papers don't anymore apply to today's highly selective searches. If a slightly less selective search has better SMP efficiency (in Bob's textbook sense) when going parallel than the default single-threaded selective search, then the gain in SMP efficiency may outweigh the loss due to lower selectivity. If this is the case, then Bob's argument clearly does not hold: going wider on a single core would only incur the loss of lower selectivity and not profit from better SMP efficiency, simply because there is no SMP on a single core.

What is with all the "ifs"??? One can absolutely do the same reductions and pruning in a SMP search that they do in a serial search. Such a statement is simply ridiculous on the surface, much more so if you go deeper into the concept.

I've asked for technical information. You again produce hand-waving. If you go wider, you lose depth. Period. The only way to recover that depth is through a traditional parallel search. If the implementation only produces a 1.4x speedup at 4 cores, you are NOT going to regain what you lose from taking the EBF from 2.0 to 2.1 (numbers others mentioned).

So, again, no hand-waving. Explain EXPLICITLY how this is going to be done, giving a little math to go with the hand-waving, so that it can be analyzed and critiqued. At present, you offer ZERO details, just "this must be so."

It is quite clear that a speedup of > 3.0 with 4 cores is easy enough to produce. No one cares about 128 cores at the moment. Such machines are not available to 99.999% of the users due to cost. But 4 / 8 core boxes are readily available and >3x on 4 cores or >6x on 8 cores is pretty common.

All I have claimed, regarding older programs, was that it was somewhat easier to get closer to optimal because the trees were more uniform. But they were far from perfectly uniform. I have been tweaking reduction and pruning rules for a while, and none of those changes influences the SMP speedup in any significant way. So I do not believe that is an issue worth discussing. Houdini seems to be highly selective, yet it speeds up pretty normally from what I have seen. I know my program does. So again, a pointless debate about an unimportant detail.

What also must be kept in mind is that the development of a chess engine is full of trade-offs. It is safe to assume that Komodo's current SMP implementation is the best that Don could achieve in the time he had available for it. It might be that further improvements will allow him to go purely for depth (at the cost of an investment in development time). We have no way of knowing this, but one thing should be clear: nobody can ever be criticised for not yet achieving the optimal implementation.

Nobody (at least not me) is criticizing ANYTHING here other than the vague, hand-waving arguments about what is going on and why. The ONLY viable place one can use SMP hardware is to parallel search the tree. Parallel move generation and such simply are not nearly as effective. One can change the shape of the tree all they want, but the SMP search still has to produce a speedup or it is worthless.

If you want to claim "OK, Komodo searches a bigger tree so that the time to depth is only 1.4x, but if he had gone purely for depth then he would see >3x, then fine. Now you have to explain why the bigger tree is better than the extra 1.5 plies the >3x would produce. As now you are factually trading depth for more width. And that tradeoff would be just as effective for the serial search.

Just define the parameters, and this can be mathematically examined, without all the "ifs and buts".

syzygy · Post by **syzygy** » Sat Jul 06, 2013 2:31 am

syzygy wrote:
duncan wrote:
syzygy wrote:Bob's argument is logically flawed in more than one way.
to avoid what bob will call handwaving perhaps you could spell it out or reference it if you have already done so, so bob can respond.
It has been spelled out quite a few times already and there is no use in Bob responding to it once more (although I know he will do it).

But since you are asking, I will respond for you (not for Bob).

So if you or anyone not Bob has questions, feel free to ask. I can't really take someone serious that starts with "you obviously don't understand parallel search". Just a waste of my time.

bob · Post by **bob** » Sat Jul 06, 2013 3:43 am

syzygy wrote:
syzygy wrote:
duncan wrote:
syzygy wrote:Bob's argument is logically flawed in more than one way.
to avoid what bob will call handwaving perhaps you could spell it out or reference it if you have already done so, so bob can respond.
It has been spelled out quite a few times already and there is no use in Bob responding to it once more (although I know he will do it).

But since you are asking, I will respond for you (not for Bob).
So if you or anyone not Bob has questions, feel free to ask. I can't really take someone serious that starts with "you obviously don't understand parallel search". Just a waste of my time.

And, by the same token, since you have NOT shown any understanding of parallel search issues at all, you are offering absolutely ZERO useful information to anyone here. Why not let the ones that actually understand, and work with this stuff on a daily basis, do the responding???

BTW, NOTHING has been "spelled out" in any technical way at all. Absolutely nothing. Just an innuendo, or a "might be" or "could be"...

mvk · Post by **mvk** » Sat Jul 06, 2013 12:49 pm

bob wrote:And, by the same token, since you have NOT shown any understanding of parallel search issues at all, you are offering absolutely ZERO useful information to anyone here. Why not let the ones that actually understand, and work with this stuff on a daily basis, do the responding???

BTW, NOTHING has been "spelled out" in any technical way at all. Absolutely nothing. Just an innuendo, or a "might be" or "could be"...

This is demonstrably false, as Sjaak is a stronger SMP program than Crafty.
And don't forget that just because you don't understand something doesn't mean it isn't useful to others.

bob · Post by **bob** » Sat Jul 06, 2013 7:07 pm

mvk wrote:
bob wrote:And, by the same token, since you have NOT shown any understanding of parallel search issues at all, you are offering absolutely ZERO useful information to anyone here. Why not let the ones that actually understand, and work with this stuff on a daily basis, do the responding???

BTW, NOTHING has been "spelled out" in any technical way at all. Absolutely nothing. Just an innuendo, or a "might be" or "could be"...
This is demonstrably false, as Sjaak is a stronger SMP program than Crafty.
And don't forget that just because you don't understand something doesn't mean it isn't useful to others.

What is demonstrably false? We are discussing Komodo's parallel search.

What does "stronger SMP implementation" mean? I doubt it gets a better speedup. So what are you talking about?

mvk · Post by **mvk** » Sat Jul 06, 2013 10:30 pm

bob wrote:
mvk wrote:
bob wrote:And, by the same token, since you have NOT shown any understanding of parallel search issues at all, you are offering absolutely ZERO useful information to anyone here. Why not let the ones that actually understand, and work with this stuff on a daily basis, do the responding???

BTW, NOTHING has been "spelled out" in any technical way at all. Absolutely nothing. Just an innuendo, or a "might be" or "could be"...
This is demonstrably false, as Sjaak is a stronger SMP program than Crafty.
And don't forget that just because you don't understand something doesn't mean it isn't useful to others.
What is demonstrably false? We are discussing Komodo's parallel search.

What does "stronger SMP implementation" mean? I doubt it gets a better speedup. So what are you talking about?

Here you go trolling again... If you want to pick a fight then bully someone else with your ways as I'm not interested in any of that.

bob · Post by **bob** » Sat Jul 06, 2013 10:40 pm

mvk wrote:
bob wrote:
mvk wrote:
bob wrote:And, by the same token, since you have NOT shown any understanding of parallel search issues at all, you are offering absolutely ZERO useful information to anyone here. Why not let the ones that actually understand, and work with this stuff on a daily basis, do the responding???

BTW, NOTHING has been "spelled out" in any technical way at all. Absolutely nothing. Just an innuendo, or a "might be" or "could be"...
This is demonstrably false, as Sjaak is a stronger SMP program than Crafty.
And don't forget that just because you don't understand something doesn't mean it isn't useful to others.
What is demonstrably false? We are discussing Komodo's parallel search.

What does "stronger SMP implementation" mean? I doubt it gets a better speedup. So what are you talking about?
Here you go trolling again... If you want to pick a fight then bully someone else with your ways as I'm not interested in any of that.

Trolling? I asked specific, polite questions. Your terminology left a lot to be desired and your meaning was not clear. If you find that insulting, so be it. It was not the intent. I'd just like to see people write things that are clear and which don't require follow-up questions to clear up vague terminology...

For example, immediately following my statement about Komodo's search, you wrote "This is demonstrably false." What is "this"? Logically, it refers to the statement immediately preceding this. Yet there have been NO technical details discussing how someone might use part of the parallel processing power to search wider and how that would be better than gaining 1.5 plies or so by going deeper. So, what does "this" refer to?

What does "stronger SMP program" mean? Better SMP algorithm (which I doubt) or just a stronger program, period, which is irrelevant to this discussion.

Understand my questions now?

h1a8 · Post by **h1a8** » Tue Jul 09, 2013 3:57 am

bob wrote:
mvk wrote:
bob wrote:
mvk wrote:
bob wrote:And, by the same token, since you have NOT shown any understanding of parallel search issues at all, you are offering absolutely ZERO useful information to anyone here. Why not let the ones that actually understand, and work with this stuff on a daily basis, do the responding???

BTW, NOTHING has been "spelled out" in any technical way at all. Absolutely nothing. Just an innuendo, or a "might be" or "could be"...
This is demonstrably false, as Sjaak is a stronger SMP program than Crafty.
And don't forget that just because you don't understand something doesn't mean it isn't useful to others.
What is demonstrably false? We are discussing Komodo's parallel search.

What does "stronger SMP implementation" mean? I doubt it gets a better speedup. So what are you talking about?
Here you go trolling again... If you want to pick a fight then bully someone else with your ways as I'm not interested in any of that.
Trolling? I asked specific, polite questions. Your terminology left a lot to be desired and your meaning was not clear. If you find that insulting, so be it. It was not the intent. I'd just like to see people write things that are clear and which don't require follow-up questions to clear up vague terminology...

For example, immediately following my statement about Komodo's search, you wrote "This is demonstrably false." What is "this"? Logically, it refers to the statement immediately preceding this. Yet there have been NO technical details discussing how someone might use part of the parallel processing power to search wider and how that would be better than gaining 1.5 plies or so by going deeper. So, what does "this" refer to?

What does "stronger SMP program" mean? Better SMP algorithm (which I doubt) or just a stronger program, period, which is irrelevant to this discussion.

Understand my questions now?

Robert I see where you are coming from and you make fine points that I agree with. But it is not of good taste to say comments like "you obviously don't understand parallel search" to someone who has shown otherwise by utilizing it to a halfway decent degree in their own engine.
Even if you think their understanding is low (it's impossible to be non existent) then there are better ways to comment.

Think about it!

I'm just trying to mend things here as I like to read the technical discussions sometimes.

bob · Post by **bob** » Tue Jul 09, 2013 7:08 am

h1a8 wrote:
bob wrote:
mvk wrote:
bob wrote:
mvk wrote:
bob wrote:And, by the same token, since you have NOT shown any understanding of parallel search issues at all, you are offering absolutely ZERO useful information to anyone here. Why not let the ones that actually understand, and work with this stuff on a daily basis, do the responding???

BTW, NOTHING has been "spelled out" in any technical way at all. Absolutely nothing. Just an innuendo, or a "might be" or "could be"...
This is demonstrably false, as Sjaak is a stronger SMP program than Crafty.
And don't forget that just because you don't understand something doesn't mean it isn't useful to others.
What is demonstrably false? We are discussing Komodo's parallel search.

What does "stronger SMP implementation" mean? I doubt it gets a better speedup. So what are you talking about?
Here you go trolling again... If you want to pick a fight then bully someone else with your ways as I'm not interested in any of that.
Trolling? I asked specific, polite questions. Your terminology left a lot to be desired and your meaning was not clear. If you find that insulting, so be it. It was not the intent. I'd just like to see people write things that are clear and which don't require follow-up questions to clear up vague terminology...

For example, immediately following my statement about Komodo's search, you wrote "This is demonstrably false." What is "this"? Logically, it refers to the statement immediately preceding this. Yet there have been NO technical details discussing how someone might use part of the parallel processing power to search wider and how that would be better than gaining 1.5 plies or so by going deeper. So, what does "this" refer to?

What does "stronger SMP program" mean? Better SMP algorithm (which I doubt) or just a stronger program, period, which is irrelevant to this discussion.

Understand my questions now?
Robert I see where you are coming from and you make fine points that I agree with. But it is not of good taste to say comments like "you obviously don't understand parallel search" to someone who has shown otherwise by utilizing it to a halfway decent degree in their own engine.
Even if you think their understanding is low (it's impossible to be non existent) then there are better ways to comment.

Think about it!

I'm just trying to mend things here as I like to read the technical discussions sometimes.

I'd like to see this discussion turn "technical". That's why I started it. But a couple only want to argue and not discuss anything technical at all...

back to the Komodo SMP issue

Re: back to the Komodo SMP issue

Re: back to the Komodo SMP issue

Re: back to the Komodo SMP issue

Re: back to the Komodo SMP issue

Re: back to the Komodo SMP issue

Re: back to the Komodo SMP issue

Re: back to the Komodo SMP issue

Re: back to the Komodo SMP issue

Re: back to the Komodo SMP issue

Re: back to the Komodo SMP issue