bob wrote: if that kind of super-linear speedup happens consistently.
syzygy wrote: For the Nth time, nobody is claiming a super-linear speedup. Your whole argument is a non-starter to begin with.
bob wrote: Super-linear and this "widening" are members of the same set, "urban legend".
michiguel wrote: Do you realize your logic is totally flawed? It is a gigantic straw man. You bring in something unrelated, claim the two are connected, just to bring the other down.
bob wrote: No, but intentional widening rather than searching deeper is certainly a legend.
michiguel wrote: It has never been discussed, so it is not a legend. By the way, it is irrelevant whether it is intentional or not. And I am sure that Komodo is not an urban legend.

How so? If it is unintentional, one works to reduce/eliminate it. If it is intentional, one does not.
Threads test incl. Stockfish 5 and Komodo 8
Moderators: hgm, Rebel, chrisw
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Threads test incl. Stockfish 5 and Komodo 8
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: Threads test incl. Stockfish 5 and Komodo 8
bob wrote: How so? If it is unintentional, one works to reduce/eliminate it. If it is intentional, one does not.

No, if it is a side effect that helps, you do not want to remove it.

Miguel
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: Threads test incl. Stockfish 5 and Komodo 8
An image in a decent size (I could not edit it after 15 min, arghh).
The equations are

  speedup = n / (f * (n^2 - n) + 1)
  log2 (W/L) = k * log2 (speedup)

where k is a proportionality constant and f is a factor representing the "apparent" amount of extra, non-parallelizable work that each extra CPU introduces.

For:
K8, f = 0.0011 (0.11%), k = 1.82
K7, f = 0.0016 (0.16%), k = 1.62
SF, f = 0.0087 (0.87%), k = 2.09 (starts great)
Za, f = 0.0100 (1.00%), k = 1.84
Cr, f = 0.0160 (1.60%), k = 1.82
Miguel
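A minimal sketch of the model above, assuming nothing beyond the two quoted equations. The function names and the printed table are mine; the f and k values are the fitted ones from the post:

```python
# Sketch of the quoted model:
#   speedup = n / (f * (n^2 - n) + 1)
#   log2(W/L) = k * log2(speedup)   =>   W/L = speedup^k
def speedup(n, f):
    return n / (f * (n * n - n) + 1)

def win_loss_ratio(n, f, k):
    return speedup(n, f) ** k

# Fitted values for Komodo 8 from the post: f = 0.0011, k = 1.82
for n in (1, 2, 4, 8, 16):
    print(n, round(speedup(n, 0.0011), 2),
          round(win_loss_ratio(n, 0.0011, 1.82), 2))
```

One consequence of the f * (n^2 - n) term is that the speedup peaks at n = 1/sqrt(f) (about 30 threads for Komodo 8's fitted f) and then declines, something plain Amdahl cannot reproduce.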
- Posts: 5566
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Threads test incl. Stockfish 5 and Komodo 8
bob wrote: Explain HOW you are going to pull that off. You are already searching 2x the minimal tree for this position. Most of which is not helping you at all.
syzygy wrote: Note that if some of it is helping, we're already done. That part of the extra nodes that is helping is referred to as "widening". This is the thing, nothing more, nothing less. Widening == not all "extra" nodes are worthless.
bob wrote: But you CAN reduce the total with effort, and you can NOT reduce just the ones that are no good.

With effort you can make Crafty beat Stockfish...

bob wrote: And IF those extra nodes help, search 'em in the regular search also, they will help there too.

Nope. Firstly, you cannot duplicate the 4-core tree with a single-threaded search without a time penalty. Secondly, even if you could, it would only help in the case of a super-linear speedup, which nobody is claiming.
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Threads test incl. Stockfish 5 and Komodo 8
michiguel wrote: No, if it is a side effect that helps, you do not want to remove it.

OK, how about stopping to think about this for a minute. You KNOW that not all extra nodes are "helpful". You have NO CONTROL over where those extra nodes show up in the tree. Why on earth would you choose to do that? Some percentage of those extra nodes can help by including an important move that normal selectivity would exclude. So what percentage of those overhead nodes are good? Even if it were 50%, which is beyond unlikely, you would be wasting half of that time. I'd be more likely to believe 10% help and 90% do not. Isn't there a BETTER way to include that 10% and exclude the 90% that are just raw overhead? Of course there must be.
I don't go along with this stuff. Before one widens the search intentionally, one has to give up on improving the parallel search by reducing the overhead. So you have overhead that is not helping very much since you supposedly want to intentionally "widen" the search to help offset selectivity errors. And for each of those nodes you add, you get even more overhead nodes due to the parallel search. That is not a reasonable way to do things. In fact, it is the antithesis of alpha/beta and selectivity. You exclude what is unimportant, only to add things back in that are unimportant. And if that is a good idea in a parallel search, why is it not a good idea in the basic serial search?
NOBODY has addressed that issue; people just rely on hand-waving and "but I have seen this, and it shows that the widening happens." Just because it happens does NOT mean (a) it was intentional or (b) it was the best use of computational resources.
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Threads test incl. Stockfish 5 and Komodo 8
syzygy wrote: With effort you can make Crafty beat Stockfish...

Come back to the discussion when you have some basic idea of what is being discussed. As of right now, you are arguing without a clue about what is happening inside a tree.
- Posts: 3226
- Joined: Wed May 06, 2009 10:31 pm
- Location: Fuquay-Varina, North Carolina
Re: Threads test incl. Stockfish 5 and Komodo 8
bob wrote: But just because it happens does NOT mean (a) it was intentional and (b) it was the best use of computational resources.

Best use of computational resources? Perhaps not. However, at 4 cores Komodo 8 is using the computer's resources more effectively than Zappa and Crafty, and by 8 cores more effectively than Stockfish.
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Threads test incl. Stockfish 5 and Komodo 8
michiguel wrote: An image in a decent size (I could not edit it after 15 min, arghh). The equations are
  speedup = n / (f * (n^2 - n) + 1)
  log2 (W/L) = k * log2 (speedup)

Very interesting. Can you hint at how you derived f?
You assumed an Amdahl-like behavior:
S = (R + 1) * n / (R + n) like in this model?
http://www.talkchess.com/forum/viewtopi ... 29&t=48780
About Wilos, I speculated that Win/Loss ratio remains pretty constant beyond ultra-fast controls several years ago on another forum. More than one year ago, I speculated that here too:
http://www.talkchess.com/forum/viewtopi ... 27&t=48733
With some data. Working back from the draw rate and the Elo gain, I even saw a very slow increase of the Win/Loss ratio with time control, but it was almost flat. Wilos will get rid of all these strength, time-control, hardware, etc. issues present in Elos. From your graphs and interpolation, it seems that Zappa's and Crafty's apparently good Elo gain up to 8 cores occurs just because they are weaker and have fewer draws. In Wilos, their gain is worse than that of the top engines, and at 16 cores Komodo 8 stands out.
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Threads test incl. Stockfish 5 and Komodo 8
bob wrote: Come back to the discussion when you have some basic idea about what is being discussed. As of right now, you are arguing without a clue about what is happening inside a tree.

You seem not to know how even a single-core tree grows. An example is this 3-year-old thread:
http://www.talkchess.com/forum/viewtopi ... 83&t=38808
The pictures are gone, but it can still be understood, and Marco Costalba surely understood it:
Laskos wrote: Tree shape is a completely different matter, and as you can see, it is widening going to 5, 10, 15, 20, 25 _target_ depth. I can clearly separate EBF into two components: EBF_widening, which is ~1.4, and EBF_deepening, which is ~1.6, for a total EBF of 1.4 * 1.6 ~ 2.2. As I already wrote, going from target N to target N+1 requires a comparable effort in both deepening and widening.
mcostalba wrote: This is very interesting. If I have understood correctly, you say that the extra effort needed to move the search to depth N+1 is due for a good 45% (1.4 vs 1.6) to additional search at the inner nodes, and only for 55% to additional search at the new, deeper leaf nodes.
Zach Wegner wrote: I did plot some Stockfish trees and, as expected since it uses logarithmic reductions, the trees have a huge amount of widening. In fact, the widening accounts for far more of the tree growth than deepening.

You, Bob, missed the whole point, IIRC, and you have no idea how the tree grows even in single-core Crafty, iteration after iteration. Try telling that to Bob; he doesn't even know how to read a plot, where the slopes (first derivatives) give EBF(target depth, ply) on the tree, not the absolute values. Those slopes are both larger and smaller for different plies than the overall EBF(target depth), giving a fairly constant total EBF(target depth), i.e. from target depth N to N+1 (different trees). I am really eager to see someone plot the tree shape of Stockfish for different target depths; there is no way it is not widening compared to the purely exponential. His "widening" is not a widening at all (he seems to be confused about what widening of the tree is). Bob's theoretical tree shape is the purely exponential, non-widening W^D; how silly is that? I don't think he knows the tree shape even of his own Crafty.
Kai
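The widening/deepening split Kai describes can be made concrete with a small sketch. The per-ply node counts below are hypothetical, purely to illustrate the bookkeeping; only the decomposition EBF = EBF_widening * EBF_deepening is taken from the post:

```python
# Decompose the effective branching factor EBF = nodes(D+1) / nodes(D) into
# a "widening" component (growth of the plies the tree already had) and a
# "deepening" component (nodes added at the new deepest ply).

def decompose_ebf(nodes_d, nodes_d1):
    """nodes_d[i]  = nodes at ply i when the target depth is D
       nodes_d1[i] = nodes at ply i when the target depth is D+1"""
    total_d = sum(nodes_d)
    shared = sum(nodes_d1[:len(nodes_d)])   # same plies, but a wider tree
    total_d1 = sum(nodes_d1)
    widening = shared / total_d             # growth of the existing plies
    deepening = total_d1 / shared           # extra factor from the new ply
    return widening, deepening              # widening * deepening == EBF

# Hypothetical per-ply node counts for target depths D and D+1:
nodes_d = [1, 3, 9, 27]
nodes_d1 = [1, 4, 16, 40, 60]
w, deep = decompose_ebf(nodes_d, nodes_d1)
```

By construction the two factors multiply back to the plain EBF, so the split is just an accounting identity; the interesting empirical claim in the thread is that, with heavy reductions, the widening factor is of the same order as the deepening factor.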
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: Threads test incl. Stockfish 5 and Komodo 8
Laskos wrote: Very interesting. Can you hint at how you derived f? You assumed an Amdahl-like behavior, S = (R + 1) * n / (R + n), like in this model? http://www.talkchess.com/forum/viewtopi ... 29&t=48780

That was a good observation, Kai. We were brainstorming about how to introduce draws and deal with the fact that draw rates change with increasing strength. If you look at the physical-chemical "model" of Ordo, it becomes obvious that TRUE strength is related to the ratio W/L (which is related to the difference in "energetic" levels between W and L).
As you point out, Z and C seem to have a better scaling at the beginning because they are weaker than SF and K. Exactly.
One thing that Andreas could do is to test Engine_X vs. Engine_X (with twice the time, not twice the cores). That would be the theoretical 100% speedup for two cores. If we get the W/L ratio, we get an "effective" speedup.

Sorry, I was going to give the derivation, but I got trapped with work.
Typical Amdahl can be written as

       S + P
  SU = -------
       S + P/n

The numerator is the work with one thread; the denominator is the work with "n" threads.

SU = speedup
S = serial fraction of the work that cannot be parallelized
P = fraction of the work that can be parallelized
n = number of threads

Of course, S + P = 1.

But Amdahl's law cannot explain that the speedup may go down beyond a certain number of threads.

So I assumed that the work with n threads has spurious work added for each extra thread, work that cannot be parallelized. That is, for each thread added, every thread slows down a bit, busy in areas where it accomplishes nothing. So I add a term f*(n-1) to the denominator, and put 1 in the numerator since S + P = 1:

            1
  SU = ------------------
       S + P/n + f (n-1)

Assuming the serial fraction is negligible, S = 0 and P = 1, so the term S disappears:

            1
  SU = --------------
       1/n + f (n-1)

Multiplying numerator and denominator by n:

            n
  SU = ---------------
       1 + f (n^2 - n)
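As a sanity check on the algebra above (my own sketch, with S = 0 and P = 1 as assumed in the derivation), the intermediate and final forms agree numerically:

```python
# The modified-Amdahl speedup in its intermediate and final forms:
#   SU = 1 / (S + P/n + f*(n - 1))   with S = 0, P = 1
#   SU = n / (1 + f*(n^2 - n))
def su_intermediate(n, f, S=0.0, P=1.0):
    return 1.0 / (S + P / n + f * (n - 1))

def su_final(n, f):
    return n / (1 + f * (n * n - n))

# Identical for every thread count (here with Stockfish's fitted f = 0.0087):
for n in range(1, 65):
    assert abs(su_intermediate(n, 0.0087) - su_final(n, 0.0087)) < 1e-12
```

With f > 0 the denominator grows like f*n^2, so SU rises, peaks, and then falls, which plain Amdahl cannot produce; setting f = 0 recovers the ideal SU = n.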