This is a spin-off of another thread.
I never gave it much thought, but the well-known formula for Crafty
speedup = 1 + (NCPUS - 1) * 0.7
may indicate that the inefficiency is not related to Amdahl's law, even if the formula only applies to a low number of CPUs. What is the cause of the parallel inefficiency? The shape of the tree? Still, it looks like it should either saturate more quickly or the speedup with 2 cores should be higher than 1.7.
Was this investigated?
Miguel
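To make the contrast with Amdahl's law concrete, here is a minimal sketch. It assumes Amdahl's law in its usual form, speedup = 1 / ((1 - p) + p / N), and picks the parallel fraction p so that the 2-CPU speedup matches the 1.7 from the formula. The function names are made up for illustration and the numbers are computed from the two formulas, not measured.

```python
def crafty_speedup(ncpus: int, gain: float = 0.7) -> float:
    """Linear rule of thumb quoted in the thread: 1 + (NCPUS - 1) * 0.7."""
    return 1.0 + (ncpus - 1) * gain


def amdahl_speedup(ncpus: int, parallel_fraction: float) -> float:
    """Amdahl's law for a fixed parallelizable fraction p of the work."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / ncpus)


def amdahl_fraction_from_two_cpu_speedup(speedup_2: float) -> float:
    """Solve 1 / ((1 - p) + p / 2) = speedup_2 for p."""
    return 2.0 * (1.0 - 1.0 / speedup_2)


# Calibrate Amdahl's law so both curves agree at 2 CPUs (an assumption
# made only for this comparison).
p = amdahl_fraction_from_two_cpu_speedup(crafty_speedup(2))

for n in (1, 2, 4, 8, 16, 32, 64):
    print(f"{n:3d} CPUs  linear={crafty_speedup(n):6.2f}  "
          f"amdahl={amdahl_speedup(n, p):5.2f}")
```

Under that calibration the Amdahl curve can never exceed 1 / (1 - p), which is below 6, while the linear formula keeps climbing. That is the mismatch the question is pointing at.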
SMP speed up
-
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: SMP speed up
I think it pays off to simplify the formula for understanding:
1 + (NCPUS - 1) * 0.7 =
0.7 * NCPUS + 0.3
What strikes me as surprising about it is that it converges very quickly to 70% efficiency, which it will never go below, of course. It implies that going from 1 CPU to 2 CPUs results in much more added overhead than going from 2 to 4 CPUs, and so on.
I think it's a strange formula, but if it has been correctly measured that way, what can I say... (though Robert did say he only measured up to 64 cores)
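The convergence to 70% can be checked directly: dividing 0.7 * NCPUS + 0.3 by NCPUS gives an efficiency of 0.7 + 0.3 / NCPUS. A small sketch, with hypothetical function names, that tabulates the efficiency next to the gain from adding one more CPU:

```python
def speedup(ncpus: int, gain: float = 0.7) -> float:
    """The formula from the thread, in its simplified form 0.7 * N + 0.3."""
    return gain * ncpus + (1.0 - gain)


def efficiency(ncpus: int, gain: float = 0.7) -> float:
    """Speedup per CPU: 0.7 + 0.3 / N for the default gain."""
    return speedup(ncpus, gain) / ncpus


def marginal_gain(ncpus: int, gain: float = 0.7) -> float:
    """Extra speedup obtained by going from ncpus to ncpus + 1."""
    return speedup(ncpus + 1, gain) - speedup(ncpus, gain)


for n in (1, 2, 4, 8, 16, 32, 64):
    print(f"{n:3d} CPUs  efficiency={efficiency(n):.4f}  "
          f"next CPU adds {marginal_gain(n):.2f}")
```

The efficiency column falls from 1.0 towards 0.7, while the last column stays at 0.7 for every row: in this formula each added CPU contributes the same amount, and only the ratio to the CPU count changes.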
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: SMP speed up
michiguel wrote:
This is a spin-off of another thread.
I never gave it much thought, but the well-known formula for Crafty
speedup = 1 + (NCPUS - 1) * 0.7
may indicate that the inefficiency is not related to Amdahl's law, even if the formula only applies to a low number of CPUs. What is the cause of the parallel inefficiency? The shape of the tree? Still, it looks like it should either saturate more quickly or the speedup with 2 cores should be higher than 1.7.
Was this investigated?
Miguel
I beat it to death for 1-8 cores. About all that can be said is that after the first CPU, every processor added makes the tree grow. If you run the old CB positions, which are all the positions from a single real game that everyone has seen, it is pretty consistent. There is a lot of variation until you average over all the moves, and then that 30% extra nodes per CPU starts to settle down. You could drive this down by improving move ordering to get a higher percentage of fail highs on the first move. But I have been stuck at 90-92% for years, which pretty well fixes the overhead, since one of every 10 splits (when I split right after the first move) is going to be at a bad point that adds overhead...
There is likely a mathematical model that considers fh % as computed in Crafty, and predicts the speedup, but I have never tried to quantify that at all since it would not be of any benefit. We all know that YBW depends on the first move being the one to cause a cutoff, else we think it is an ALL node.
In my dissertation I tackled this by searching perfectly ordered trees and got perfect linearity in the speedup. But I faked the eval to produce a monotonically decreasing value to simulate perfect ordering. I also tackled worst-first, which is pure minimax, and also got perfect speedups. But it was slow with no cutoffs at all. It is the very good trees that cause the problem.
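One way to read the "tree grows with every added processor" remark against the formula is sketched below. It assumes, purely for illustration, that raw nodes per second scale perfectly with the CPU count, so that all lost speedup shows up as extra nodes searched. Under that assumption the parallel tree is NCPUS / speedup times the size of the 1-CPU tree. The function names are hypothetical and nothing here is measured Crafty data.

```python
def crafty_speedup(ncpus: int, gain: float = 0.7) -> float:
    """Formula quoted in the thread: 1 + (NCPUS - 1) * 0.7."""
    return 1.0 + (ncpus - 1) * gain


def implied_tree_growth(ncpus: int, gain: float = 0.7) -> float:
    """Nodes searched relative to the 1-CPU tree.

    Assumption (not from the thread): NPS scales linearly with ncpus,
    so time-to-depth speedup = ncpus / tree_growth.
    """
    return ncpus / crafty_speedup(ncpus, gain)


for n in (1, 2, 4, 8):
    print(f"{n} CPUs  tree is {implied_tree_growth(n):.3f}x the 1-CPU tree")
```

Equivalently, each CPU after the first behaves as if 30% of the nodes it searches are wasted. In this simplified model the total tree never grows beyond 1 / 0.7 of the serial tree.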
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: SMP speed up
rbarreira wrote:
I think it pays off to simplify the formula for understanding:
1 + (NCPUS - 1) * 0.7 =
0.7 * NCPUS + 0.3
What strikes me as surprising about it is that it converges very quickly to 70% efficiency, which it will never go below, of course. It implies that going from 1 CPU to 2 CPUs results in much more added overhead than going from 2 to 4 CPUs, and so on.
How do you figure that? There are lots of things at play. With 2 CPUs, only 1 is doing unnecessary work at a split point that was poorly chosen. With 4, that goes up to 3. So although it might look like the overhead is going down, it really is not.
rbarreira wrote:
I think it's a strange formula, but if it has been correctly measured that way, what can I say... (though Robert did say he only measured up to 64 cores)
And remember, with 64 cores, you have 7 data points that don't lie on a perfectly straight line. I simply chose a good approximation. Originally that formula worked for 1-2-4. Then we added 8 and it still fit well. And then 16 and 32. Eugene ran it on a 64-core Itanium, which was possibly a tainted result since it was a different architecture, but the 1-2-4-8-16-32-64 numbers still stayed around that line. It is not a perfect fit. But it is a good first approximation, which is all I have ever called it...
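For anyone who wants to check how well a straight line fits their own measurements, here is a minimal sketch of a least-squares fit of the single parameter in speedup = 1 + g * (NCPUS - 1), with the line pinned at (1 CPU, speedup 1). The helper name `fit_gain` is made up, and the demo input is synthetic, generated from the formula itself. It is not the measured data discussed above, which is not given in this thread.

```python
from typing import Sequence


def fit_gain(ncpus: Sequence[int], speedups: Sequence[float]) -> float:
    """Least-squares g for speedup = 1 + g * (ncpus - 1), pinned at (1, 1)."""
    numerator = sum((n - 1) * (s - 1.0) for n, s in zip(ncpus, speedups))
    denominator = sum((n - 1) ** 2 for n in ncpus)
    if denominator == 0:
        raise ValueError("need at least one measurement with more than 1 CPU")
    return numerator / denominator


# Synthetic demo only: points that lie exactly on the quoted line.
cpu_counts = [1, 2, 4, 8, 16, 32, 64]
synthetic_speedups = [1.0 + 0.7 * (n - 1) for n in cpu_counts]

gain = fit_gain(cpu_counts, synthetic_speedups)
print(f"fitted gain per added CPU: {gain:.3f}")
```

Note that an unweighted fit like this is dominated by the largest CPU counts, so a single unusual 64-core result would move the fitted slope more than a 2-core result would.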
-
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: SMP speed up
Just a reply from the previous thread
"More on Bob's formula...
speedup_2=1+0.7=1.7
speedup_32=1+31*0.7=22.7
speedup_64=1+63*0.7=45.1
speedup_64/speedup_32=1.99!!!
speedup_2=1.7
17% more gain when going from 32 to 64 than from 1 to 2 cores."
I'll let people draw their own conclusions.
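The arithmetic above can be reproduced with a few lines. This sketch only evaluates the formula as quoted. The helper `doubling_ratio` is a hypothetical name for speedup(2N) / speedup(N).

```python
def speedup(ncpus: int, gain: float = 0.7) -> float:
    """Formula under discussion: 1 + (NCPUS - 1) * 0.7."""
    return 1.0 + (ncpus - 1) * gain


def doubling_ratio(ncpus: int, gain: float = 0.7) -> float:
    """How much the predicted speedup grows when the CPU count doubles."""
    return speedup(2 * ncpus, gain) / speedup(ncpus, gain)


ratio_1_to_2 = doubling_ratio(1)      # same as speedup(2)
ratio_32_to_64 = doubling_ratio(32)   # speedup(64) / speedup(32)

print(f"1 -> 2 cores:   x{ratio_1_to_2:.2f}")
print(f"32 -> 64 cores: x{ratio_32_to_64:.2f}")
print(f"relative difference: {ratio_32_to_64 / ratio_1_to_2 - 1.0:.0%}")
```

Because the formula is a straight line with a positive intercept (0.7 * N + 0.3), the doubling ratio approaches 2 as N grows. That property of the line is where the 17% figure comes from.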
-
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: SMP speed up
bob wrote:
And remember, with 64 cores, you have 7 data points that don't lie on a perfectly straight line. I simply chose a good approximation. Originally that formula worked for 1-2-4. Then we added 8 and it still fit well. And then 16 and 32. Eugene ran it on a 64-core Itanium, which was possibly a tainted result since it was a different architecture, but the 1-2-4-8-16-32-64 numbers still stayed around that line. It is not a perfect fit. But it is a good first approximation, which is all I have ever called it...
I'll repeat. I hope you didn't publish a paper with this kind of result, because that would be just a farce...
-
- Posts: 13447
- Joined: Wed Mar 08, 2006 9:02 pm
- Location: Dallas, Texas
- Full name: Matthew Hull
Re: SMP speed up
Milos wrote:
Just a reply from the previous thread
"More on Bob's formula...
speedup_2=1+0.7=1.7
speedup_32=1+31*0.7=22.7
speedup_64=1+63*0.7=45.1
speedup_64/speedup_32=1.99!!!
speedup_2=1.7
17% more gain when going from 32 to 64 than from 1 to 2 cores."
I'll let people draw their own conclusions.
And your tests confirm Bob is wrong?
Matthew Hull
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: SMP speed up
Milos wrote:
bob wrote:
And remember, with 64 cores, you have 7 data points that don't lie on a perfectly straight line. I simply chose a good approximation. Originally that formula worked for 1-2-4. Then we added 8 and it still fit well. And then 16 and 32. Eugene ran it on a 64-core Itanium, which was possibly a tainted result since it was a different architecture, but the 1-2-4-8-16-32-64 numbers still stayed around that line. It is not a perfect fit. But it is a good first approximation, which is all I have ever called it...
I'll repeat. I hope you didn't publish a paper with this kind of result, because that would be just a farce...
I wish you would offer some sort of supporting evidence for your arguments; otherwise you just look foolish.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: SMP speed up
mhull wrote:
Milos wrote:
Just a reply from the previous thread
"More on Bob's formula...
speedup_2=1+0.7=1.7
speedup_32=1+31*0.7=22.7
speedup_64=1+63*0.7=45.1
speedup_64/speedup_32=1.99!!!
speedup_2=1.7
17% more gain when going from 32 to 64 than from 1 to 2 cores."
I'll let people draw their own conclusions.
And your tests confirm Bob is wrong?
No, his tests confirm he is either dense, an ass, or a troll. Nothing more or less. He is not offering _any_ data or observations of any kind, just boorish nonsense...
-
- Posts: 13447
- Joined: Wed Mar 08, 2006 9:02 pm
- Location: Dallas, Texas
- Full name: Matthew Hull
Re: SMP speed up
bob wrote:
mhull wrote:
Milos wrote:
Just a reply from the previous thread
"More on Bob's formula...
speedup_2=1+0.7=1.7
speedup_32=1+31*0.7=22.7
speedup_64=1+63*0.7=45.1
speedup_64/speedup_32=1.99!!!
speedup_2=1.7
17% more gain when going from 32 to 64 than from 1 to 2 cores."
I'll let people draw their own conclusions.
And your tests confirm Bob is wrong?
No, his tests confirm he is either dense, an ass, or a troll. Nothing more or less. He is not offering _any_ data or observations of any kind, just boorish nonsense...
It was a dig, since he ran no tests.
Matthew Hull