There seems to be no agreement yet regarding how to measure SMP scalability in computer chess. I can think of three things that could be used to measure it.
NPS-wise
This seems to be the simplest and most obvious one. It just measures the CPU utilization of the algorithm used. The problem is that it doesn't take into account how much of the measured work is redundant, as in a shared hash table implementation.
Time to depth
This measures the capability of the SMP algorithm to search efficiently. It is a better measure than just NPS, but it is only really accurate if the parallel algorithm traverses the same search tree as the serial one. Modern chess engines don't: the SMP search explores a different tree, pruning more in some places and less in others, for example.
In some cases, to gain better performance, one can widen the search a bit when using SMP to compensate for excessive pruning/reduction. This improves performance because at higher depths there is not much payoff from searching still deeper; it is better to search wider and not miss critical variations.
Performance/Elo based
This seems to be the most logical measurement to me, since the SMP algorithm in modern chess engines is too complex to be captured by "time to depth" or NPS alone.
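To make the contrast between the first two measures concrete, here is a minimal sketch with made-up benchmark numbers, showing how NPS scaling and time-to-depth speedup diverge when a parallel search does redundant work:

```python
# Hypothetical benchmark numbers (illustration only): nodes per second and
# time-to-depth (seconds to reach a fixed depth) for each thread count.
nps = {1: 1.0e6, 2: 1.9e6, 4: 3.5e6, 8: 6.0e6}
ttd = {1: 120.0, 2: 70.0, 4: 45.0, 8: 32.0}

for n in sorted(nps):
    nps_scaling = nps[n] / nps[1]   # raw CPU utilisation
    ttd_speedup = ttd[1] / ttd[n]   # effective search speedup
    # The gap between the two is work wasted on redundant nodes.
    efficiency = ttd_speedup / nps_scaling
    print(f"{n} threads: NPS x{nps_scaling:.2f}, "
          f"TTD x{ttd_speedup:.2f}, search efficiency {efficiency:.0%}")
```

With these invented numbers the 8-thread run searches 6x the nodes but reaches the target depth only 3.75x faster, i.e. only about 63% of the extra nodes translate into useful work.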
Measure of SMP scalability
-
- Posts: 803
- Joined: Mon Jul 17, 2006 5:53 am
- Full name: Edsel Apostol
-
- Posts: 589
- Joined: Tue Jun 04, 2013 10:15 pm
Re: Measure of SMP scalability
Edsel Apostol wrote:There seems to be no agreement yet regarding how to measure SMP scalability in computer chess. I can think of 3 things that could be used to measure scalability.
NPS-wise
Time to depth
Performance/Elo based
This dilemma doesn't exist from the user's point of view.
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Measure of SMP scalability
NPS scaling - A good measurement to check that you are not losing speed to caching and synchronization problems, etc. Ideally it should be perfect, and in general it can be used as an upper bound on the expected time scaling.
Time scaling - The ratio of times taken to complete a given amount of work. It is usually assumed that work ~ f(depth), because usually you don't change search algorithms; otherwise work ~ f(depth, width, ..., stopping_time, tilt_of_the_moon, etc. You can roughly ensure the same work is being done by stopping once a certain number of nodes has been searched, so work ~ f(nodes) may be a better measure when the algorithms differ.
Elo based - A no-brainer.
There is no point debating this, because I am sure everyone knows how to measure things properly when confronted with a particular situation.
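The node-budget idea above can be sketched as follows: hold "work" fixed by stopping every configuration at the same node count, then compare the wall-clock times directly (the timings here are hypothetical, purely for illustration):

```python
# Hypothetical timings (seconds) for searching the same fixed node budget,
# so that "work" is held roughly constant even when the algorithms differ.
NODE_BUDGET = 100_000_000
seconds_for_budget = {1: 100.0, 2: 55.0, 4: 31.0, 8: 20.0}

for n in sorted(seconds_for_budget):
    speedup = seconds_for_budget[1] / seconds_for_budget[n]
    print(f"{n} threads: x{speedup:.2f} over {NODE_BUDGET:,} nodes")
```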
-
- Posts: 151
- Joined: Thu Nov 12, 2009 6:31 pm
Re: Measure of SMP scalability
Edsel Apostol wrote:Time to depth
This measures the capability of the SMP algorithm to search efficiently. This is a good measure, better than just NPS-wise, but it will only be most accurate if the algorithm traversed the same search tree as the serial one. In the case of modern chess engines, SMP algorithm doesn't search the same tree as the serial one, some prunes more, and some prunes less when in SMP for example.
What I like here is to use the following formula:
t(n) = W/n + C
where n is the number of threads, t(n) is the time spent, W is the parallelizable part of the work, and C is the serial (non-parallelizable) part.
Once you have measured a few t(n) for various number of n (threads), you can compute W and C, and get an idea of the scalability of your program. Indeed:
W + C = t(1)
C = t(infinity)
(W + C)/C = theoretical upper limit of the achievable acceleration.
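As a sketch (with invented timings), W and C can be recovered from a handful of measured t(n) values by an ordinary least-squares fit, since t(n) is linear in x = 1/n:

```python
# Invented timings t(n) in seconds; in reality these come from benchmarks.
measured = {1: 100.0, 2: 52.0, 4: 28.0, 8: 16.0}

# t(n) = W * (1/n) + C is linear in x = 1/n: fit slope W and intercept C.
ns = sorted(measured)
xs = [1.0 / n for n in ns]
ys = [measured[n] for n in ns]
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
W = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
     / sum((x - mean_x) ** 2 for x in xs))
C = mean_y - W * mean_x

max_speedup = (W + C) / C  # t(1) / t(infinity)
print(f"W = {W:.1f}, C = {C:.1f}, speedup limit = x{max_speedup:.1f}")
```

These particular numbers lie exactly on t(n) = 96/n + 4, so the fit returns W = 96, C = 4, and a theoretical speedup limit of 25x.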
Edsel Apostol wrote:Performance/Elo based
Elo performance is related to speed increase: doubling the speed is worth about 50 Elo. When dealing with a speed improvement, which is the only thing SMP provides, Elo measurement is not a necessary tool, IMHO.
Richard
-
- Posts: 5566
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Measure of SMP scalability
abulmo wrote:When dealing with speed improvement, which is the only thing SMP provides
Edsel's question essentially is: how do we measure the increase in "speed" due to SMP.
What is "speed"?
NPS is speed, but it does not tell everything.
Time-to-depth is speed, but it does not tell everything. See the "General Topics" forum where this has been discussed at too much length in two threads.
Time to a certain quality of search (basically Elo) is speed and obviously the most relevant speed (unless one is interested in a particular aspect of the SMP implementation, e.g. synchronisation issues are more directly related to NPS increase).
-
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: Measure of SMP scalability
abulmo wrote:What I like here is to use the following formula: t(n) = W/n + C. [...]
Unfortunately, it is not that simple, because Amdahl's law (which is what you imply) does not apply to chess engines. It assumes the same work is done by the parallel and the single-threaded program. We do not have that: the parallel program explores different branches, many times increasing the work and adding extra inefficiency, and other times finding "short cuts".
That is the reason why the data does not fit Amdahl's formula well.
Miguel
-
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: Measure of SMP scalability
michiguel wrote:Unfortunately, it is not that simple, because Amdahl's law (which is what you imply) does not apply to chess engines. [...] That is the reason why the data does not fit well the Amdahl's formula.
There was a discussion about it a while ago:
http://www.talkchess.com/forum/viewtopi ... 40&t=36082
There I made an attempt to model the behavior, but we did not follow that up.
Miguel
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Measure of SMP scalability
Would it help to define something like an "effective SMP scaling factor" as follows:
- Start with single-CPU version of a candidate engine, play games against a fixed set of opponents with a given time control and obtain an ELO rating.
- Repeat with N CPUs for the candidate engine but with unchanged number of CPUs for the opponents, and measure the ELO improvement X.
- Now find out how much faster the 1 CPU version would need to be for the same ELO improvement X. (You would usually give more time to the candidate engine than to its opponents to simulate that.) The speed factor you need to reach X is M.
- Then M/N would be the "effective SMP scaling factor" of the candidate engine for N CPUs.
Does that make sense to anyone?
Sven
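If one also accepts the rule of thumb quoted earlier in the thread (a speed doubling is worth roughly 50 Elo; the exact constant is engine- and time-control-dependent), the measured Elo gain X can be converted into an approximate equivalent speed factor M without replaying the time-handicap games. A rough sketch of that shortcut:

```python
ELO_PER_DOUBLING = 50.0  # rule of thumb from this thread; engine-dependent

def effective_smp_factor(elo_gain: float, n_cpus: int) -> float:
    """Effective SMP scaling factor M/N, with M = 2**(X / 50)
    derived from the 50-Elo-per-doubling assumption."""
    m = 2.0 ** (elo_gain / ELO_PER_DOUBLING)  # equivalent speed factor M
    return m / n_cpus

# e.g. +75 Elo measured with 4 CPUs: M = 2**1.5 ~ 2.83, factor ~ 0.71
print(effective_smp_factor(75.0, 4))
```

This only replaces the handicap-game step; the Elo gain X itself still has to come from real games.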
-
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: Measure of SMP scalability
Sven Schüle wrote:Would it help to define something like an "effective SMP scaling factor" as follows: [...] Then M/N would be the "effective SMP scaling factor" of the candidate engine.
Absolutely yes, that is the perfect way to express it. It could also be called the effective/equivalent/apparent speed-up.
The question of how much faster the 1x engine should be, must be answered with an extrapolation from at least two points, one faster and one slower than what is required.
Miguel
-
- Posts: 931
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: Measure of SMP scalability
michiguel wrote:The question of how much faster the 1x engine should be, must be answered with an extrapolation from at least two points, one faster and one slower than what is required.
The word you are looking for is "interpolation".
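That two-point procedure can be sketched as interpolating in log2 of the speed factor, the scale on which Elo is roughly linear; the handicap measurements below are invented purely for illustration:

```python
import math

# Invented handicap measurements: (speed factor given to the 1-CPU engine,
# resulting Elo gain), bracketing the target from below and above.
lo_factor, lo_elo = 2.0, 48.0
hi_factor, hi_elo = 4.0, 101.0
target_elo = 75.0  # Elo gain actually observed with N CPUs

# Interpolate linearly in log2(factor), where Elo is approximately linear.
frac = (target_elo - lo_elo) / (hi_elo - lo_elo)
log_m = (math.log2(lo_factor)
         + frac * (math.log2(hi_factor) - math.log2(lo_factor)))
M = 2.0 ** log_m  # equivalent speed factor, here roughly 2.8
print(f"M = {M:.2f}")
```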