Measure of SMP scalability

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Edsel Apostol
Posts: 803
Joined: Mon Jul 17, 2006 5:53 am
Full name: Edsel Apostol

Measure of SMP scalability

Post by Edsel Apostol »

There seems to be no agreement yet regarding how to measure SMP scalability in computer chess. I can think of 3 things that could be used to measure scalability.

NPS-wise
This seems to be the most simple and obvious one. It simply measures the CPU utilization of the algorithm used. The problem with this is that it doesn't take into account if the amount of work being measured is just redundant in the case of a shared hash table implementation.

Time to depth
This measures the capability of the SMP algorithm to search efficiently. This is a good measure, better than just NPS-wise, but it will only be most accurate if the algorithm traversed the same search tree as the serial one. In the case of modern chess engines, SMP algorithm doesn't search the same tree as the serial one, some prunes more, and some prunes less when in SMP for example.

In some cases, in order to gain better performance, one can widen the search a bit when using SMP to compensate for excessive pruning/reduction. This will improve performance for the reason that at higher depths there is not much compensation for searching deeper, it is better to search wider and not miss critical variations.

Performance/Elo based
This seems to be the most logical measurement to me since SMP algorithm in modern chess engines is more complex to be measured by just "Time to depth" and "NPS-wise".
mvk
Posts: 589
Joined: Tue Jun 04, 2013 10:15 pm

Re: Measure of SMP scalability

Post by mvk »

Edsel Apostol wrote:There seems to be no agreement yet regarding how to measure SMP scalability in computer chess. I can think of 3 things that could be used to measure scalability.

NPS-wise
Time to depth
Performance/Elo based
This dilemma doesn't exist from the user point of view.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Measure of SMP scalability

Post by Daniel Shawul »

Nps scaling - Is a good measurement to see if you are not loosing speed to caching and synchronization problems etc. Ideally it should be perfect and in general can be used as an upper bound on expected time scaling.

Time scaling - Ratios of times taken to complete a given work. It is usually assumed work~f(depth) because usually you don't change search algorithms. Otherwise work~=f(depth,width,....,stopping_time,tilt_of_the_moon etc..). You can roughly make sure same work is being done by stopping when certain number of nodes are searched hence work~=f(nodes) may be a better measure when algorithms differ.

Elo based - A no-brainer.

There is no point to debate this because I am sure every one knows how to measure things properly if confronted with a certain situation.
abulmo
Posts: 151
Joined: Thu Nov 12, 2009 6:31 pm

Re: Measure of SMP scalability

Post by abulmo »

Edsel Apostol wrote: Time to depth
This measures the capability of the SMP algorithm to search efficiently. This is a good measure, better than just NPS-wise, but it will only be most accurate if the algorithm traversed the same search tree as the serial one. In the case of modern chess engines, SMP algorithm doesn't search the same tree as the serial one, some prunes more, and some prunes less when in SMP for example.
.
What I like here is to use the following formula:
t(n) = W/n + C.
n is the number of threads;
t(n) the time spent
Once you have measured a few t(n) for various number of n (threads), you can compute W and C, and get an idea of the scalability of your program. Indeed:
W+C = t(1)
C = t(infinity)
C/(W+C) = theoretical upper limit of achievable acceleration.
Edsel Apostol wrote: Performance/Elo based
Elo performance is related to speed increase, so that doubling the speed is about a 50 Elo improvement. When dealing with speed improvement, which is the only thing SMP provides, Elo improvement is not a necessary tool, IMHO.
Richard
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: Measure of SMP scalability

Post by syzygy »

abulmo wrote:When dealing with speed improvement, which is the only thing SMP provides
Edsel's question essentially is: how do we measure increase in "speed" due to SMP.

What is "speed"?

NPS is speed, but it does not tell everything.

Time-to-depth is speed, but it does not tell everything. See the "General Topics" forum where this has been discussed at too much length in two threads.

Time to a certain quality of search (basically Elo) is speed and obviously the most relevant speed (unless one is interested in a particular aspect of the SMP implementation, e.g. synchronisation issues are more directly related to NPS increase).
User avatar
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Measure of SMP scalability

Post by michiguel »

abulmo wrote:
Edsel Apostol wrote: Time to depth
This measures the capability of the SMP algorithm to search efficiently. This is a good measure, better than just NPS-wise, but it will only be most accurate if the algorithm traversed the same search tree as the serial one. In the case of modern chess engines, SMP algorithm doesn't search the same tree as the serial one, some prunes more, and some prunes less when in SMP for example.
.
What I like here is to use the following formula:
t(n) = W/n + C.
n is the number of threads;
t(n) the time spent
Once you have measured a few t(n) for various number of n (threads), you can compute W and C, and get an idea of the scalability of your program. Indeed:
W+C = t(1)
C = t(infinity)
C/(W+C) = theoretical upper limit of achievable acceleration.
Unfortunately, it is not that simple, because the Amdahl law (which is what you imply) does not apply to chess engines. That assumes the same work to be done for the parallel and single thread programs. We do not have that. The parallel program explores different branches, many times increasing the work adding extra inefficiency, and other times finding "short cuts".

That is the reason why the data does not fit well the Amdahl's formula.

Miguel
Edsel Apostol wrote: Performance/Elo based
Elo performance is related to speed increase, so that doubling the speed is about a 50 Elo improvement. When dealing with speed improvement, which is the only thing SMP provides, Elo improvement is not a necessary tool, IMHO.
User avatar
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Measure of SMP scalability

Post by michiguel »

michiguel wrote:
abulmo wrote:
Edsel Apostol wrote: Time to depth
This measures the capability of the SMP algorithm to search efficiently. This is a good measure, better than just NPS-wise, but it will only be most accurate if the algorithm traversed the same search tree as the serial one. In the case of modern chess engines, SMP algorithm doesn't search the same tree as the serial one, some prunes more, and some prunes less when in SMP for example.
.
What I like here is to use the following formula:
t(n) = W/n + C.
n is the number of threads;
t(n) the time spent
Once you have measured a few t(n) for various number of n (threads), you can compute W and C, and get an idea of the scalability of your program. Indeed:
W+C = t(1)
C = t(infinity)
C/(W+C) = theoretical upper limit of achievable acceleration.
Unfortunately, it is not that simple, because the Amdahl law (which is what you imply) does not apply to chess engines. That assumes the same work to be done for the parallel and single thread programs. We do not have that. The parallel program explores different branches, many times increasing the work adding extra inefficiency, and other times finding "short cuts".

That is the reason why the data does not fit well the Amdahl's formula.

Miguel
Edsel Apostol wrote: Performance/Elo based
Elo performance is related to speed increase, so that doubling the speed is about a 50 Elo improvement. When dealing with speed improvement, which is the only thing SMP provides, Elo improvement is not a necessary tool, IMHO.
There was a discussion about it a while ago:
http://www.talkchess.com/forum/viewtopi ... 40&t=36082

There I made an attempt to model the behavior, but we did not follow that up.

Miguel
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Measure of SMP scalability

Post by Sven »

Would it help to define something like an "effective SMP scaling factor" as follows:

- Start with single-CPU version of a candidate engine, play games against a fixed set of opponents with a given time control and obtain an ELO rating.
- Repeat with N CPUs for the candidate engine but with unchanged number of CPUs for the opponents, and measure the ELO improvement X.
- Now find out how much faster the 1 CPU version would need to be for the same ELO improvement X. (You would usually give more time to the candidate engine than to its opponents to simulate that.) The speed factor you need to reach X is M.
- Then M/N would be the "effective SMP scaling factor" of the candidate engine for N CPUs.

Does that make sense to anyone?

Sven
User avatar
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Measure of SMP scalability

Post by michiguel »

Sven Schüle wrote:Would it help to define something like an "effective SMP scaling factor" as follows:

- Start with single-CPU version of a candidate engine, play games against a fixed set of opponents with a given time control and obtain an ELO rating.
- Repeat with N CPUs for the candidate engine but with unchanged number of CPUs for the opponents, and measure the ELO improvement X.
- Now find out how much faster the 1 CPU version would need to be for the same ELO improvement X. (You would usually give more time to the candidate engine than to its opponents to simulate that.) The speed factor you need to reach X is M.
- Then M/N would be the "effective SMP scaling factor" of the candidate engine.

Does that make sense to anyone?

Sven
Absolutely Yes, that is the perfect way to express it. It could also be either Effective/Equivalent/Apparent Speed Up.

The question of how much faster the 1x engine should be, must be answered with an extrapolation from at least two points, one faster and one slower than what is required.

Miguel

Miguel
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Measure of SMP scalability

Post by AlvaroBegue »

michiguel wrote: The question of how much faster the 1x engine should be, must be answered with an extrapolation from at least two points, one faster and one slower than what is required.
The word you are looking for is "interpolation".