threading

flok

threading

Post by flok » Thu Mar 03, 2016 10:28 pm

I wondered why adding more threads/cores to my chess program gave a speedup so far from linear.
So I added a thread that samples, every 250 ms, the number of idle thread-slots (on a 6-core + HT system) and the amount of CPU time (user + sys) used in that slice.
In a nice graph:

[Image: graph of idle thread-slots and CPU time (user + sys) per 250 ms sample]

x-axis is sample number, starting 250ms after the search started
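
A sampler along the lines described above might look like the following. This is only a sketch, not the code from my engine: `g_idle_threads`, `Sample`, `process_cpu_seconds` and `run_sampler` are made-up names, and it assumes a POSIX system where `getrusage` is available and that every search thread bumps the idle counter itself when it starts and stops waiting for work.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>
#include <sys/resource.h>  // getrusage (POSIX)

// Hypothetical counter: a search thread increments it when it starts
// waiting for work and decrements it when it picks up work again.
std::atomic<int> g_idle_threads{0};

struct Sample {
    int idle_threads;    // idle thread-slots at the instant of the sample
    double cpu_seconds;  // user + sys CPU time the process used in this slice
};

// Total CPU time (user + sys) consumed by the whole process so far, in seconds.
double process_cpu_seconds() {
    struct rusage ru{};
    getrusage(RUSAGE_SELF, &ru);
    auto to_sec = [](const timeval& tv) { return tv.tv_sec + tv.tv_usec / 1e6; };
    return to_sec(ru.ru_utime) + to_sec(ru.ru_stime);
}

// Take `count` samples, one per `interval`.
std::vector<Sample> run_sampler(int count, std::chrono::milliseconds interval) {
    std::vector<Sample> samples;
    double prev = process_cpu_seconds();
    for (int i = 0; i < count; ++i) {
        std::this_thread::sleep_for(interval);
        double now = process_cpu_seconds();
        samples.push_back({g_idle_threads.load(std::memory_order_relaxed), now - prev});
        prev = now;
    }
    return samples;
}
```

In an engine you would run `run_sampler` on its own `std::thread` next to the search, e.g. with `std::chrono::milliseconds(250)`. Keep in mind that the idle count is an instantaneous snapshot: it says how many threads were idle at the sample point, not for how long they had been idle.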

So yes, there are moments where one or more threads are idle, but they never last longer than the 250 ms interval (well, maybe 499 ms).
System overhead is also very low.
Conclusion: it must be a locking issue.
Next step: run it through mutrace and see which locks are holding things back.

jdart
Posts: 3956
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: threading

Post by jdart » Fri Mar 04, 2016 12:23 am

250ms is actually a long time to have a thread idle. So I would not conclude your problem is lock performance. It sounds more like an algorithm issue.

I have found Oprofile (http://oprofile.sourceforge.net) helpful for measuring performance bottlenecks. Intel Parallel Studio is also very good but can be complex to use/understand.

--Jon

Joost Buijs
Posts: 1074
Joined: Thu Jul 16, 2009 8:47 am
Location: Almere, The Netherlands

Re: threading

Post by Joost Buijs » Fri Mar 04, 2016 4:13 pm

I've never timed how long threads can sit idle in my program. 250 ms seems long, but it can happen in YBW (Young Brothers Wait) without a helpful master, when a master is done and sits waiting for its slaves to finish.

The bad speedup you see can also be caused by different threads poking into the same cache line; I ran into this when I first started with SMP.
You have to make sure that the data structures for each thread are at least one cache line (64 bytes on Intel i7) apart from each other.
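
One way to get that separation in C++11 or later is `alignas`. The sketch below is only an illustration: `ThreadData`, its fields, `kCacheLine` and `g_threads` are invented names, and the 64-byte line size is an assumption that matches the i7 mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size; 64 bytes matches the Intel i7 mentioned above.
constexpr std::size_t kCacheLine = 64;

// Hypothetical per-thread search state. alignas raises the alignment to
// kCacheLine, and the compiler pads the size up to a multiple of it, so
// neighbouring array elements never share a cache line.
struct alignas(kCacheLine) ThreadData {
    std::uint64_t nodes = 0;    // nodes searched by this thread
    std::uint64_t tt_hits = 0;  // illustrative second counter
    bool idle = false;          // illustrative idle flag
};

static_assert(alignof(ThreadData) == kCacheLine, "must start on a cache-line boundary");
static_assert(sizeof(ThreadData) % kCacheLine == 0, "size must be a multiple of the cache line");

// One slot per hardware thread, e.g. 6 cores + HT.
ThreadData g_threads[12];
```

Note that the structs have to start on a cache-line boundary as well as be 64 bytes in size; `alignas` takes care of both for static and stack objects. For heap allocations (`new`, `std::vector`) the over-alignment is only honored from C++17 on.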

flok

Re: threading

Post by flok » Sun Mar 06, 2016 9:22 pm

Well, a thread may well be sitting idle for a full 250 ms; with a 250 ms sample rate we simply can't tell.
I'll give 100 ms a try.

jdart

Re: threading

Post by jdart » Mon Mar 07, 2016 12:40 am

Tools like Oprofile can use monitoring features built into the CPU. This is much more efficient and accurate than your sampling method.

--Jon

mar
Posts: 2122
Joined: Fri Nov 26, 2010 1:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: threading

Post by mar » Wed Mar 09, 2016 1:04 pm

Joost Buijs wrote:
"The bad speedup you see can also be caused by different threads poking into the same cache-line, this is something I experienced in the past when I first started with SMP.
You have to make sure that the data structures for each thread are separated at least 1 cache-line (64 bytes on Intel i7) apart from each other."

Yes, this is called false sharing and it can totally kill performance,
but the issue doesn't arise if n threads are only reading the same block of memory.
Writes are of course problematic because each write invalidates the copies of that cache line held by the other cores (assuming per-core caches).
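
A rough way to see the effect of writes is to let each thread hammer its own counter, once with the counters packed next to each other and once with each counter on its own cache line. This is just a sketch: `PackedCounter`, `PaddedCounter`, `kThreads` and `time_writers` are made-up names and the 64-byte line size is an assumption. On typical multicore hardware the packed variant tends to be clearly slower, but the actual numbers depend on the machine, so none are claimed here.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>
#include <vector>

constexpr int kThreads = 4;  // illustrative thread count

// 8 bytes each: several counters end up in the same 64-byte cache line.
struct PackedCounter { std::atomic<std::uint64_t> value{0}; };
// Padded to an assumed 64-byte line: one counter per cache line.
struct alignas(64) PaddedCounter { std::atomic<std::uint64_t> value{0}; };

// Each thread increments only its own counter `iters` times.
// Returns the elapsed wall time in milliseconds; `total` gets the sum of all counters.
template <typename Counter>
double time_writers(std::uint64_t iters, std::uint64_t& total) {
    Counter counters[kThreads];
    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> workers;
    for (int t = 0; t < kThreads; ++t) {
        workers.emplace_back([&counters, t, iters] {
            for (std::uint64_t i = 0; i < iters; ++i)
                counters[t].value.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& w : workers) w.join();
    std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    total = 0;
    for (const auto& c : counters) total += c.value.load();
    return elapsed.count();
}
```

Comparing `time_writers<PackedCounter>(n, total)` with `time_writers<PaddedCounter>(n, total)` for a large `n` gives a feel for the cost. No thread ever touches another thread's counter, so any slowdown in the packed case comes purely from the shared cache line.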
