Measuring SF idle time

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Measuring SF idle time

Post by zullil »

Since my last post seems to have generated mostly contention and confusion :wink:

It seems obvious to me that time a search thread spends idling (ie, waiting to search) is time we'd like to minimize. So let's avoid nodes-per-second, time-to-depth, elo (just for now!) and focus on a direct numerical measurement of idleness. As far as I know, SF does not calculate or report such a number.

Maybe a UCI option (Report Idle Time, default false) might help right now? Just a thought.

[EDIT] As a percentage of total time, obviously
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Measuring SF idle time

Post by zullil »

zullil wrote:Since my last post seems to have generated mostly contention and confusion :wink:

It seems obvious to me that time a search thread spends idling (ie, waiting to search) is time we'd like to minimize. So let's avoid nodes-per-second, time-to-depth, elo (just for now!) and focus on a direct numerical measurement of idleness. As far as I know, SF does not calculate or report such a number.

Maybe a UCI option (Report Idle Time, default false) might help right now? Just a thought.

[EDIT] As a percentage of total time, obviously
Been following a thread (started by Gary Linscott at FishCooking) about profiling the SF Threads mutex, using a utility called mutrace. Here's output, which I'm still trying to process:

Code: Select all

mutrace ./stockfish bench 16384 16 30000 default time

===========================
Total time (ms) : 1110019
Nodes searched  : 5514865805
Nodes/second    : 4968262

mutrace: Showing statistics for process stockfish (pid 22717).
mutrace: 147 mutexes used.

Mutex #9 (0x0x754f80) first referenced by:
	/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7f5a09ec94b2]
	./stockfish() [0x410f22]

Mutex #48 (0x0x289e2a8) first referenced by:
	/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7f5a09ec94b2]
	./stockfish(_ZN6ThreadC2Ev+0x43) [0x42f7b3]

Mutex #84 (0x0x245a7d8) first referenced by:
	/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7f5a09ec94b2]
	./stockfish(_ZN6ThreadC2Ev+0x43) [0x42f7b3]

Mutex #67 (0x0x267c5d8) first referenced by:
	/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7f5a09ec94b2]
	./stockfish(_ZN6ThreadC2Ev+0x43) [0x42f7b3]

Mutex #104 (0x0x2238930) first referenced by:
	/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7f5a09ec94b2]
	./stockfish(_ZN6ThreadC2Ev+0x43) [0x42f7b3]

Mutex #30 (0x0x2abffa8) first referenced by:
	/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7f5a09ec94b2]
	./stockfish(_ZN6ThreadC2Ev+0x43) [0x42f7b3]

Mutex #69 (0x0x267c670) first referenced by:
	/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7f5a09ec94b2]
	./stockfish(_ZN6ThreadC2Ev+0x43) [0x42f7b3]

Mutex #105 (0x0x22389c8) first referenced by:
	/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7f5a09ec94b2]
	./stockfish(_ZN6ThreadC2Ev+0x43) [0x42f7b3]

Mutex #11 (0x0x2ce1ca8) first referenced by:
	/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7f5a09ec94b2]
	./stockfish(_ZN6ThreadC2Ev+0x43) [0x42f7b3]

Mutex #85 (0x0x245a870) first referenced by:
	/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7f5a09ec94b2]
	./stockfish(_ZN6ThreadC2Ev+0x43) [0x42f7b3]

mutrace: Showing 10 most contended mutexes:

 Mutex #   Locked  Changed    Cont. tot.Time[ms] avg.Time[ms] max.Time[ms]  Flags
       9 74284822 47886994 32863145   470371.620        0.006        3.052 Mx.--.
      48 14684094  4878136  1795818    45965.834        0.003        0.859 Mx.--.
      84 14385996  4839990  1769405    45565.200        0.003        0.777 Mx.--.
      67 13786399  4639289  1701291    43363.932        0.003        0.677 Mx.--.
     104 11304773  3854256  1388127    35428.958        0.003        0.347 Mx.--.
      30 14474828  3355148  1212946    41350.212        0.003        0.646 Mx.--.
      69 11466150  3102239  1105727    33890.025        0.003        0.757 Mx.--.
     105 11426777  3128527  1104254    33842.007        0.003        1.032 Mx.--.
      11 14700876  3151306  1096918    36025.554        0.002        0.806 Mx.--.
      85 11460532  3057781  1089505    33853.965        0.003        0.386 Mx.--.
     ...      ...      ...      ...          ...          ...          ... ||||||
                                                                           /|||||
          Object:                                     M = Mutex, W = RWLock /||||
           State:                                 x = dead, ! = inconsistent /|||
             Use:                                 R = used in realtime thread /||
      Mutex Type:                 r = RECURSIVE, e = ERRRORCHECK, a = ADAPTIVE /|
  Mutex Protocol:                                      i = INHERIT, p = PROTECT /
     RWLock Kind: r = PREFER_READER, w = PREFER_WRITER, W = PREFER_WRITER_NONREC 

mutrace: Note that the flags column R is only valid in --track-rt mode!

mutrace: Total runtime is 1117053.774 ms.
If I'm reading this correctly, mutex #9 (Threads.mutex) was contended for 32863145 times, which ate up 470371.620 ms in a bench run that took 1110019 ms total---about 40%?

Seems like a possible contributor to SF's 16-core problems?

The process being profiled was SF's standard 37 position bench, with 16GB hash and 16 threads, each position searched for 30 seconds. Feedback appreciated, since I'm pretty ignorant here.:wink:
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: Measuring SF idle time

Post by gladius »

zullil wrote:If I'm reading this correctly, mutex #9 (Threads.mutex) was contended for 32863145 times, which ate up 470371.620 ms in a bench run that took 1110019 ms total---about 40%?

Seems like a possible contributor to SF's 16-core problems?

The process being profiled was SF's standard 37 position bench, with 16GB hash and 16 threads, each position searched for 30 seconds. Feedback appreciated, since I'm pretty ignorant here.:wink:
It was contended for 470 seconds, yes, but it's across 16 different threads. So each thread probably averaged 29.5 seconds or so waiting on that mutex. Still a lot!

I just put together a patch that removes the global mutex https://github.com/glinscott/Stockfish/ ... x?expand=1. If you could give it a shot on your 16 core machine, I'd be very interested to see if it goes faster or not.
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Measuring SF idle time

Post by zullil »

gladius wrote:
zullil wrote:If I'm reading this correctly, mutex #9 (Threads.mutex) was contended for 32863145 times, which ate up 470371.620 ms in a bench run that took 1110019 ms total---about 40%?

Seems like a possible contributor to SF's 16-core problems?

The process being profiled was SF's standard 37 position bench, with 16GB hash and 16 threads, each position searched for 30 seconds. Feedback appreciated, since I'm pretty ignorant here.:wink:
It was contended for 470 seconds, yes, but it's across 16 different threads. So each thread probably averaged 29.5 seconds or so waiting on that mutex. Still a lot!

I just put together a patch that removes the global mutex https://github.com/glinscott/Stockfish/ ... x?expand=1. If you could give it a shot on your 16 core machine, I'd be very interested to see if it goes faster or not.

Is it this: Stockfish-bda52041304218216f96a9f4ddc0f4ad7bbb8ca9

If so, it compiled but seems to hang about halfway through the following:

Code: Select all

./stockfish bench 1024 16 2000 default time
Can't do more til later. Thanks.
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: Measuring SF idle time

Post by gladius »

zullil wrote:
gladius wrote:
zullil wrote:If I'm reading this correctly, mutex #9 (Threads.mutex) was contended for 32863145 times, which ate up 470371.620 ms in a bench run that took 1110019 ms total---about 40%?

Seems like a possible contributor to SF's 16-core problems?

The process being profiled was SF's standard 37 position bench, with 16GB hash and 16 threads, each position searched for 30 seconds. Feedback appreciated, since I'm pretty ignorant here.:wink:
It was contended for 470 seconds, yes, but it's across 16 different threads. So each thread probably averaged 29.5 seconds or so waiting on that mutex. Still a lot!

I just put together a patch that removes the global mutex https://github.com/glinscott/Stockfish/ ... x?expand=1. If you could give it a shot on your 16 core machine, I'd be very interested to see if it goes faster or not.

Is it this: Stockfish-bda52041304218216f96a9f4ddc0f4ad7bbb8ca9

If so, it compiled but seems to hang about halfway through the following:

Code: Select all

./stockfish bench 1024 16 2000 default time
Can't do more til later. Thanks.
Doh, thanks for giving it a try though! Sounds like it's hitting some deadlock. The joys of multi-threaded programming.