So far, it appears to me that there is little performance difference between condition variables and semaphores, at least where the semaphore is not shared among processes running on different machines.
Semaphore usage means no missed signals and no spurious wake-ups, big advantages in my opinion.
Note: Unnamed semaphores might be even faster than named semaphores, but on Mac OS X they have been deprecated for at least ten or so years (they remain part of POSIX and work on Linux). Of course, with named semaphores, the program may need to generate unique names. That's easily done; see the Semaphore constructor I posted earlier.
Thread synchronization questions for experts
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Performance data point
Using a semaphore, the average elapsed time for a complete enqueue / post / wait / dequeue cycle is about 3.46 microseconds (ca. 289 kHz) when running on my 2006 Mac Pro. I'd guess that most of this is spent on allocation and deallocation operations, which could be avoided depending upon how the queue is realized.
Code: Select all
[2015-04-21 22:28:20.113] < test
[2015-04-21 22:28:20.124] WriterTask created
[2015-04-21 22:28:20.124] WriterTask::JobLoop begun
[2015-04-21 22:28:23.579] WriterTask::JobLoop eventcount: 1000001
[2015-04-21 22:28:23.579] WriterTask::JobLoop ended
[2015-04-21 22:28:23.579] WriterTask destroyed
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Re: Performance data point
The same, now running under Linux on an old, cheap notebook with a single-core, two-hyperthread Intel Atom:
Average cycle time: 802 nanoseconds (ca. 1.25 MHz).
Code: Select all
[2015-04-22 01:28:23.845] < test
[2015-04-22 01:28:23.848] WriterTask created
[2015-04-22 01:28:23.848] WriterTask::JobLoop begun
[2015-04-22 01:28:24.650] WriterTask::JobLoop eventcount: 1000001
[2015-04-22 01:28:24.650] WriterTask::JobLoop ended
[2015-04-22 01:28:24.650] WriterTask destroyed
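As a cross-check on the two data points, the averages can be recomputed from the log timestamps: the elapsed time between "JobLoop begun" and "JobLoop ended" divided by the event count. This is only a small arithmetic sketch; the function names are made up for illustration and are not part of Symbolic.

```cpp
#include <cassert>
#include <cstdio>

// Average time per cycle in nanoseconds, from total elapsed seconds and event count.
double AverageCycleNs(double elapsedSeconds, unsigned long eventCount)
{
    assert(eventCount > 0);
    return (elapsedSeconds / static_cast<double>(eventCount)) * 1.0e9;
}

// Cycle rate in hertz: events per second of elapsed time.
double CycleRateHz(double elapsedSeconds, unsigned long eventCount)
{
    assert(elapsedSeconds > 0.0);
    return static_cast<double>(eventCount) / elapsedSeconds;
}

// Print one data point as "label: N ns per cycle, M kHz".
void PrintCycleStats(const char *label, double elapsedSeconds, unsigned long eventCount)
{
    std::printf("%s: %.0f ns per cycle, %.0f kHz\n", label,
                AverageCycleNs(elapsedSeconds, eventCount),
                CycleRateHz(elapsedSeconds, eventCount) / 1.0e3);
}
```

With the logs above, the Mac Pro run is 3.455 seconds for 1,000,001 events and the Atom run is 0.802 seconds for the same count, so PrintCycleStats("Atom", 0.802, 1000001) reports a rate a little under 1250 kHz.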
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Condition variables vs semaphores
sje wrote: Semaphore usage means no missed signals and no spurious wake-ups, big advantages in my opinion.
I don't see what the problem could be. It is trivial to emulate semaphores using condition variables. The opposite appears to be much harder.
http://www.google.be/url?sa=t&rct=j&q=& ... 1109,d.bGQ
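To make the "trivial to emulate" claim concrete, here is a minimal sketch of a counting semaphore built from a mutex and a condition variable. It uses the C++11 standard library rather than pthreads directly, and the class name CvSemaphore and its TryWait() method are invented for this example. The predicate passed to wait() is what absorbs spurious wake-ups, and the counter is what prevents missed signals.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Counting semaphore emulated with a mutex, a condition variable and a counter.
class CvSemaphore
{
public:
    explicit CvSemaphore(unsigned int initial = 0) : count(initial) {}

    // Increment the count and wake one waiter. Notifying while the lock is
    // held keeps it safe to destroy the object right after Wait() returns.
    void Post()
    {
        std::lock_guard<std::mutex> lock(mtx);
        ++count;
        cv.notify_one();
    }

    // Block until the count is positive, then decrement it. The predicate is
    // re-checked after every wake-up, so spurious wake-ups are harmless.
    void Wait()
    {
        std::unique_lock<std::mutex> lock(mtx);
        cv.wait(lock, [this] { return count > 0; });
        --count;
    }

    // Non-blocking variant: returns false instead of waiting.
    bool TryWait()
    {
        std::lock_guard<std::mutex> lock(mtx);
        if (count == 0)
            return false;
        --count;
        return true;
    }

private:
    std::mutex mtx;
    std::condition_variable cv;
    unsigned int count;
};

// Single-threaded self-check: posts are remembered even if nobody is waiting yet.
void CvSemaphoreSelfCheck()
{
    CvSemaphore sem;
    sem.Post();
    sem.Post();
    sem.Wait();
    assert(sem.TryWait());
    assert(!sem.TryWait());
}
```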
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Implementing WriterTask
I'm now finishing up the implementation and integration of the semaphore driven version of WriterTask into Symbolic.
One change I did have to make was to add a command to WriterTask for data synchronization. This was needed to allow the calling thread to block until any pending output request had completed. This command is used in only one place in the code: just prior to printing a command prompt while waiting for interactive input.
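One way such a synchronization command could work is sketched below. This is not the Symbolic source: WriterSketch, Request, Cmd and the small Sem stand-in are hypothetical names, Sem is built on a condition variable only to keep the example portable, and output goes to a vector instead of a stream so the effect can be inspected. The idea is that the queue is FIFO, so when the writer thread reaches the Sync request every earlier request has already been handled.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Stand-in counting semaphore (hypothetical; a POSIX semaphore wrapper would also do).
class Sem
{
public:
    void Post() { std::lock_guard<std::mutex> g(m); ++n; cv.notify_one(); }
    void Wait() { std::unique_lock<std::mutex> g(m); cv.wait(g, [this] { return n > 0; }); --n; }
private:
    std::mutex m;
    std::condition_variable cv;
    unsigned int n = 0;
};

enum class Cmd { Write, Sync, Quit };

struct Request
{
    Cmd cmd;
    std::string text;  // payload for Write
    Sem *done;         // for Sync: posted when the writer reaches this request
};

class WriterSketch
{
public:
    WriterSketch() : worker(&WriterSketch::JobLoop, this) {}
    ~WriterSketch() { Send({Cmd::Quit, "", nullptr}); worker.join(); }

    void Write(const std::string &text) { Send({Cmd::Write, text, nullptr}); }

    // Block the caller until every request queued before this call has completed.
    void Sync()
    {
        Sem done;
        Send({Cmd::Sync, "", &done});
        done.Wait();
    }

    // Copy of everything written so far.
    std::vector<std::string> Output()
    {
        std::lock_guard<std::mutex> g(outMutex);
        return output;
    }

private:
    void Send(Request r)
    {
        {
            std::lock_guard<std::mutex> g(queueMutex);
            queue.push_back(std::move(r));
        }
        pending.Post();  // one post per queued request
    }

    void JobLoop()
    {
        for (;;)
        {
            pending.Wait();
            Request r;
            {
                std::lock_guard<std::mutex> g(queueMutex);
                r = std::move(queue.front());
                queue.pop_front();
            }
            if (r.cmd == Cmd::Quit)
                break;
            if (r.cmd == Cmd::Write)
            {
                std::lock_guard<std::mutex> g(outMutex);
                output.push_back(r.text);
            }
            else
            {
                assert(r.done != nullptr);
                r.done->Post();  // release the thread blocked in Sync()
            }
        }
    }

    std::deque<Request> queue;
    std::mutex queueMutex;
    Sem pending;
    std::vector<std::string> output;
    std::mutex outMutex;
    std::thread worker;  // declared last so the other members exist before it starts
};
```

A caller would issue its Write() calls, then call Sync() just before printing the prompt.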
Next up is the rewrite of the multithreaded perft() code. This will use two queues, each with its own mutex and semaphore. The first will be used by the master thread to send requests to the worker threads, and the second by the worker threads to post results to be read by the master thread.
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Better handling of sem_wait()
A call to sem_wait() can fail with errno set to EINTR if a signal is delivered to the calling thread during the wait. Therefore, each call to sem_wait() should be wrapped in a retry loop.
Revised code:
Code: Select all
void Semaphore::Wait(void)
{
  int rc;

  // Retry when the wait was merely interrupted by a signal
  do
  {
    rc = sem_wait((sem_t *) vsptr);
  } while (rc && (errno == EINTR));
  if (rc)
    Die("Semaphore::Wait", "sem_wait");
}
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Another timing data point
I've finished the first version of the new multithreaded perft() which uses semaphores to assist with data and control synchronization. It works, although the code still needs some beautification.
The implementation uses a pair of channels with each channel having one queue, one mutex, and one semaphore. The first channel handles passing commands from the controlling master to the worker threads and the second channel handles passing results from the worker threads to the controlling master.
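The channel pair described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual perft() code: Channel, Command, Work and RunMaster are hypothetical names, Work() is a placeholder for real work such as counting a subtree, and Sem is a portable stand-in for a POSIX semaphore wrapper.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in counting semaphore (hypothetical).
class Sem
{
public:
    void Post() { std::lock_guard<std::mutex> g(m); ++n; cv.notify_one(); }
    void Wait() { std::unique_lock<std::mutex> g(m); cv.wait(g, [this] { return n > 0; }); --n; }
private:
    std::mutex m;
    std::condition_variable cv;
    unsigned int n = 0;
};

// One channel: one queue, one mutex, one semaphore.
template <typename T>
class Channel
{
public:
    void Put(T item)
    {
        {
            std::lock_guard<std::mutex> g(mtx);
            queue.push_back(std::move(item));
        }
        sem.Post();  // semaphore count tracks the number of queued items
    }

    T Get()
    {
        sem.Wait();  // blocks until at least one item is queued
        std::lock_guard<std::mutex> g(mtx);
        T item = std::move(queue.front());
        queue.pop_front();
        return item;
    }

private:
    std::deque<T> queue;
    std::mutex mtx;
    Sem sem;
};

struct Command
{
    bool quit;
    int value;
};

// Placeholder for real work (for example, a perft() subtree count).
long Work(int value) { return static_cast<long>(value) * value; }

// The master sends jobs 1..jobCount on one channel and sums results from the other.
long RunMaster(unsigned int workerCount, int jobCount)
{
    assert(workerCount >= 1);
    Channel<Command> commands;  // master -> workers
    Channel<long> results;      // workers -> master
    std::vector<std::thread> workers;

    for (unsigned int i = 0; i < workerCount; i++)
        workers.emplace_back([&commands, &results] {
            for (;;)
            {
                const Command c = commands.Get();
                if (c.quit)
                    break;
                results.Put(Work(c.value));
            }
        });

    for (int j = 1; j <= jobCount; j++)
        commands.Put({false, j});
    long total = 0;
    for (int j = 0; j < jobCount; j++)
        total += results.Get();

    // One quit command per worker, then wait for all of them to finish.
    for (unsigned int i = 0; i < workerCount; i++)
        commands.Put({true, 0});
    for (auto &w : workers)
        w.join();
    return total;
}
```

Results arrive in whatever order the workers finish, which is fine here because the master only sums them.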
The number of worker threads is set dynamically from one up to 256 with the default being the count of hardware cores or hyperthreads. Tests show that the per-worker thread overhead time for setup/shutdown is about 150 microseconds on my 2006 2.67 GHz quad core Mac Pro.
Because the overhead time is fairly small, little is saved by creating the thread set once at program start rather than creating thread sets as needed. I prefer the latter approach because I find it useful for the program to be able to change the distribution thread count as needed rather than having it hardwired to the number of cores.
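For anyone wanting to reproduce the overhead figure on their own hardware, a rough sketch follows. It uses std::thread rather than whatever thread wrapper Symbolic uses, the function names are hypothetical, and the threads do nothing, so it measures only creation and join cost.

```cpp
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

// Keep a requested worker count inside the range 1..256 described above.
unsigned int ClampWorkerCount(unsigned int requested)
{
    if (requested < 1)
        return 1;
    if (requested > 256)
        return 256;
    return requested;
}

// Default: the hardware thread count. hardware_concurrency() may return 0
// when the value is unknown, which the clamp turns into 1.
unsigned int DefaultWorkerCount()
{
    return ClampWorkerCount(std::thread::hardware_concurrency());
}

// Average wall-clock microseconds to create and join one do-nothing thread.
double MeasureThreadOverheadUs(unsigned int threadCount)
{
    assert(threadCount >= 1);
    const auto start = std::chrono::steady_clock::now();

    std::vector<std::thread> threads;
    threads.reserve(threadCount);
    for (unsigned int i = 0; i < threadCount; i++)
        threads.emplace_back([] {});
    for (auto &t : threads)
        t.join();

    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / threadCount;
}
```

Calling MeasureThreadOverheadUs(DefaultWorkerCount()) a few times and taking the median should give a number comparable in kind to the one quoted, though the result depends heavily on the operating system and machine.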
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Revised Semaphore() code
The mktemp() routine has been deprecated. The best replacement is to roll your own unique string generator.
Code: Select all
Semaphore::Semaphore(void)
{
  name = "Semaphore." + UniqueString();
  vsptr = (void *) sem_open(name.c_str(), (O_CREAT | O_EXCL), (S_IRUSR | S_IWUSR), 0);
  // sem_open() reports failure by returning SEM_FAILED, not a null pointer
  if (vsptr == (void *) SEM_FAILED)
    Die("Semaphore::Semaphore", "sem_open");
}

Semaphore::~Semaphore(void)
{
  if (sem_close((sem_t *) vsptr))
    Die("Semaphore::~Semaphore", "sem_close");
  // Remove the name so that it does not outlive the process
  if (sem_unlink(name.c_str()))
    Die("Semaphore::~Semaphore", "sem_unlink");
}

void Semaphore::Post(void)
{
  if (sem_post((sem_t *) vsptr))
    Die("Semaphore::Post", "sem_post");
}

void Semaphore::Wait(void)
{
  int rc;

  // Retry when the wait was merely interrupted by a signal
  do
  {
    rc = sem_wait((sem_t *) vsptr);
  } while (rc && (errno == EINTR));
  if (rc)
    Die("Semaphore::Wait", "sem_wait");
}
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Re: Revised Semaphore() code
Code: Select all
ui UniqueOrdinal(void)
{
  static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
  static ui ordinal = 0;
  ui result;

  // The mutex makes the increment safe when called from several threads
  pthread_mutex_lock(&mutex);
  result = ordinal++;
  pthread_mutex_unlock(&mutex);
  return result;
}

std::string UniqueString(void)
{
  const ui pid = FetchPID();
  const ui ordinal = UniqueOrdinal();

  // The PID separates processes; the ordinal separates calls within one process
  return EncodeHexadecimalUi32(pid) + "." + EncodeHexadecimalUi32(ordinal);
}
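The helpers FetchPID() and EncodeHexadecimalUi32() are not shown in this thread. A guess at what they might look like follows; the bodies, the typedef of ui as unsigned int, and the fixed eight-digit lowercase format are all assumptions made for illustration, not the Symbolic source.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <unistd.h>

typedef unsigned int ui;  // assumption: ui is a plain unsigned integer type

// Assumed implementation: the POSIX process ID of the caller.
ui FetchPID(void)
{
    return static_cast<ui>(getpid());
}

// Assumed format: the low 32 bits as exactly eight lowercase hex digits.
std::string EncodeHexadecimalUi32(const ui value)
{
    char buffer[9];  // eight digits plus the terminating NUL

    std::snprintf(buffer, sizeof(buffer), "%08x", value & 0xffffffffu);
    return std::string(buffer);
}

// Self-check of the assumed format.
void EncodeSelfCheck(void)
{
    assert(EncodeHexadecimalUi32(255) == "000000ff");
    assert(EncodeHexadecimalUi32(0).size() == 8);
}
```

With a fixed-width encoding like this one, "Semaphore." plus the unique string comes to 27 characters, which matters on systems that impose a short limit on semaphore names.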