Best way to handle input thread

stegemma · Post by **stegemma** » Sat Aug 09, 2014 11:31 am

In Satana I lost a lot of time getting a separate thread to work well for handling standard input. Once it worked, I noticed that Satana kept taking 99% of the CPU time, even when idle. I added a usleep(100) just to make the program behave. That works on Windows; I still have to try it on Mac and Linux.

In the source below, objStdIn is a thread object handling stdin:

Code: Select all

bool clsStdInThread::Execute()
{
	sfout.Push("# stdin thread is running");
	while(true)
	{
		std::string s;
		std::getline(std::cin, s);
		if(s.length())
			Push(s.c_str());
	}
	return true;
}

Push just adds s to my collection class (it's similar to std::vector).

Stdout is also handled by a separate thread, through the object called "sfout" in my source. It collects output strings, protecting them with a mutex (mutOutput). In the Satana implementation bDirectOutput is true:

Code: Select all

void clsTSafeOutput::Push(const clsString &Text, bool bCr)
{
	if(mutOutput.TryLock())
	{
		if(bDirectOutput)
		{
			std::cout << Text.GetString();
			if(bCr) std::cout << std::endl;
		}else
		{
			if(!ssOutput.Count()) ssOutput.cAdd(Text);
			else *(ssOutput[ssOutput.Count()-1]) += Text;
			if(bCr) ssOutput.cAdd("");
		}
		mutOutput.UnLock();
	}
}
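
A note on the Push shown above: a failed TryLock has no else branch, so text pushed while another thread holds the mutex would be dropped (assuming TryLock returns immediately instead of waiting). A minimal sketch of a blocking variant follows. It uses `std::mutex` and `std::lock_guard` from the standard library rather than Satana's own mutex class, and `SafeOutput` and `DemoSafeOutput` are hypothetical names, not part of Satana.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical stand-in for clsTSafeOutput: serializes writes to one stream.
class SafeOutput {
public:
    explicit SafeOutput(std::ostream& out) : out_(out) {}

    // Blocks until the mutex is free, so no message is dropped.
    void Push(const std::string& text, bool newline = true) {
        std::lock_guard<std::mutex> lock(mutex_);
        out_ << text;
        if (newline) out_ << std::endl;  // endl also flushes the stream
    }

private:
    std::ostream& out_;
    std::mutex mutex_;
};

// Usage sketch: two threads write through one SafeOutput into a string buffer.
std::string DemoSafeOutput() {
    std::ostringstream buffer;
    SafeOutput out(buffer);
    std::thread a([&out] { for (int i = 0; i < 100; ++i) out.Push("aaaa"); });
    std::thread b([&out] { for (int i = 0; i < 100; ++i) out.Push("bbbb"); });
    a.join();
    b.join();
    return buffer.str();
}
```

In an engine the stream would be `std::cout`; the string buffer here only makes the result easy to inspect.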

This is the main loop of Satana (engine is an object of class clsEngine, that does the chess search):

Code: Select all

bool clsSatana::Execute()
{
	objStdIn.RunJob();
	for(bool bOk = true; bOk; )
	{
		int64_t TickNow = GetTicks();
		if(engine.IsRunning())
		{
			if(TickEnd>0 && TickNow>TickEnd)
			{
				engine.Stop();
			}else
			{
				int64_t sec = (TickNow-TickLast)/CLOCKS_PER_SEC;
				if(sec>=1)
				{
					TickLast = TickNow;
					int64_t sec2end = (TickEnd - TickNow)/CLOCKS_PER_SEC;
				}
			}
		}
		else
		{
			switch(mode)
			{
				case enModeSet:
					break;
				case enModePlayWhite:
				case enModePlayBlack:
				case enModeAutoPlay:
				{
					clsString sBestMove = engine.GetBestMove(true);
					if(sBestMove.Length())
					{
						eng_clock = eng_clock - ((TickNow - Tick0)*1000)/CLOCKS_PER_SEC;
						engine.UserMove(sBestMove);
						sfout.Push(clsString("move ") + sBestMove);
					}
				}
				break;
			}
			usleep(100); // <---- HERE'S THE "PATCH"
		}
		clsString s = objStdIn.Pop();
		if(s.Length())
		{
			switch(XBoardCommand(s.TokenLeft(' '), s.TokenRight(' ')))
			{
				case intQuit: bOk = false; break;
			}
		}

		if(bThinkOutput && engine.IsRunning())
		{
			clsString sPV = engine.GetPV();
			if(sPV!=sOldPV)
			{
				sOldPV = sPV;
				sfout.Push(engine.GetPV()
				        + " [ms " + clsString(uint64_t((TickEnd - TickNow)*1000)/CLOCKS_PER_SEC)
				        + "-" + clsString(uint64_t(eng_clock)) + "]");
			}
		}
	}
	return true;
}

My questions are:

- How do you handle separate threads for input/output/engine?
- Is there a better way to do it?

Stefano

jdart · Post by **jdart** » Sat Aug 09, 2014 3:48 pm

stegemma wrote: In Satana I lost a lot of time getting a separate thread to work well for handling standard input. Once it worked, I noticed that Satana kept taking 99% of the CPU time, even when idle.

Code looks OK to me at first glance. Your getline call should block, so it shouldn't be consuming 99% CPU. If that is happening you should find out why. IMO, though, you should consider some kind of read poll with a timeout: this will make it easier to terminate the input thread if it is blocked and you are trying to exit.

--Jon
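
The read poll with timeout suggested above could look like this on POSIX systems (Linux, Mac). This is a minimal sketch, not code from Satana: `InputReady` is a hypothetical helper built on the POSIX `poll` call, and Windows would need a different mechanism.

```cpp
#include <cassert>
#include <poll.h>    // POSIX only
#include <unistd.h>  // STDIN_FILENO, pipe, write, close

// Hypothetical helper: returns true if `fd` has data to read (or has reached
// end of input) within timeout_ms milliseconds, false on timeout or error.
bool InputReady(int fd, int timeout_ms) {
    pollfd p{};
    p.fd = fd;
    p.events = POLLIN;
    int rc = poll(&p, 1, timeout_ms);
    return rc > 0 && (p.revents & (POLLIN | POLLHUP)) != 0;
}
```

An input thread could loop on `InputReady(STDIN_FILENO, 100)` and check an exit flag between polls, calling getline only when data is waiting. One caveat: `poll` looks at the file descriptor, not at data already buffered inside `std::cin`, so mixing it with buffered stream reads needs care.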

kbhearn · Post by **kbhearn** » Sun Aug 10, 2014 12:45 am

It's his engine thread, not his input thread, that consumes 99% of the CPU. Using usleep is one way to handle that, though not ideal. Another would be to use a function from your thread library to put the thread to sleep (waiting on an object), or even just to let the thread exit and respawn it when it's time to think again, if restart times are not a concern.

As for the blocking read in the input thread, I think it's handled anyway when the OS cleans up your application. (Or, if the pipe dies because the application on the other side closes it, the read should unblock with an error condition and the thread can exit.) More to the point, the standard libraries don't provide polling, so you have to go with different setups for Windows and non-Windows. That's a lot of nuisance just to get rid of a functional but aesthetically unpleasing blocking read loop.
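
Putting a thread to sleep while it waits on an object can be done with the C++11 standard library. A minimal sketch, assuming a hypothetical `SearchWorker` class (none of these names come from Satana): the worker blocks on a `std::condition_variable` and uses no CPU until `StartSearch()` wakes it.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical engine-thread shell: sleeps until StartSearch() or Quit().
class SearchWorker {
public:
    SearchWorker() : thread_([this] { Run(); }) {}
    ~SearchWorker() {
        Quit();
        thread_.join();
    }

    void StartSearch() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_ = true;
        }
        wake_.notify_one();
    }

    void Quit() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quit_ = true;
        }
        wake_.notify_one();
    }

    // Blocks the caller until no search is pending or running.
    void WaitIdle() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return !pending_ && !busy_; });
    }

    int SearchesDone() {
        std::lock_guard<std::mutex> lock(mutex_);
        return done_;
    }

private:
    void Run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            // Sleeps here without using CPU until there is something to do.
            wake_.wait(lock, [this] { return pending_ || quit_; });
            if (quit_) return;
            pending_ = false;
            busy_ = true;
            lock.unlock();
            // ... the real search would run here, outside the lock ...
            lock.lock();
            busy_ = false;
            ++done_;
            idle_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::condition_variable idle_;
    bool pending_ = false;
    bool busy_ = false;
    bool quit_ = false;
    int done_ = 0;
    std::thread thread_;  // declared last so the members above exist before Run() starts
};
```

The thread is created once and parked between searches, which avoids the respawn cost of the exit-and-restart alternative.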

stegemma · Post by **stegemma** » Sun Aug 10, 2014 8:11 am

In fact it is the main loop that has the problem. The stdin thread is almost always blocked and will be killed in the clsSatana destructor.

I've found another, indirect problem. The usleep is done only when the engine is not running, so the whole program takes 99% of the CPU only when thinking, but... while thinking, the main loop still uses a lot of CPU that would be better used by the engine itself. Moving usleep out of the if speeds up the engine (in 6 seconds it checks 4 moves at a depth of 8 plies, instead of just 1).

Here's the modified code, with my debug code cleaned up:

Code: Select all

bool clsSatana::Execute()
{
	sfout.Push("# Satana is running");
	PushCommand("new");
	objStdIn.RunJob();
	for(bool bOk = true; bOk; )
	{
		int64_t TickNow = GetTicks();

		if(engine.IsRunning())
		{
			if(TickEnd>0 && TickNow>TickEnd)
			{
				sfout.Push("# time has expired");
				engine.Stop();
			}
		}
		else
		{
			switch(mode)
			{
				case enModeSet:
					break;
				case enModePlayWhite:
				case enModePlayBlack:
				case enModeAutoPlay:
				{
					clsString sBestMove = engine.GetBestMove(true);
					if(sBestMove.Length())
					{
						engine.UserMove(sBestMove);
						sfout.Push(clsString("move ") + sBestMove);
					}
				}
				break;
			}
		}
		usleep(100);  // <--- MOVED OUT-SIDE THE IF
		clsString s = objStdIn.Pop();
		if(s.Length())
		{
			switch(XBoardCommand(s.TokenLeft(' '), s.TokenRight(' ')))
			{
				case intQuit: bOk = false; break;
			}
		}

		if(bThinkOutput && engine.IsRunning())
		{
			clsString sPV = engine.GetPV();
			if(sPV!=sOldPV)
			{
				sOldPV = sPV;
				sfout.Push(engine.GetPV());
			}
		}
	}
	return true;
}

Thanks to all for the answers.

elcabesa · Post by **elcabesa** » Sun Aug 10, 2014 11:23 am

You can use some type of semaphore/synchronization instead of a thread sleep. Using a semaphore, you don't have to poll after every sleep to find out whether to start searching or keep sleeping.
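
One way to get that effect with the C++11 standard library is a queue whose Pop waits on a condition variable. A minimal sketch, assuming a hypothetical `CommandQueue` class in place of the collection used by the stdin thread (the names are illustrative, not Satana's):

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical replacement for the Push/Pop collection fed by the stdin thread.
class CommandQueue {
public:
    // Called by the input thread for every line it reads.
    void Push(std::string command) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push_back(std::move(command));
        }
        ready_.notify_one();
    }

    // Waits up to `timeout` for a command; returns false if none arrived.
    bool Pop(std::string& out, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!ready_.wait_for(lock, timeout, [this] { return !items_.empty(); }))
            return false;
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::string> items_;
};
```

The main loop could call `Pop` with the time left until the search deadline as the timeout, so it wakes either when a command arrives or when the clock runs out, with no fixed sleep in between.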

hgm · Post by **hgm** » Sun Aug 10, 2014 1:43 pm

The reason to have a separate input thread is to be able to see commands that come during search without having to poll for them (so you have instant response to them, and no CPU overhead for polling). There are only very few commands that need to be processed that way. (UCI: stop, ponderhit, isready; WB: time, otim, move, '.' (periodic update), '?' (move now), lift.)

If I had to design it, I would just let the input thread only process those commands, to set an AbortFlag and TimeLimits, which the search would test. (E.g. during ponder or analysis the TimeLimits would be set to infinite. A Ponder hit would set time limits based on how much time was already spent on the move, and how much is left, while a move-now command would set the time limits to 'already expired'.) And then pass any other command through an internal pipe to the search thread, which would do blocking input on it between searches, and execute the commands by itself.

You could even spare the search thread from reading the clock by setting up a timer interrupt at the time the engine should start behaving differently (i.e. not start new iterations in the root, or not try new moves in the root), and let the interrupt handler set a shared variable that defines the time-out severity, which could then be tested in any node.
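
The shared time-out variable can be sketched with `std::atomic`. This is only an illustration under stated assumptions: a helper thread stands in for the timer interrupt, the search is a dummy loop, and the names (`TimeoutLevel`, `g_timeout`, `SearchUntilTimeout`, `TimedSearch`) are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical severity levels for the shared time-out variable.
enum TimeoutLevel : int { kKeepGoing = 0, kNoNewIteration = 1, kAbortNow = 2 };

std::atomic<int> g_timeout{kKeepGoing};

// Stand-in for iterative deepening: returns the number of iterations started.
int SearchUntilTimeout() {
    int iterations = 0;
    while (g_timeout.load() == kKeepGoing) {       // root: start a new iteration?
        ++iterations;
        for (int node = 0; node < 1000; ++node) {  // inside the tree
            if (g_timeout.load() == kAbortNow) return iterations;
        }
    }
    return iterations;
}

// Runs the dummy search while a timer thread raises the severity in two steps.
int TimedSearch(std::chrono::milliseconds soft, std::chrono::milliseconds hard) {
    g_timeout.store(kKeepGoing);
    std::thread timer([=] {
        std::this_thread::sleep_for(soft);
        g_timeout.store(kNoNewIteration);
        std::this_thread::sleep_for(hard - soft);
        g_timeout.store(kAbortNow);
    });
    int iterations = SearchUntilTimeout();
    timer.join();
    return iterations;
}
```

The input thread could store `kAbortNow` into the same variable when a move-now command arrives, so the search only ever tests one value and never reads the clock itself.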

stegemma · Post by **stegemma** » Sun Aug 10, 2014 7:39 pm

When I program in C++, I like to make classes reusable, so clsStdInThread is a generic class that has nothing to do with chess. Of course I really don't know when, or if, I will use that class again... but that's reason enough to avoid putting extra data in the standard input thread class. The same goes for the standard output class and for the engine. Even Satana is a class, separate from the engine. In the future I could change the engine without touching the Satana class or the other ones.

The logic that i follow here is:

- main() creates a Satana thread object, then runs and joins that object
- the Satana object creates the standard input thread and runs it, but doesn't join it
- the Satana object enters the loop, waiting for input and/or a move from the engine

Because the engine is stopped at start-up, the first thing received will of course be input from the user/WinBoard. If the command requires the engine, Satana starts the engine and continues the loop.

Satana was created as genetic software, so it has to be able to run multiple engines at the same time, each one with a different set of "evaluators". The idea was to evolve those evaluators genetically and let the engines play various matches. The better engine has the better set of evaluators, and Satana mixes them to create new sets.

This is just to explain why I started using separate objects for the program Satana and the engine (apart from reusability, which is important but not the sole reason). Given the complexity of the whole thing, keeping the stdin/stdout objects separate from engine/Satana should keep the connections between objects simpler (I hope). The evaluator idea didn't give me any valid results, so I'm trying something else, but the structure has remained the same.

The suggested idea of the interrupt could be interesting, if only we could find a portable way to set an interrupt (at least across Win/Lin/Mac/iOS).

Maybe Marco Belli's idea of using a semaphore instead of the usleep could be implemented easily. Your AbortFlag is in fact a sort of semaphore (I already have a bTimeExpired flag in the engine). Still, I would have to find a way to handle that semaphore in a portable, generic way. I could derive a chess specialization from the generic standard input class, or add something like a callback function that the object calls when some input occurs... but this would change the asynchronous nature of the class.

If this were not just chess software... all of these solutions would have to be investigated thoroughly.

hgm · Post by **hgm** » Sun Aug 10, 2014 8:58 pm

Well, I don't know C++, and from what you say this seems a good thing.

As to platform independence: this is hard to achieve when you interact with the system. #ifdef WIN32 directives work wonders, however.
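
As an illustration of the #ifdef approach, here is a minimal sketch of a millisecond sleep that compiles on Windows and on POSIX systems. `SleepMs` and `SleepMsStd` are hypothetical names; the sketch tests `_WIN32`, the macro that MSVC and MinGW predefine.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

#ifdef _WIN32
#include <windows.h>
// Sleep() takes milliseconds.
inline void SleepMs(unsigned ms) { Sleep(ms); }
#else
#include <unistd.h>
// usleep() takes microseconds, so usleep(100) is only 0.1 ms.
// Keep ms below 1000: POSIX allows usleep to reject one second or more.
inline void SleepMs(unsigned ms) { usleep(ms * 1000u); }
#endif

// Since C++11 the standard library offers a portable alternative.
inline void SleepMsStd(unsigned ms) {
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
}
```

With a C++11 compiler the second function needs no platform directives at all; the #ifdef version is the fallback for older toolchains.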

stegemma · Post by **stegemma** » Mon Aug 18, 2014 12:52 pm

Moving from gcc to Visual C++ (Windows, 64 bit) I found that the standard input thread throws an exception on program termination (this didn't happen with gcc or the Borland C++ compiler on the 32-bit Windows platform). This is because the getline on std::cin doesn't return when the thread has been canceled, and you have to press Enter to stop the console program. By the time you press Enter, the class has already been destroyed, and the exception occurs during cleanup of the string (and maybe other class objects).

I've found no portable way to stop the getline, but this (very dirty) trick seems to work:

Code: Select all

clsStdInThread::~clsStdInThread()
{
   std::cin.putback('\n');
   Cancel();
   Join();
}

The putback inserts a newline back into the stream, which lets getline extract an empty string. The loop has been changed as follows:

Code: Select all

bool clsStdInThread::Execute()
{
   sfout.Push("# stdin thread is running");
   while(state==enThreadRunning)
   {
      std::string s;
      std::getline(std::cin, s);
      pthread_testcancel();
      if(s.length())
         Push(s.c_str());
   }
   return true;
}

Anybody who tries similar code should no longer see strange exceptions.
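
A related point, following the earlier remark about the pipe being closed by the other side: once std::cin reaches end of input, getline fails immediately on every call, so a loop that ignores its result keeps spinning. A minimal sketch of a reader that stops on end of input or error, written as a hypothetical free function (`ReadCommands`) rather than as Satana's thread class; it does not by itself unblock a read that is still waiting for a line.

```cpp
#include <atomic>
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads lines until end of input, a stream error, or `stop` is set.
// Hypothetical free-function version of the Execute loop.
std::vector<std::string> ReadCommands(std::istream& in, const std::atomic<bool>& stop) {
    std::vector<std::string> commands;
    std::string line;
    while (!stop.load() && std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF input
        if (!line.empty()) commands.push_back(line);                // skip empty lines
    }
    return commands;
}

// Usage sketch with an in-memory stream instead of std::cin.
std::vector<std::string> DemoReadCommands() {
    std::istringstream input("new\n\nquit\r\n");
    std::atomic<bool> stop{false};
    return ReadCommands(input, stop);
}
```

In the engine the function would be given `std::cin`, and each line would go to the command collection instead of a vector.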
