Writing to a Text File (Thread Safe)

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

mvk
Posts: 589
Joined: Tue Jun 04, 2013 10:15 pm

Re: Writing to a Text File (Thread Safe)

Post by mvk »

Steve Maughan wrote:Hi All,

Thanks for the answers. It would seem I need to "lock" the section of code. How would I do this in Windows? I have read about CRITICAL_SECTIONs and Windows mutexes - not sure if one is more suitable than the other.
Before you solve it, you might want to convince yourself that there is a problem: stress test and trigger a fault. (How else will you test that the solution works?)
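
For what it's worth, a minimal stress test along those lines might look like this (POSIX threads; the file name, line format and iteration count are just placeholders, and each log line is deliberately written in two fprintf() calls so interleaving has a chance to show up). Compile with -pthread, run it, then scan the file for mangled lines:

Code: Select all

#include <pthread.h>
#include <stdio.h>

#define N_WRITES 100000

static FILE *logfile;   /* shared by both threads, no lock on purpose */

static void *writer(void *arg)
{
    long id = (long)arg;

    for (int i = 0; i < N_WRITES; i++) {
        /* two separate calls per logical line, so output from the
           other thread can slip in between them */
        fprintf(logfile, "thread %ld ", id);
        fprintf(logfile, "line %d ....................................\n", i);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    logfile = fopen("stress.log", "w");
    if (logfile == NULL)
        return 1;
    pthread_create(&t1, NULL, writer, (void *)1L);
    pthread_create(&t2, NULL, writer, (void *)2L);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    fclose(logfile);
    return 0;   /* any mixed "thread 1 ... thread 2" lines in stress.log are the fault */
}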
Steve Maughan
Posts: 1221
Joined: Wed Mar 08, 2006 8:28 pm
Location: Florida, USA

Re: Writing to a Text File (Thread Safe)

Post by Steve Maughan »

Hi Marcel,

Wise words indeed. I found the bug, which had nothing to do with the locked sections!

Thanks,

Steve
http://www.chessprogramming.net - Maverick Chess Engine
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Writing to a Text File (Thread Safe)

Post by bob »

Steve Maughan wrote:In Maverick I'm writing a log of all the input and output to a text file.

Since Maverick uses two threads - one to read the input and one to "think" - there is a possibility it tries to write to the same file at the same time. Is this a problem? How do others get around it?

Thanks,

Steve
You simply should not do it. If you MUST, you need to define a lock and acquire it before doing any I/O to a file that can be written from more than one thread, then release the lock after the write. Better to simply not do that and only write from one thread.
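
A minimal sketch of that, assuming POSIX threads (the mutex and function names here are just illustrative; on Windows a CRITICAL_SECTION or mutex plays the same role):

Code: Select all

#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>

static FILE *logfile;   /* opened once at startup */
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

/* Every thread logs through this one function, so the lock is always
   acquired before the file is touched and released right after. */
static void log_line(const char *fmt, ...)
{
    va_list ap;

    pthread_mutex_lock(&log_lock);
    va_start(ap, fmt);
    vfprintf(logfile, fmt, ap);
    va_end(ap);
    fputc('\n', logfile);
    fflush(logfile);
    pthread_mutex_unlock(&log_lock);
}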
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Writing to a Text File (Thread Safe)

Post by bob »

MrEdCollins wrote:Can you toggle a switch that both threads read?

When either thread wishes to write to the file, it first checks this okay_to_write switch. If this switch is set to 1 (yes) it temporarily sets the switch to 0 (no) and goes ahead and writes to the file. It then sets the switch back to 1 after the write is complete.

If either thread checks the switch and discovers that it is set to 0, this thread now knows the other thread must have just set the switch to 0. So now this thread just waits, and keeps checking the switch until it finds that it's back to 1 again.

Once it does find it back to 1, it works as described above... now THIS thread temporarily sets the switch to 0, writes to the file, and then sets it back to 1 when complete.
Will not work. Known race condition. Suppose both test the switch at exactly the same time? Both get a 1, both set it to zero, both write to the file, both reset it back to one, log file is corrupted. A normal pthread mutex will fix this.
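
The window is between reading the switch and setting it: those are two separate operations on a plain variable, so both threads can pass the test before either one claims the switch. A hedged sketch of the difference, using a C11 atomic_flag as the smallest possible "real" lock (a pthread mutex is the more conventional choice; all names are illustrative):

Code: Select all

#include <stdatomic.h>
#include <stdio.h>

static int okay_to_write = 1;                   /* the proposed plain switch  */
static atomic_flag log_busy = ATOMIC_FLAG_INIT; /* C11 atomic flag            */

/* The race: (1) and (2) are two separate steps, so both threads can
   see 1 at (1) before either of them reaches (2). */
void broken_write(FILE *f, const char *msg)
{
    while (!okay_to_write)      /* (1) check the switch */
        ;                       /*     busy wait        */
    okay_to_write = 0;          /* (2) claim the switch */
    fputs(msg, f);
    okay_to_write = 1;
}

/* The fix: test-and-set is a single indivisible operation, so exactly
   one thread can win it; the other spins until the flag is cleared. */
void locked_write(FILE *f, const char *msg)
{
    while (atomic_flag_test_and_set(&log_busy))
        ;                       /* spin until we acquire the flag */
    fputs(msg, f);
    atomic_flag_clear(&log_busy);
}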
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Writing to a Text File (Thread Safe)

Post by bob »

Don wrote:
Steve Maughan wrote:In Maverick I'm writing a log of all the input and output to a text file.

Since Maverick uses two threads - one to read the input and one to "think" - there is a possibility it tries to write to the same file at the same time. Is this a problem? How do others get around it?

Thanks,

Steve
I think in linux a small write is atomic. There is an internal OS defined write buffer size (I think it's called PIPE_BUFFER_SIZE or something like that) which determines how much is written on one go. So if you had 100 different processes writing to the end of a text file I don't think they would get mixed together - but I'm not sure about the order they get written if that is an issue for you. I don't know how other OS's do it.

I believe the size of that is defined to be 512 and may be a POSIX standard.

Don
You should try this. Normal file I/O doesn't go through a pipe anywhere, and it will corrupt the file all over the place. I unintentionally added this bug to Crafty a while back when I was doing the "always accept a fail high" code. Multiple threads could fail high on different moves and write to the log before backing out and restarting the search with a wider window.

Instant corruption. The best solution is to have only one thread write to the file. Otherwise, acquire a lock first.
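
A hedged sketch of the single-writer design (all names, the queue size, and the message length are illustrative): worker threads only append messages to an in-memory queue, and one dedicated thread is the only code that ever touches the file.

Code: Select all

#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define QSIZE   64
#define MSGLEN  256

static char            queue[QSIZE][MSGLEN];
static int             q_head, q_tail, q_count;
static int             q_done;                  /* set at shutdown: lock, q_done = 1, broadcast, unlock, join */
static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  q_cond = PTHREAD_COND_INITIALIZER;

/* Called from any thread: copy the message into the queue. */
void log_post(const char *msg)
{
    pthread_mutex_lock(&q_lock);
    while (q_count == QSIZE)                    /* queue full: wait for the writer */
        pthread_cond_wait(&q_cond, &q_lock);
    strncpy(queue[q_tail], msg, MSGLEN - 1);
    queue[q_tail][MSGLEN - 1] = '\0';
    q_tail = (q_tail + 1) % QSIZE;
    q_count++;
    pthread_cond_broadcast(&q_cond);
    pthread_mutex_unlock(&q_lock);
}

/* The single writer thread: the only code that writes to the file. */
void *log_thread(void *arg)
{
    FILE *f = (FILE *)arg;

    pthread_mutex_lock(&q_lock);
    while (!q_done || q_count > 0) {
        while (q_count == 0 && !q_done)
            pthread_cond_wait(&q_cond, &q_lock);
        while (q_count > 0) {
            fprintf(f, "%s\n", queue[q_head]);
            q_head = (q_head + 1) % QSIZE;
            q_count--;
        }
        pthread_cond_broadcast(&q_cond);        /* wake producers waiting for space */
        fflush(f);
    }
    pthread_mutex_unlock(&q_lock);
    return NULL;
}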
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: Writing to a Text File (Thread Safe)

Post by syzygy »

mvk wrote:
Steve Maughan wrote:Hi All,

Thanks for the answers. It would seem I need to "lock" the section of code. How would I do this in Windows? I have read about CRITICAL_SECTIONs and Windows mutexes - not sure if one is more suitable than the other.
Before you solve it, you might want to convince yourself that there is a problem: stress test and trigger a fault. (How else will you test that the solution works?)
If by design multiple threads are allowed to write to the same file, then I don't see why a test would be needed. Maybe it is really hard to trigger this case, maybe it will even be impossible in practice to trigger the bug on the OP's system, but that does not mean known race conditions are acceptable.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Writing to a Text File (Thread Safe)

Post by Don »

syzygy wrote:
Don wrote:I think in linux a small write is atomic. There is an internal OS defined write buffer size (I think it's called PIPE_BUFFER_SIZE or something like that) which determines how much is written on one go. So if you had 100 different processes writing to the end of a text file I don't think they would get mixed together - but I'm not sure about the order they get written if that is an issue for you. I don't know how other OS's do it.

I believe the size of that is defined to be 512 and may be a POSIX standard.
It is not sufficient (and in fact not so important from the programmer's point of view) that the OS implements atomic writes. The programmer usually does not invoke the OS directly. The question is what the C-library (or e.g. the C# runtime in case of C#) guarantees.

If you use streams (FILE *) in C, then POSIX guarantees that stream operations are atomic (link). So if two threads each perform a single fprintf(), there is a guarantee that they will be executed sequentially, i.e. the program will not crash, and the two output strings will not be interleaved. However, the moment one thread uses two separate fprintf()s that need to stay together, you will need to lock.
I think my understanding is exactly the same as yours on this.

Therefore, if you are doing simple logging with short lines of less than 512 bytes, you do not need a critical section; just use printf the way you normally would. If you need to printf several lines that must stay grouped together for context, you need to use a critical section even if the total you intend to write is less than 512 bytes. Otherwise printf lines from other threads may get placed between the lines you want grouped together. If you can group them together into a single string with a single printf and they total less than 512, you don't need a critical section - but you really need to be quite sure of that before printing, and it's probably more trouble to check than it's worth. Just use a critical section.

To summarize, I would say that if you are doing simple logging, do not need more than one line of context, and have complete control over the length of the log lines, then you don't need a critical section, even if other threads are running. In a way you would be imposing extra overhead for no reason, since the OS is already doing this for you. Otherwise, it's simpler to use a critical section to keep yourself out of trouble.
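
A minimal sketch of that distinction (the function names are illustrative); as the follow-up below points out, what matters is whether the output goes out in one stdio call or several, not any particular byte count:

Code: Select all

#include <stdio.h>

/* Not safe without a lock: another thread's output can land
   between these two fprintf() calls. */
void log_result_two_calls(FILE *f, const char *move, int score)
{
    fprintf(f, "best move: %s\n", move);
    fprintf(f, "score:     %d\n", score);
}

/* Needs no explicit lock on a POSIX C library: one fprintf() call,
   so the whole two-line block is written atomically. */
void log_result_one_call(FILE *f, const char *move, int score)
{
    fprintf(f, "best move: %s\nscore:     %d\n", move, score);
}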

On the other hand if you want your application to be totally cross-platform, you should not use efficiency shortcuts like this. I don't know what the behavior in Windows is for this.

It seems streams in C++ before C++11 were not thread-safe, i.e. concurrent stream access could crash the program. C++11 guarantees the program will not crash, but characters may be interleaved (i.e. concurrent cout << "ab" and cout << "cd" may produce "acdb").
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: Writing to a Text File (Thread Safe)

Post by syzygy »

Don wrote:
syzygy wrote:
Don wrote:I think in linux a small write is atomic. There is an internal OS defined write buffer size (I think it's called PIPE_BUFFER_SIZE or something like that) which determines how much is written on one go. So if you had 100 different processes writing to the end of a text file I don't think they would get mixed together - but I'm not sure about the order they get written if that is an issue for you. I don't know how other OS's do it.

I believe the size of that is defined to be 512 and may be a POSIX standard.
It is not sufficient (and in fact not so important from the programmer's point of view) that the OS implements atomic writes. The programmer usually does not invoke the OS directly. The question is what the C-library (or e.g. the C# runtime in case of C#) guarantees.

If you use streams (FILE *) in C, then POSIX guarantees that stream operations are atomic (link). So if two threads each perform a single fprintf(), there is a guarantee that they will be executed sequentially, i.e. the program will not crash, and the two output strings will not be interleaved. However, the moment one thread uses two separate fprintf()s that need to stay together, you will need to lock.
I think my understanding is exactly the same as yours on this.
I don't think so.
Therefore, if you are doing simple logging with short lines of less than 512 bytes, you do not need a critical section; just use printf the way you normally would. If you need to printf several lines that must stay grouped together for context, you need to use a critical section even if the total you intend to write is less than 512 bytes.
No, the question is only whether you use one single printf() or multiple printf()s. Single printf()s are guaranteed to be executed atomically. This guarantee is provided by the (POSIX-compliant) C library. There is no 512 byte issue here. And it has really nothing to do with how the OS performs low level writes.
Otherwise printf lines from other threads may get placed between the lines you want grouped together. If you can group them together into a single string with a single printf and they total less than 512, you don't need a critical section - but you really need to be quite sure of that before printing, and it's probably more trouble to check than it's worth.
I don't know where you got the "512" from...
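
If several fprintf()s really do have to stay together, the stream's own lock can be used directly rather than inventing a separate mutex; a minimal sketch, assuming a POSIX system (the function name and the logged fields are illustrative):

Code: Select all

#include <stdio.h>

/* flockfile()/funlockfile() take the same internal lock that every
   stdio call on this FILE* takes implicitly, so no output from another
   thread can appear between these two writes. */
void log_pv(FILE *f, int depth, int score, const char *pv)
{
    flockfile(f);
    fprintf(f, "depth %d score %d\n", depth, score);
    fprintf(f, "pv %s\n", pv);
    funlockfile(f);
}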
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Writing to a Text File (Thread Safe)

Post by Don »

syzygy wrote:
Don wrote:
syzygy wrote:
Don wrote:I think in linux a small write is atomic. There is an internal OS defined write buffer size (I think it's called PIPE_BUFFER_SIZE or something like that) which determines how much is written on one go. So if you had 100 different processes writing to the end of a text file I don't think they would get mixed together - but I'm not sure about the order they get written if that is an issue for you. I don't know how other OS's do it.

I believe the size of that is defined to be 512 and may be a POSIX standard.
It is not sufficient (and in fact not so important from the programmer's point of view) that the OS implements atomic writes. The programmer usually does not invoke the OS directly. The question is what the C-library (or e.g. the C# runtime in case of C#) guarantees.

If you use streams (FILE *) in C, then POSIX guarantees that stream operations are atomic (link). So if two threads each perform a single fprintf(), there is a guarantee that they will be executed sequentially, i.e. the program will not crash, and the two output strings will not be interleaved. However, the moment one thread uses two separate fprintf()s that need to stay together, you will need to lock.
I think my understanding is exactly the same as yours on this.
I don't think so.
Therefore, if you are doing simple logging with short lines of less than 512 bytes, you do not need a critical section; just use printf the way you normally would. If you need to printf several lines that must stay grouped together for context, you need to use a critical section even if the total you intend to write is less than 512 bytes.
No, the question is only whether you use one single printf() or multiple printf()s. Single printf()s are guaranteed to be executed atomically. This guarantee is provided by the (POSIX-compliant) C library. There is no 512 byte issue here. And it has really nothing to do with how the OS performs low level writes.
Otherwise printf lines from other threads may get placed between the lines you want grouped together. If you can group them together into a single string with a single printf and they total less than 512, you don't need a critical section - but you really need to be quite sure of that before printing, and it's probably more trouble to check than it's worth.
I don't know where you got the "512" from...
It turns out the 512-byte PIPE_BUF limit does not apply to (FILE *) objects, just as you say.

So the simple answer for the original poster is that it's ok to use printf freely if you don't need to group multiple prints together for any reason.

Code: Select all

All functions that reference (FILE *) objects shall behave as if they use flockfile() and funlockfile() internally to obtain ownership of these (FILE *) objects.

You asked me where I got the 512 from. I found it in the pipe man page, but presumably the POSIX standard does not impose this limit on printf statements for conforming C libraries. This applies to ALL writes, not just printf:

Code: Select all

PIPE_BUF

POSIX.1-2001 says that write(2)s of less than PIPE_BUF bytes must be atomic: the output data is written to the pipe as a contiguous sequence. Writes of more than PIPE_BUF bytes may be nonatomic: the kernel may interleave the data with data written by other processes. POSIX.1-2001 requires PIPE_BUF to be at least 512 bytes. (On Linux, PIPE_BUF is 4096 bytes.)
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: Writing to a Text File (Thread Safe)

Post by syzygy »

Don wrote:You asked me where I got the 512 from. I found it in the pipe man page, but presumably the POSIX standard does not impose this limit on printf statements for conforming C libraries. This applies to ALL writes, not just printf:

Code: Select all

PIPE_BUF

POSIX.1-2001 says that write(2)s of less than PIPE_BUF bytes must be atomic: the output data is written to the pipe as a contiguous sequence. Writes of more than PIPE_BUF bytes may be nonatomic: the kernel may interleave the data with data written by other processes. POSIX.1-2001 requires PIPE_BUF to be at least 512 bytes. (On Linux, PIPE_BUF is 4096 bytes.)
OK, that explains the 512. But note that this only applies to write()s to pipes and not to printf() (which is neither a write(), nor outputs to a pipe). Once you use (FILE *) objects, I/O is buffered by the C library and the connection with write() is lost. Conceivably, printf() could insert data into the buffer character by character, interleaved with characters written by other threads. Or printf() could issue a separate write() for every 10 characters. It is only the atomicity guarantee for printf() mandated by POSIX that prevents this.

It might still be wise to use locks and not depend on printf() being atomic if you want to run on Windows.
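
Since the original question was about Windows, a hedged sketch of such a wrapper (all names are illustrative): a CRITICAL_SECTION on Windows and a pthread mutex elsewhere, with every thread logging through one function.

Code: Select all

#include <stdio.h>
#include <stdarg.h>

#ifdef _WIN32
#include <windows.h>
static CRITICAL_SECTION log_lock;
#define LOG_LOCK_INIT()  InitializeCriticalSection(&log_lock)
#define LOG_LOCK()       EnterCriticalSection(&log_lock)
#define LOG_UNLOCK()     LeaveCriticalSection(&log_lock)
#else
#include <pthread.h>
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;
#define LOG_LOCK_INIT()  ((void)0)
#define LOG_LOCK()       pthread_mutex_lock(&log_lock)
#define LOG_UNLOCK()     pthread_mutex_unlock(&log_lock)
#endif

static FILE *logfile;

/* Call once at startup, before any threads are created. */
void log_open(const char *path)
{
    LOG_LOCK_INIT();
    logfile = fopen(path, "a");
}

/* Call from any thread; the lock keeps whole messages together. */
void log_printf(const char *fmt, ...)
{
    va_list ap;

    LOG_LOCK();
    va_start(ap, fmt);
    vfprintf(logfile, fmt, ap);
    va_end(ap);
    fflush(logfile);
    LOG_UNLOCK();
}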