bob wrote: Actually printf() IS a write(). The write is just in the C library. That's one of the causes of the winboard protocol buffering issues most have, as they tend to use printf() and scanf() to write/read data. The only reasonable way to read is to use read() and bypass all but system buffering.
syzygy wrote: Fine with me if you want to equate printf() to write(), but for this discussion this is pretty useless.
Neither is atomic. The "atomic" component simply says that a single buffer will be written in its entirety. It doesn't say a thing about whether the data will overlap with other data and get scrambled, or whether garbage bytes will be skipped over when writing, leaving junk in the file, etc.
Writing to a Text File (Thread Safe)
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Writing to a Text File (Thread Safe)
Don wrote: I think in linux a small write is atomic. There is an internal OS defined write buffer size (I think it's called PIPE_BUFFER_SIZE or something like that) which determines how much is written in one go. So if you had 100 different processes writing to the end of a text file I don't think they would get mixed together - but I'm not sure about the order they get written if that is an issue for you. I don't know how other OS's do it. I believe the size of that is defined to be 512 and may be a POSIX standard.
syzygy wrote: It is not sufficient (and in fact not so important from the programmer's point of view) that the OS implements atomic writes. The programmer usually does not invoke the OS directly. The question is what the C-library (or e.g. the C# runtime in case of C#) guarantees. If you use streams (FILE *) in C, then POSIX guarantees that stream operations are atomic (link). So if two threads each perform a single fprintf(), there is a guarantee that they will be executed sequentially, i.e. the program will not crash, and the two output strings will not be interleaved. However, the moment one thread uses two separate fprintf()s that need to stay together, you will need to lock.
Don wrote: I think my understanding is exactly the same as yours on this. Therefore, if you are doing simple logging with short lines less than 512 bytes - you do not need a critical section, just use printf the way you normally would. If you need to printf several lines that must stay grouped together for context you need to use a critical section even if the total you intend to write is less than 512 bytes. Otherwise printf lines from other threads may get placed between the lines you want grouped together. If you can group them together into a single string with a single printf and they total less than 512 you don't need a critical section - but you really need to be quite sure of that before printing and it's probably more trouble to check than it's worth.
syzygy wrote: I don't think so. No, the question is only whether you use one single printf() or multiple printf()s. Single printf()s are guaranteed to be executed atomically. This guarantee is provided by the (POSIX-compliant) C library. There is no 512 byte issue here. And it has really nothing to do with how the OS performs low level writes. I don't know where you got the "512" from...
bob wrote: If you look at Crafty's utility.c file, you can see my Print() function. It uses printf() to send a message to the console, and fprintf() to send it to a file. I can guarantee you with 100% accuracy, fprintf() is absolutely NOT atomic. I have inadvertently done fprintf()s from more than one thread and ended up with corrupted log files. The problem is that each thread will access the descriptor, discover where the next byte to write is, and then write there. Then the byte offset gets updated. Sometimes the two writes overwrite the same area of the file, which might leave part of the first write following the second one, if the first one was longer. But then updating the file position pointer skips too much, leaving null/garbage bytes in the file.
Don wrote: Have you done this test in the last 10 years? The documentation does say that printf is atomic so something does not quite wash. I have been burned many times by documentation though ....
Don, this failed in the last 30 days. I modified the code to accept a fail high on a null-window search at the root, even though the resulting re-search failed low. Improved the Elo. I then decided to output the fail high because the out/logfile would show the original move, but not the replacement fail-high move. Corrupted my log file since I split at the root and two different threads could fail high on the null-window search.
And there's no question this still happens, as when debugging some of the 23.6 changes I was seeing corrupted log files. The solution was a simple lock before and unlock after the fprintf().
This was failing on my macbook (os/x) and on our cluster (different flavors of linux depending on the cluster) and also I saw it on ICC using the 8-core box I generally play on (another linux version)...
So yes, I have unintentionally tried this, and I have seen the resulting corrupted log file with garbage in the middle, and lines partially overwriting previous (shorter) lines...
You are TOTALLY misinterpreting the printf "atomic" comment. It guarantees that two different printf's will not interlace characters, specifically when displayed on the screen. It does NOT guarantee anything other than that. And when writing into a file, which fprintf() will do, things change.
The thing you overlook is that printf() has a buffer in the C library. The C library will occasionally flush and force that buffer to be written (say when you do an fflush(), or print a "\n" character). But to flush the buffer, the library simply does a "write". Writing to a console window and writing to a file are completely different from the O/S perspective... There is no "file position pointer" in the console writing, while there is in the file writing.
An easy test. If you have a log file, have each thread just randomly write a recognizable message to the log. But make each thread write a different message of significantly different length. Then look at the log.
Been there, done that, got the t-shirt.
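The test described in this post could be sketched roughly as follows. This is a minimal sketch and not Crafty's code: the message strings, the thread and line counts, and the helper name count_corrupt_lines() are all made up for illustration. Every thread writes its own message, each of a clearly different length, through one shared FILE *, and the file is then reread to count lines that match none of the messages.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define NUM_THREADS 4
#define LINES_PER_THREAD 5000

/* Hypothetical messages of clearly different lengths, one per thread. */
static const char *messages[NUM_THREADS] = {
    "a",
    "thread two writes a medium-length line",
    "thread three writes a considerably longer line so that overlaps would be easy to spot",
    "four"
};

static FILE *log_file;

static void *writer(void *arg)
{
    const char *msg = arg;
    for (int i = 0; i < LINES_PER_THREAD; i++)
        fprintf(log_file, "%s\n", msg);   /* one fprintf() call per line */
    return NULL;
}

/* Writes the log from several threads, then rereads it.
   Returns the number of lines that match none of the messages. */
int count_corrupt_lines(const char *path)
{
    pthread_t threads[NUM_THREADS];
    char line[256];
    int corrupt = 0, total = 0;

    log_file = fopen(path, "w");
    assert(log_file != NULL);
    for (int t = 0; t < NUM_THREADS; t++)
        pthread_create(&threads[t], NULL, writer, (void *)messages[t]);
    for (int t = 0; t < NUM_THREADS; t++)
        pthread_join(threads[t], NULL);
    fclose(log_file);

    log_file = fopen(path, "r");
    assert(log_file != NULL);
    while (fgets(line, sizeof line, log_file)) {
        int known = 0;
        line[strcspn(line, "\n")] = '\0';   /* strip the newline */
        for (int t = 0; t < NUM_THREADS; t++)
            if (strcmp(line, messages[t]) == 0)
                known = 1;
        if (!known)
            corrupt++;
        total++;
    }
    fclose(log_file);
    printf("%d lines read, %d unrecognized\n", total, corrupt);
    return corrupt;
}
```

Call count_corrupt_lines("logtest.txt") from your own main() and build with -pthread. On a C library that locks the stream inside each fprintf() call the count should be zero; a nonzero count would show the kind of corruption described in this post.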
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Writing to a Text File (Thread Safe)
bob wrote: I can guarantee you with 100% accuracy, fprintf() is absolutely NOT atomic.
Don wrote: Have you done this test in the last 10 years? The documentation does say that printf is atomic so something does not quite wash. I have been burned many times by documentation though ....
syzygy wrote: I have not done any testing myself, but from what I am reading fprintf() really is atomic on POSIX-compliant platforms at the level of the (FILE *) object. This means that if you fprintf() to the same (FILE *) object from multiple threads within a single process, characters from multiple output strings are not interleaved. However, if multiple processes perform printf()s to private (FILE *) objects all wrapping the same shared file descriptor, then there is no atomicity guarantee. Each process is locking on its own (FILE *) object, which is not going to prevent concurrent write()s to the shared file descriptor of partial lines. So maybe Bob is recalling an experience when Crafty was using processes instead of threads.
Nope, it was a problem found about a week before version 23.6 was released. And it failed on multiple different linux boxes + mac os/x mountain lion on my macbook. Trivial to fix, just acquire a lock, write to the file, release the lock.
I chose to fix it a different way by not having all threads trying to print at the same time... so I didn't ultimately end up with the lock I used when I first fixed it...
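The "acquire a lock, write to the file, release the lock" fix mentioned in this post could look something like the sketch below. It is an illustration only, not the code from Crafty: the helper name locked_log() and the mutex log_lock are hypothetical, and a POSIX threads environment is assumed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical global lock shared by every thread that logs. */
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: formats one message into the log while holding
   the lock, so no other caller of locked_log() can write in between.
   Returns what vfprintf() returns (characters written, or negative). */
int locked_log(FILE *log, const char *fmt, ...)
{
    va_list ap;
    int written;

    assert(log != NULL);
    pthread_mutex_lock(&log_lock);      /* acquire */
    va_start(ap, fmt);
    written = vfprintf(log, fmt, ap);   /* write */
    va_end(ap);
    fflush(log);                        /* push it out before unlocking */
    pthread_mutex_unlock(&log_lock);    /* release */
    return written;
}
```

A thread would then call, for example, locked_log(logfile, "depth %d\n", depth) instead of calling fprintf() directly. The lock only helps if every writer to that file goes through the same helper.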
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Writing to a Text File (Thread Safe)
syzygy wrote: So maybe Bob is recalling an experience when Crafty was using processes instead of threads.
sje wrote: At one time, a Linux thread was really just a lightweight process. This may have changed.
Still are. Sharing the same virtual address space, but different contexts so that regs and such can be saved separately.
-
- Posts: 5566
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Writing to a Text File (Thread Safe)
bob wrote: So yes, I have unintentionally tried this, and I have seen the resulting corrupted log file with garbage in the middle, and lines partially overwriting previous (shorter) lines... You are TOTALLY misinterpreting the printf "atomic" comment. It guarantees that two different printf's will not interlace characters, specifically when displayed on the screen. It does NOT guarantee anything other than that. And when writing into a file, which fprintf() will do, things change.
So.... can you explain why this code does what one would expect it to do:
Code: Select all
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

static pthread_attr_t thread_attr;
static pthread_t thread;
FILE *F;
int num;

/* Each caller writes the numbers 1..num, one fprintf() per line. */
void *worker(void *arg)
{
    int i;
    int n = num;

    (void)arg;
    for (i = 1; i <= n; i++)
        fprintf(F, "%d\n", i);
    return NULL;
}

int main(int argc, char **argv)
{
    F = fopen("bla.txt", "w");
    if (F == NULL) {
        perror("bla.txt");
        return 1;
    }
    if (argc == 2)
        num = atoi(argv[1]);
    else
        num = 10000;

    /* Run worker() in a second thread and in the main thread at once. */
    pthread_attr_init(&thread_attr);
    pthread_attr_setdetachstate(&thread_attr, PTHREAD_CREATE_JOINABLE);
    pthread_create(&thread, &thread_attr, worker, NULL);
    worker(NULL);
    pthread_join(thread, NULL);
    pthread_attr_destroy(&thread_attr);
    fclose(F);
    return 0;
}
The resulting bla.txt has interleaved lines, but none of the lines is corrupted.
Code: Select all
$ gcc -pthread -O3 atomictest.c -o atomictest
$ ./atomictest 10000
$ sort -n bla.txt | md5sum
3d7fe033a73a69382345fbe46907194c -
bob wrote: The thing you overlook is that printf() has a buffer in the C library. The C library will occasionally flush and force that buffer to be written (say when you do a flush(), or print a "\n" character. But to flush the buffer, the library simply does a "write". Writing to a console window and writing to a file are completely different from the O/S perspective... There is no "file position pointer" in the console writing, while there is in the file writing.
The thing you overlook is that POSIX guarantees atomicity of fprintf(). This has nothing to do with write(). The C library is simply locking on FILE *F.
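The per-stream lock mentioned here is also exposed to programs through the POSIX functions flockfile() and funlockfile(), which is one way to keep several fprintf() calls together. The sketch below assumes a POSIX C library; the helper name log_fail_high() and the two-line record format are invented for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes a two-line record that must stay together.
   flockfile() takes the same lock that each fprintf() takes internally,
   so other threads using this FILE * wait until funlockfile().
   Returns the total number of characters written, or -1 on error. */
int log_fail_high(FILE *log, const char *move, int score)
{
    int n1, n2;

    assert(log != NULL && move != NULL);
    flockfile(log);
    n1 = fprintf(log, "fail high: %s\n", move);
    n2 = fprintf(log, "score: %d\n", score);
    funlockfile(log);
    return (n1 < 0 || n2 < 0) ? -1 : n1 + n2;
}
```

This only coordinates threads that share the same FILE * object. It does nothing for separate processes or for code that calls write() on the underlying descriptor.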
-
- Posts: 5566
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Writing to a Text File (Thread Safe)
syzygy wrote: So maybe Bob is recalling an experience when Crafty was using processes instead of threads.
sje wrote: At one time, a Linux thread was really just a lightweight process. This may have changed.
The important thing is that the (FILE *) object is shared. This will normally be the case with threads (i.e. within the same address space) and not with processes (i.e. within separate address spaces).
I suppose it is possible to have two different (FILE *) objects wrapping the same file descriptor even in a single thread, in which case the strings output by fprintf()s may end up in the file in reverse order. Or one could use both fprintf() and write() to write data to the same file descriptor, in which case the fprintf()s may be delayed with respect to the write()s (and may be interleaved by the write()s).
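The reverse-order case with two (FILE *) objects on one descriptor can be reproduced in a single thread. The sketch below is an illustration under some assumptions: the function name two_streams_demo() and the file name passed to it are hypothetical, and it relies on streams for regular files being fully buffered by default, which is usual but implementation-dependent.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Writes through two streams that share one file offset, then copies
   the resulting file contents into out (NUL-terminated). */
void two_streams_demo(const char *path, char *out, size_t out_size)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    assert(fd >= 0);
    FILE *a = fdopen(fd, "w");
    FILE *b = fdopen(dup(fd), "w");   /* dup() shares the file offset */
    assert(a != NULL && b != NULL);

    fprintf(a, "first\n");    /* stays in a's buffer */
    fprintf(b, "second\n");   /* stays in b's buffer */
    fclose(b);                /* flushes "second\n" to the file first */
    fclose(a);                /* "first\n" reaches the file afterwards */

    FILE *in = fopen(path, "r");
    assert(in != NULL);
    size_t n = fread(out, 1, out_size - 1, in);
    out[n] = '\0';
    fclose(in);
}
```

With full buffering, the file ends up holding "second" before "first" even though the fprintf() calls were issued in the opposite order, because each stream only reaches the descriptor when its own buffer is flushed.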
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Writing to a Text File (Thread Safe)
syzygy wrote: So.... can you explain why this code does what one would expect it to do: [...] The resulting bla.txt has interleaved lines, but none of the lines is corrupted. The thing you overlook is that POSIX guarantees atomicity of fprintf(). This has nothing to do with write(). The C library is simply locking on FILE *F.
What posix guarantees is that if printf() is used to write a block of characters, the characters will not be interlaced with other characters from other printf's...
That's not the only issue. There are several ways one can get into difficulty. Ever used printf without a \n so that you print several things on one line? Broken to be sure. But my original post STILL stands. Current Crafty with an extra printf here and there corrupts the log file in random places. I'll be happy to insert a printf() at the fail-high point I was talking about and post the logfile if you want to verify. A toy program with nothing but prints is REALLY a poor test of anything related to parallel programming.
If you believe that proves one can do printf's and fprintf's everywhere with no regard to locks, feel free to continue. It WILL bite you.
-
- Posts: 5566
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Writing to a Text File (Thread Safe)
bob wrote: What posix guarantees is that if printf() is used to write a block of characters, the characters will not be interlaced with other characters from other printf's...
Exactly my point all along. Now explain how I was "TOTALLY misinterpreting the printf "atomic" comment"...
bob wrote: It guarantees that two different printf's will not interlace characters, specifically when displayed on the screen. It does NOT guarantee anything other than that. And when writing into a file, which fprintf() will do, things change.
So it seems this is NOT specifically when displayed on the screen, and things DO NOT change when writing into a file, right?
bob wrote: That's not the only issue.
As I wrote all along, scroll up. Or click here. Or just read:
syzygy wrote: If you use streams (FILE *) in C, then POSIX guarantees that stream operations are atomic (link). So if two threads each perform a single fprintf(), there is a guarantee that they will be executed sequentially, i.e. the program will not crash, and the two output strings will not be interleaved. However, the moment one thread uses two separate fprintf()s that need to stay together, you will need to lock.
On POSIX with multiple threads fprintf()ing to the same (FILE *), there is simply no corruption, very much unlike what you were claiming before you started your retreat with your last post.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Writing to a Text File (Thread Safe)
syzygy wrote: On POSIX with multiple threads fprintf()ing to the same (FILE *), there is simply no corruption, very much unlike what you were claiming before you started your retreat with your last post.
I will say this again. If I let crafty write to a log file from different threads at the same time, it becomes corrupted. If I protect that output (as I do now) by first acquiring a lock, the log files remain perfectly normal.
The problem seems to not be the actual characters being written, it seems that fprintf() updates the file position pointer outside the lock. If thread A writes 120 bytes, then thread b writes 80 bytes, every now and then the file pointer comes out wrong. I didn't try to debug the C library as I simply didn't care, I had to live with it and made my code work with it.
If you don't believe it will fail, carry on. You MIGHT get lucky and not see it. I KNOW it will fail because I found the corrupt logs. And I initially assumed I was writing garbage somewhere. I wasn't...
-
- Posts: 1564
- Joined: Thu Jul 16, 2009 10:47 am
- Location: Almere, The Netherlands
Re: Writing to a Text File (Thread Safe)
mar wrote: If you use threads, use CRITICAL_SECTION (equivalent to pthread_mutex_t). If you use processes, use CreateMutex.
If you use threads and you only need your program to run on Windows 7 and above, you can also use slim reader/writer locks (SRWLOCK); these are a little more efficient than critical sections.