strcpy() revisited
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: strcpy() revisited
Not in the code I saw posted. But now it is my turn to ask "they check AFTER doing something that has undefined behavior?" Something that could overrun the destination? Something that could cause who knows what? And we NEVER see their message?
mvk wrote: I have reason to believe they execute the check after the copy. See my screenshot of XCode. If that's confirmed, the cost is near zero (branch prediction etc).
bob wrote: What does it cost to walk down the string once? Twice? They have to walk it once to get the length, then again to copy. Close enough to 2x to use it as an approximation of the cost.
You have to pick one side of this thinking and stay there. They can't be sure their (broken) test will even be performed if the strings overlap.
Could be. Really doesn't matter, however. It is broken all the same. The compiler and lib ARE part of a single "package" that is used to produce an executable.
mvk wrote: So far that was only the case in your toy example. And that was explained: a gcc optimisation touched strcpy before apple got there. On non-toy examples, I get buffer errors in both directions.
bob wrote: And what they did does not exactly "work". They only catch overlap in one direction, and NOT the worst one at that...
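To make the "walk the string once or twice" question concrete, here is a minimal sketch of the two shapes being argued about. It is not taken from Apple's library or any other libc; the function names `copy_one_pass` and `copy_two_pass` are made up for illustration, and both assume the buffers do not overlap. It makes no claim about relative speed, which depends on caching and on how the real library routines are implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical one-pass copy: each source byte is read once and written
   once, stopping after the terminator has been stored. */
char *copy_one_pass(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    char *p = dst;
    while ((*p++ = *src++) != '\0') {
        /* body intentionally empty */
    }
    return dst;
}

/* Hypothetical two-pass copy: strlen() reads the source to find its
   length, then memcpy() reads it a second time while writing. */
char *copy_two_pass(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t n = strlen(src) + 1;   /* include the terminating '\0' */
    memcpy(dst, src, n);
    return dst;
}
```

The two-pass form is the only one that knows the length before any byte is written, which is what a check performed before the copy would need.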
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: strcpy() revisited
I did not, of course...
syzygy wrote: I guess you haven't heard of caches. Plus, just reading is considerably faster than reading and writing.
bob wrote: What does it cost to walk down the string once? Twice? They have to walk it once to get the length, then again to copy. Close enough to 2x to use it as an approximation of the cost.
syzygy wrote: No, he said it is insane to cover up bugs instead of exposing them.
bob wrote: So, to recap the position you must believe, "In order to compile more efficiently and be able to use tricky optimizations, it is perfectly OK to slow down strcpy() by a factor of 2."??? I wonder if those "wonderful optimizations" will offset that factor of 2? That is, I wonder if this is one of those "useful optimizations" that slows the code down rather than speeding it up?
bnemias wrote: Yes, let's cover up bugs instead of expose them.... that is insanity.
bob wrote: Seems perfectly reasonable to me. Somewhere. Some alternate universe. Where insanity reigns supreme... ... when they COULD have fixed both with a simple call to memmove().
There are two things here:
1) Apple pushing people to fix their bugs;
2) the good sides of strcpy() UB.
Ad 1)
The "ethics" of what Apple did may be debatable, but what they did does work. The extra check certainly does not slow down strcpy() by a factor of 2, that's just nonsense.
Anyway, if you compile with -D_FORTIFY_SOURCE, then you choose to have these checks.
Works everywhere on the planet tried to date, except for mavericks. This has been fixed so it works there too. But I do not choose to accept that kind of development nonsense. There I most certainly DO have a choice. There are enough other quirks in mavericks (poor processor scheduling, poor handling of hyper threading, I had already started looking into what was necessary. Turns out to be simple enough...
You obviously don't care about anyone else that would like to compile and run your code with highly optimising compilers that do not respect "your intentions" for UB or that are not x86_64.
As far as I understand, it only does not work on some constant strings that are not copied using the library implementation.
And what they did does not exactly "work". They only catch overlap in one direction, and NOT the worst one at that...
Once I finish grading, yes. Does that help anyone but me, however?
You were going to install Linux.
Here's an example. Type "w" or whatever you want on an os x box to get the load averages. Here is my macbook, where NOTHING is running:
scrappy% w
21:43 up 20 days, 4:30, 3 users, load averages: 0.77 0.82 0.82
Here is my office linux box, same conditions:
crafty% w
21:44:48 up 86 days, 8:57, 3 users, load average: 0.00, 0.01, 0.07
And no, it is not spotlight. Disabled. I have had zero luck getting this down to even 0.5, much less the 0.00/0.01 I am used to seeing on ANY unused linux box. Not acceptable. Not interested in wasting the time to track this nonsense down.
-
- Posts: 373
- Joined: Thu Aug 14, 2008 3:21 am
- Location: Albuquerque, NM
Re: strcpy() revisited
You think they're doing something wrong detecting the overlap. You also think they should invoke memmove() instead of doing strcpy(). Hint, it's funny.
bob wrote: What is absurd about it? It is a simple statement of fact. Apple HAS already computed the length of the source to check for overlap. It now has ALL it needs to pass to memmove() and make it work properly. No overhead beyond what it has already wasted to check for the overflow. So after investing that wasted time, why not get SOMETHING useful in return? Or is that just "absurd" in your book?
bnemias wrote: I was just pointing out the absurdity of that particular quoted statement. It may be wasted to you, but I've read the thread and consider it rather amusing and therefore not wasted.
bob wrote: You do realize that I said EXACTLY THAT? Since they chose to detect the overlap, why just abort rather than actually fix it? Simple enough now? I have only written that a dozen times. Seems silly to waste the CPU cycles to catch it, when it is not done in most programs, and then get NOTHING from those wasted cycles... If you'd join a thread by reading from the top, this wasted post would not be necessary.
bnemias wrote: You do realize that to implement the solution you've been advocating, covering up the bug by invoking memmove(), that it is necessary to actually detect the overlap?
bob wrote: It takes one more add and one more compare. How hard is that? I think it is doing something wrong by detecting the overlap in the first place.
But you're right, it's a wasted post if you can't laugh at yourself.
-
- Posts: 838
- Joined: Thu Jul 05, 2007 5:03 pm
- Location: British Columbia, Canada
Re: strcpy() revisited
You must be using the word differently from us. "Semantics" are the meaning of your program. Since it is allegedly a C program, that means it is written using C syntax and its semantics are specified in the C language standard. And if that program invokes undefined behavior, the C language does not assign to it any semantics at all. It's not even guaranteed to execute the other parts correctly up to the undefined behavior; the entire execution is undefined.
bob wrote: Do you know the definition of "semantics"? Apparently not.
syzygy wrote: The C you supplied does not have semantics.
bob wrote: My belief is still the same, the compiler should do "the right thing". I'm not sure what you mean by "assign semantics to my program." I supply the semantics, specifically, in the form of C source.
If you mean it is "hyatt C" and not C, then you should use a "hyatt C" compiler.
I know you aren't happy that the standard completely punts as soon as any UB is introduced into the execution, and thus it's a crappy standard etc., but that's the language we've got. And large and useful applications are still written in it, and most of them work fine despite these annoying shortcomings of the language standard.
Sure, it could probably be improved, but even with 191+ flavors of undefined behavior in it, C is still one of the most useful programming languages around.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: strcpy() revisited
How about stopping with trying to put words in my mouth. My EXACT statement, reduced to C-like syntax
bnemias wrote: You think they're doing something wrong detecting the overlap. You also think they should invoke memmove() instead of doing strcpy(). Hint, it's funny.
bob wrote: What is absurd about it? It is a simple statement of fact. Apple HAS already computed the length of the source to check for overlap. It now has ALL it needs to pass to memmove() and make it work properly. No overhead beyond what it has already wasted to check for the overflow. So after investing that wasted time, why not get SOMETHING useful in return? Or is that just "absurd" in your book?
bnemias wrote: I was just pointing out the absurdity of that particular quoted statement. It may be wasted to you, but I've read the thread and consider it rather amusing and therefore not wasted.
bob wrote: You do realize that I said EXACTLY THAT? Since they chose to detect the overlap, why just abort rather than actually fix it? Simple enough now? I have only written that a dozen times. Seems silly to waste the CPU cycles to catch it, when it is not done in most programs, and then get NOTHING from those wasted cycles... If you'd join a thread by reading from the top, this wasted post would not be necessary.
bnemias wrote: You do realize that to implement the solution you've been advocating, covering up the bug by invoking memmove(), that it is necessary to actually detect the overlap?
bob wrote: It takes one more add and one more compare. How hard is that? I think it is doing something wrong by detecting the overlap in the first place.
But you're right, it's a wasted post if you can't laugh at yourself.
IF (they are going to check for overlap) {
check and call memmove() if detected;
} else {
just let strcpy() do its normal thing.
}
I did NOT suggest that they check for overlap and call memmove(). I suggested that they leave it alone, but since they HAD checked, they ought to at least use the result and improve things.
Nothing funny whatsoever IMHO.
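For readers who want to see what "check and call memmove() if detected" could look like, here is a minimal sketch. It is not Apple's code and not anything posted in this thread; the name `copy_string_tolerating_overlap` is hypothetical. It assumes the length is measured up front with strlen() and that the overlap test compares addresses as integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: measure the source once, test whether the two byte
   ranges intersect, and pick the copy routine accordingly. */
char *copy_string_tolerating_overlap(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t n = strlen(src) + 1;      /* bytes to copy, including '\0' */
    uintptr_t d = (uintptr_t)dst;    /* compare as integers rather than  */
    uintptr_t s = (uintptr_t)src;    /* using < on unrelated pointers    */

    /* Ranges [d, d+n) and [s, s+n) intersect in either direction. */
    if (d < s + n && s < d + n)
        return memmove(dst, src, n); /* defined for overlapping buffers */

    return memcpy(dst, src, n);      /* no overlap: a plain copy suffices */
}
```

A call such as `copy_string_tolerating_overlap(buf, buf + 2)` on a buffer holding "abcdef" leaves "cdef" in `buf`. Whether a library should silently repair the call this way or abort instead is exactly the disagreement in this thread; the sketch only shows that the extra work beyond the length computation is small.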
-
- Posts: 589
- Joined: Tue Jun 04, 2013 10:15 pm
Re: strcpy() revisited
Wylie and Ronald have explained this several times by now: What the compiler does behind the door is its own business.
bob wrote: But now it is my turn to ask "they check AFTER doing something that has undefined behavior?" Something that could overrun the destination? Something that could cause who knows what? And we NEVER see their message? You have to pick one side of this thinking and stay there. They can't be sure their (broken) test will even be performed if the strings overlap.
Would be good to file a bug report if you want it repaired for your example as well. I doubt they read this forum.
bob wrote: Could be. Really doesn't matter, however. It is broken all the same. The compiler and lib ARE part of a single "package" that is used to produce an executable.
[Account deleted]
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: strcpy() revisited
What, EXACTLY, did I say. First read your statement above. "Semantics are the meaning of your program." I stated "I supply the semantics in the form of a C source."
wgarvin wrote: You must be using the word differently from us. "Semantics" are the meaning of your program.
bob wrote: Do you know the definition of "semantics"? Apparently not.
syzygy wrote: The C you supplied does not have semantics.
bob wrote: My belief is still the same, the compiler should do "the right thing". I'm not sure what you mean by "assign semantics to my program." I supply the semantics, specifically, in the form of C source.
If you mean it is "hyatt C" and not C, then you should use a "hyatt C" compiler.
I have learned ONE thing in this long discussion. The next time I find something odd that a specific compiler is doing, something that might affect a chess programmer and cause them to waste time looking for it, I will NOT be posting anything here. Let 'em waste the time just as I did. As opposed to my getting dragged into absolutely ridiculous conversations with "A professor supports doing things that cause undefined behavior" and such nonsense. Not again...
Since it is allegedly a C program, that means it is written using C syntax and its semantics are specified in the C language standard. And if that program invokes undefined behavior, the C language does not assign to it any semantics at all. It's not even guaranteed to execute the other parts correctly up to the undefined behavior; the entire execution is undefined.
And that is a bogus concept. Here's why.
Given a simple C statement, "i++;", where i is a signed int.
A compiler could do any of the following:
1. just add 1 to i and move on.
2. add 1 to i and check the overflow flag and abort if set.
3. add 1 to i and check the overflow flag and do something completely unexpected if set, such as reverting i to its original state, setting it to zero, or anything else.
But in reality, it will ALWAYS do 1. Because it doesn't know whether it will overflow or not at compile time, and since it can't see the value of i at compile time, it just adds 1 and lets the hardware do its thing. Which is EXACTLY what I believe it should do. Just do what I asked.
It will ALWAYS do 1, UNLESS it can see the value of i, and realize that adding one will overflow. Because at compile time it might be able to do a sort of reverse constant propagation to see that i = a specific value when it gets here. NOW it behaves differently. It can just omit the operation completely. It can do something completely wrong in the case of if (a+1 > a).
Inconsistent. Even if someone knows the above possibilities, they might overlook the trickiness of the compiler and allow it to discern the value of i and wreck the code.
That is the part I disagree with. Since everyone constantly adds and such with signed integers, it is always possible that some of those adds will overflow. Yet the compiler just does the right thing. While if you happen to use a constant, it can wreck the code completely.
If you like that behavior, fine. I simply do not. As an old-school compiler person, that is NOT what we would want to produce from a compiler.
C is certainly useful, but this kind of nonsense is making it less useful. Because the compiler guys are taking great liberties with the term "undefined behavior". They catch some of it, ignore some of it, and completely break some of it. Consistency would be nice.
I know you aren't happy that the standard completely punts as soon as any UB is introduced into the execution, and thus it's a crappy standard etc., but that's the language we've got. And large and useful applications are still written in it, and most of them work fine despite these annoying shortcomings of the language standard.
Sure, it could probably be improved, but even with 191+ flavors of undefined behavior in it, C is still one of the most useful programming languages around.
I don't want to get into a game of wits with the compiler...
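The `if (a+1 > a)` case mentioned in this post can be written out as a short sketch. This is an illustration only, with hypothetical function names; it assumes nothing about what any particular compiler emits. The first function depends on signed overflow, which the C standard leaves undefined, so an optimizer is permitted to treat the comparison as always true. The other two ask the same question without ever overflowing.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Relies on signed overflow: when a == INT_MAX the addition is undefined,
   so the result of this function is not something the standard pins down. */
bool increment_is_larger_ub(int a)
{
    return a + 1 > a;
}

/* Same question with defined behavior for every int: compare first. */
bool increment_is_larger_defined(int a)
{
    return a < INT_MAX;
}

/* Checked increment: reports failure instead of overflowing. */
bool checked_increment(int a, int *out)
{
    assert(out != NULL);
    if (a == INT_MAX)
        return false;   /* a + 1 would overflow, so refuse */
    *out = a + 1;
    return true;
}
```

For values below INT_MAX all three agree; only at INT_MAX does the first one leave the territory the standard describes.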
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: strcpy() revisited
I didn't post it here for Apple's developers. I posted it here because it represented a change to behavior that had not changed in 20 years, it was only on one system (Mavericks), and it took a good bit of time to track down since it was not obvious at first that it was a mavericks issue. I thought it might benefit others that saw something similar and they could save some time. I won't make that mistake again, however.
mvk wrote: Wylie and Ronald have explained this several times by now: What the compiler does behind the door is its own business.
bob wrote: But now it is my turn to ask "they check AFTER doing something that has undefined behavior?" Something that could overrun the destination? Something that could cause who knows what? And we NEVER see their message? You have to pick one side of this thinking and stay there. They can't be sure their (broken) test will even be performed if the strings overlap.
mvk wrote: Would be good to file a bug report if you want it repaired for your example as well. I doubt they read this forum.
bob wrote: Could be. Really doesn't matter, however. It is broken all the same. The compiler and lib ARE part of a single "package" that is used to produce an executable.
-
- Posts: 838
- Joined: Thu Jul 05, 2007 5:03 pm
- Location: British Columbia, Canada
Re: strcpy() revisited
Yes, my statement was correct, but yours is not. You supply the C source. The C language standard supplies the semantics, and the C implementation (compiler, libraries, pthreads etc) may add some extensions of its own. Your C code means exactly what the compiler thinks it means. It doesn't mean what YOU think it means. Whenever what you think it means and what the language standard says it means disagree, what YOU think is WRONG.
bob wrote: What, EXACTLY, did I say. First read your statement above. "Semantics are the meaning of your program." I stated "I supply the semantics in the form of a C source."
wgarvin wrote: You must be using the word differently from us. "Semantics" are the meaning of your program.
bob wrote: Do you know the definition of "semantics"? Apparently not.
syzygy wrote: The C you supplied does not have semantics.
If you mean it is "hyatt C" and not C, then you should use a "hyatt C" compiler.
I don't know how I can possibly say this any more clearly than I have. You are a professor of CS who teaches C programming to others, and it's just incomprehensible to me that you don't accept or apparently even understand this. I'm sorry to be the bearer of bad news here.
Note that it doesn't mean every C programmer has to memorize the standard, or even have read it before (although that might be worthwhile if they intend to work on anything serious written in C). Most C programmers probably haven't, and yet they are able to write correct or nearly-correct programs that work reasonably well in practice. But it does mean that if they labor under a misconception of how a language feature works, they will sometimes get burned. The compiler doesn't care what YOU think the code means, its "mind reading skills" are even worse than mine. It only cares about the semantics spelled out in the standard.
Too late! The moment you allowed UB to slip undetected into your program, you were at its mercy.
bob wrote:
wgarvin wrote: Since it is allegedly a C program, that means it is written using C syntax and its semantics are specified in the C language standard. And if that program invokes undefined behavior, the C language does not assign to it any semantics at all. It's not even guaranteed to execute the other parts correctly up to the undefined behavior; the entire execution is undefined.
And that is a bogus concept. Here's why.
[--some stuff snipped--]
Inconsistent. Even if someone knows the above possibilities, they might overlook the trickiness of the compiler and allow it to discern the value of i and wreck the code.
That is the part I disagree with. Since everyone constantly adds and such with signed integers, it is always possible that some of those adds will overflow. Yet the compiler just does the right thing. While if you happen to use a constant, it can wreck the code completely.
If you like that behavior, fine. I simply do not. As an old-school compiler person, that is NOT what we would want to produce from a compiler.
C is certainly useful, but this kind of nonsense is making it less useful. Because the compiler guys are taking great liberties with the term "undefined behavior". They catch some of it, ignore some of it, and completely break some of it. Consistency would be nice.
I know you aren't happy that the standard completely punts as soon as any UB is introduced into the execution, and thus it's a crappy standard etc., but that's the language we've got. And large and useful applications are still written in it, and most of them work fine despite these annoying shortcomings of the language standard.
Sure, it could probably be improved, but even with 191+ flavors of undefined behavior in it, C is still one of the most useful programming languages around.
I don't want to get into a game of wits with the compiler...
But yeah, if you don't want to get into a "game of wits" with the compiler, then there's only one main thing you need to do, which is to be aware of the common types of UB and avoid them like the plague. Then you will have a legal C program with the semantics guaranteed by the standard, and everything will be fine. Crafty probably is such a program today, or nearly so.