Compiler Problem

wgarvin · Post by **wgarvin** » Fri Jul 02, 2010 3:28 pm

bob wrote:
rbarreira wrote:
bob wrote: I'd still suspect local memory accesses as the most likely candidate, particularly for a pretty mature compiler like MSVC.
This year I already found one bug in gcc and one in icc... there are new ones reported every day.
Correct, but only one out of every 1,000 "reported compiler bugs" is a real compiler bug...

Its probably a lot higher than that, at least 1 in 20. I know I'm not going to put together a repro sample and send in a compiler bug report until I'm solidly convinced that its a bug in the compiler rather than a bug in my own code. I'm sure they get plenty of bug reports which turn out to be user error, but with the complexity of modern optimizing compilers, its not THAT surprising to find bugs in them. In a way, they're like chess engines -- a bug might only manifest itself when there is a strange and unusual interaction of different states in the compiler. I found a loop optimizer bug in the frontend of the MS compiler which miscompiled a slightly unusual loop condition into an infinite loop. But by adding if (xx) break; in the loop, I changed the control flow enough that it was then able to compile it correctly.

The repro case we had was able to reproduce it on the well-tested x86 compiler as well, and it was probably in there for years before anybody noticed, despite that compiler probably being used to compile hundreds of millions of lines of code every day.

Another eye-opening example is the volatile qualifier. Most production compilers miscompile volatile accesses in at least some situations. Basic use cases usually work, but in some compilers, even simple optimizations like loop invariant code motion can break the semantics of volatile. Anyone relying on volatile in an embedded system or any safety-critical system, would do well to wrap volatile accesses in function calls and inspect all of the generated code.

bob · Post by **bob** » Fri Jul 02, 2010 10:33 pm

wgarvin wrote:
bob wrote:
rbarreira wrote:
bob wrote: I'd still suspect local memory accesses as the most likely candidate, particularly for a pretty mature compiler like MSVC.
This year I already found one bug in gcc and one in icc... there are new ones reported every day.
Correct, but only one out of every 1,000 "reported compiler bugs" is a real compiler bug...
Its probably a lot higher than that, at least 1 in 20. I know I'm not going to put together a repro sample and send in a compiler bug report until I'm solidly convinced that its a bug in the compiler rather than a bug in my own code. I'm sure they get plenty of bug reports which turn out to be user error, but with the complexity of modern optimizing compilers, its not THAT surprising to find bugs in them. In a way, they're like chess engines -- a bug might only manifest itself when there is a strange and unusual interaction of different states in the compiler. I found a loop optimizer bug in the frontend of the MS compiler which miscompiled a slightly unusual loop condition into an infinite loop. But by adding if (xx) break; in the loop, I changed the control flow enough that it was then able to compile it correctly.

The repro case we had was able to reproduce it on the well-tested x86 compiler as well, and it was probably in there for years before anybody noticed, despite that compiler probably being used to compile hundreds of millions of lines of code every day.

Another eye-opening example is the volatile qualifier. Most production compilers miscompile volatile accesses in at least some situations. Basic use cases usually work, but in some compilers, even simple optimizations like loop invariant code motion can break the semantics of volatile. Anyone relying on volatile in an embedded system or any safety-critical system, would do well to wrap volatile accesses in function calls and inspect all of the generated code.

I've found my fair share of gcc bugs. But I used to be on the mailing list and in the usenet gcc group, and there are _many_ morons that claim compiler bug at the drop of a hat. Had a couple of friends that worked on gcc in the early 90's, and they were really incensed by the bogus claims and the time they wasted.

Sven · Post by **Sven** » Mon Jul 12, 2010 3:08 pm

LoopList wrote:In which line of the following code do you see the problem or bug? I simplified the code a bit.

Code: Select all

#include <stdio.h> 

const int IS_ENPASSANT = 1 << 15; 

struct move_info_c &#123; int move, value; &#125;; 
struct board_c &#123; int piece_on&#91;64&#93;; &#125; B; 

inline int move_to&#40;int move&#41; &#123; return move & 63; &#125; 
inline int move_from&#40;int move&#41; &#123; return &#40;move >> 6&#41; & 63; &#125; 
inline int move_is_enpassant&#40;int move&#41; &#123; return &#40;move & IS_ENPASSANT&#41; != 0; &#125; 
inline int make_enpassant&#40;int from, int to&#41; &#123; return to | &#40;from << 6&#41; | IS_ENPASSANT; &#125; 
inline int piece_on_square&#40;int square&#41; &#123; return B.piece_on&#91;square&#93;; &#125; 
inline int move_is_capture&#40;int move&#41; &#123; return &#40;piece_on_square&#40;move_to&#40;move&#41;) != 6&#41; || move_is_enpassant&#40;move&#41;; &#125; 
inline int move_piece&#40;int move&#41; &#123; return piece_on_square&#40;move_from&#40;move&#41;); &#125; 
inline int move_piece_captured&#40;int move&#41; &#123; return move_is_enpassant&#40;move&#41; ? 0 &#58; piece_on_square&#40;move_to&#40;move&#41;); &#125; 

inline int piece_value_midgame&#40;int piece&#41; &#123; 
    int piece_value_mg&#91;7&#93; = &#123;100, 300, 400, 500, 900, 1000, 0&#125;; 
    return piece_value_mg&#91;piece&#93;; 
&#125; 

void initialize&#40;) &#123; 
    for &#40;int square = 0; square < 64; square++) B.piece_on&#91;square&#93; = 6; 
    B.piece_on&#91;28&#93; = B.piece_on&#91;29&#93; = 0; 
&#125; 

class sort_c &#123; 
public&#58; 
    sort_c&#40;); 
    move_info_c * last_move, move_list&#91;256&#93;; 
&#125;; 

sort_c&#58;&#58;sort_c&#40;) &#123; 
    move_list&#91;0&#93;.move = make_enpassant&#40;28, 21&#41;; 
    move_info_c * current = move_list; 
    int move = current->move, value = 0, temp = 0; 

    if &#40;value < 0&#41; current->value = value; 
    else if &#40;move_is_capture&#40;move&#41;) &#123; 
      temp = 50000 + piece_value_midgame&#40;move_piece_captured&#40;move&#41;) - move_piece&#40;move&#41;; 
      current->value = temp; 
    &#125; 
    else current->value = 0; 

    printf_s&#40;"value...........%d\n", current->value&#41;; 
    printf_s&#40;"temp............%d\n", temp&#41;; 
&#125; 

int main&#40;) &#123; 
  initialize&#40;); 
  sort_c s; 
  return 0; 
&#125;

The following is essentially the same program as your already reduced version above:

Code: Select all

#include <stdio.h>
#include <stdlib.h>


// change the following line into "typedef int uint;" if you are sure that
// the problem is not related to signedness
typedef unsigned int uint;

// change the following line into "typedef int MYBOOL;" if you are sure that
// the problem is not related to using int instead of bool
typedef bool MYBOOL;


void assertionFailure&#40;char const * expr, char const * file, int line&#41; &#123;
    fprintf&#40;stderr, "ASSERTION FAILED&#58; "%s" (%s, line %d&#41;\n", file, line&#41;;
    exit&#40;1&#41;;
&#125;
#define ASSERT&#40;expr&#41; if (!&#40;expr&#41;) assertionFailure&#40;#expr, __FILE__, __LINE__)

const int IS_ENPASSANT = 1 << 15;

struct board_c &#123; uint piece_on&#91;64&#93;; &#125; B;

inline uint move_to&#40;int move&#41; &#123;
    return move & 63;
&#125;
inline uint move_from&#40;int move&#41; &#123;
    return &#40;move >> 6&#41; & 63;
&#125;
inline MYBOOL move_is_enpassant&#40;int move&#41; &#123;
    return &#40;move & IS_ENPASSANT&#41; != 0;
&#125;
inline int make_enpassant&#40;uint from, uint to&#41; &#123;
    return to | &#40;from << 6&#41; | IS_ENPASSANT;
&#125;
inline uint piece_on_square&#40;uint square&#41; &#123;
    ASSERT&#40;square < 64&#41;;
    return B.piece_on&#91;square&#93;;
&#125;
inline MYBOOL move_is_capture&#40;int move&#41; &#123;
    return &#40;piece_on_square&#40;move_to&#40;move&#41;) != 6&#41; || move_is_enpassant&#40;move&#41;;
&#125;
inline uint move_piece&#40;int move&#41; &#123;
    return piece_on_square&#40;move_from&#40;move&#41;);
&#125;
inline uint move_piece_captured&#40;int move&#41; &#123;
    return move_is_enpassant&#40;move&#41; ? 0 &#58; piece_on_square&#40;move_to&#40;move&#41;);
&#125;
inline int piece_value_midgame&#40;uint piece&#41; &#123;
    ASSERT&#40;piece < 7&#41;;
    int piece_value_mg&#91;7&#93; = &#123;100, 300, 400, 500, 900, 1000, 0&#125;;
    return piece_value_mg&#91;piece&#93;;
&#125;

int main&#40;) &#123;
    for &#40;uint square = 0; square < 64; square++) B.piece_on&#91;square&#93; = 6;
    B.piece_on&#91;28&#93; = B.piece_on&#91;29&#93; = 0;

    int move = make_enpassant&#40;28, 21&#41;;
    ASSERT&#40;move_is_enpassant&#40;move&#41;);

    int temp = 0;

    if &#40;move_is_capture&#40;move&#41;) &#123;
        temp = 50000
             + piece_value_midgame&#40;move_piece_captured&#40;move&#41;)
             - move_piece&#40;move&#41;;
    &#125;

    printf_s&#40;"temp............%d\n", temp&#41;;

    // pause, just to be able to see the output when running the program with "F5" in VisualStudio ...
    printf&#40;"press return ...\n");
    char line&#91;80&#93;;
    &#40;void&#41; fgets&#40;line, sizeof&#40;line&#41;, stdin&#41;;

    return 0;
&#125;

but with the following changes:

1. Removed the move_info_c structure, the move_list, and all code related to these data structures, assuming that the faulty behaviour is not related to that aspect but does already occur with a simple "int move" and no move list around it.
2. Removed other unused code: int value = 0, if (value < 0).
3. Removed class sort_c, putting the essential code directly into main().
4. Added few ASSERT's (here also used in release version).
5. Introduced usage of unsigned int and bool, in order to verify that the problem is not related to these aspects (you can roll back both easily, see code comments)

Does this program version still expose the problem with VS2010/x64 and the options you described? If it does: is there an ASSERT thrown, or just a wrong value printed? If it does not, you can try to add potentially "offending" parts of the program that have been removed so far, step by step, until the problem pops up again. Of special interest would be whether my assumption under no. 1 above holds.

At the moment I do not see any wrong implementation here, too, so a compiler bug would be the logical conclusion. But one point creates some doubt: why should that compiler bug pop up just for some chess program code using "en passant", where each chess programmer has already had a couple of ep bugs in his chess programmer lifetime?

Sven

jdart · Post by **jdart** » Wed Jul 14, 2010 6:31 pm

> While a compiler bug is always possible, it's also quite unlikely, less likely than a subtle bug.

I worked for a compiler company in the late 1990s. They sold expensive products that were extensively tested. The compiler was self-hosting and could compile itself, as well as pass test suites. However, at any given time it had over a thousand open bugs, some of them serious.

So it is in fact very common for compilers to have bugs.

The problem with testing compilers is that the set of inputs is infinite. You can test it with a lot of different inputs but that doesn't mean you've exercised every distinct path through the compiler code. Every program triggers different sequences of actions inside the compiler. Once in a while some of those actions generate incorrect code.

wgarvin · Post by **wgarvin** » Thu Jul 15, 2010 5:13 am

jdart wrote:> While a compiler bug is always possible, it's also quite unlikely, less likely than a subtle bug.

I worked for a compiler company in the late 1990s. They sold expensive products that were extensively tested. The compiler was self-hosting and could compile itself, as well as pass test suites. However, at any given time it had over a thousand open bugs, some of them serious.

So it is in fact very common for compilers to have bugs.

The problem with testing compilers is that the set of inputs is infinite. You can test it with a lot of different inputs but that doesn't mean you've exercised every distinct path through the compiler code. Every program triggers different sequences of actions inside the compiler. Once in a while some of those actions generate incorrect code.

I think it is the same with any large and complicated program. Compilers are usually better tested than other software, but they also have some parts which are very complicated compared to your typical software. Especially when its for a language as complex as C++, its almost inevitable that there will be bugs that nobody finds until they try to compile code with some obscure set of language features, and then some particular set of transformations gets applied by the optimizer. At least 99.9% of all code out there uses safe combinations of features that are well exercised by the compiler team's tests, and are generally very reliable.

But its certainly not 100%. GCC is considered a robust and stable compiler, right? But here is a list of GCC versions that are unable to compile LLVM correctly.

Edit: example from a different domain, AAA console games... the game I worked on last year had more than 50 programmers working on it, and an overall team size well over 200. During the last few months, our team fixed tens of thousands of bugs in both the data and the code. On the most prolific day, over 1600 bugs were fixed in a single day! Many of them were trivial little bugs, but still safe enough to fix. Others were very serious bugs that had to be fixed even if fixing them was a bit risky. As we got closer and closer to shipping, the risk of any particular change was increased, so the benefits of fixing it had to be progressively higher in order to justify allowing the change. At the very end, only fixes for the most serious bugs were allowed, and relatively minor (but still significant) bugs were just marked 'will not fix'. I'm sure there were at least a thousand known, unfixed bugs in the shipped game. They were all either pretty minor bugs, or unlikely to be encountered except by a very small fraction of the players. A thousand known bugs might sound like a lot, but the codebase was a couple of million lines of C++ and the game data contained literally hundreds of thousands of complex data assets. Its almost impossible to put something that big together without bugs.

wgarvin · Post by **wgarvin** » Thu Jul 15, 2010 6:00 am

On the other hand, sometimes its the language that's the problem, not the compiler.

Some things that programmers like to rely on (such as 2's complement integer overflow) are actually in undefined-behaviour territory, and optimized versions of the code might not do what you expect.

Bo Persson · Post by **Bo Persson** » Thu Jul 15, 2010 9:16 pm

wgarvin wrote:On the other hand, sometimes its the language that's the problem, not the compiler.

Some things that programmers like to rely on (such as 2's complement integer overflow) are actually in undefined-behaviour territory, and optimized versions of the code might not do what you expect.

If the language doesn't even require 2's complement, how could it define the result of it overflowing?

There is a basic idea of implementability behind most of C's and C++'s undefined behavior. The implementation can use whatever the hardware does.

Other languages, with a very well defined behavior - like Java - is actually less portable because of that.

For example, you need extra hardware to run Java efficiently on an IBM mainframe, because the floating point format used there since the 1960's didn't match the language spec. C and C++ runs fine, because those languages don't require a specific format.

Gian-Carlo Pascutto · Post by **Gian-Carlo Pascutto** » Thu Jul 15, 2010 9:27 pm

Bo Persson wrote:C and C++ runs fine, because those languages don't require a specific format.

Don't they require float and double to be IEEE compliant?

Dann Corbit · Post by **Dann Corbit** » Thu Jul 15, 2010 9:30 pm

Gian-Carlo Pascutto wrote:
Bo Persson wrote:C and C++ runs fine, because those languages don't require a specific format.
Don't they require float and double to be IEEE compliant?

No.

Dann Corbit · Post by **Dann Corbit** » Thu Jul 15, 2010 9:36 pm

Dann Corbit wrote:
Gian-Carlo Pascutto wrote:
Bo Persson wrote:C and C++ runs fine, because those languages don't require a specific format.
Don't they require float and double to be IEEE compliant?
No.

Consider, for instance, some Cray supercompuers do not use IEEE float or old systems like the CDC Cyber also do not use IEEE float. It would not be possible to write an efficient C compiler for such a machine.

Compiler Problem

Re: Compiler Problem

Re: Compiler Problem

Re: Compiler Problem

Re: Compiler Problem

Re: Compiler Problem

Re: Compiler Problem

Re: Compiler Problem

Re: Compiler Problem

Re: Compiler Problem

Re: Compiler Problem