compiler dependent scores!

Discussion of chess software programming and technical issues.

Moderator: Ras

Karlo Bala
Posts: 373
Joined: Wed Mar 22, 2006 10:17 am
Location: Novi Sad, Serbia
Full name: Karlo Balla

Re: getting VERY weird

Post by Karlo Bala »

jswaff wrote:
Alessandro Scotti wrote:Have you tried to declare t as volatile?
I just did, and this does work:

Code: Select all

                                for (volatile int t=1;t<8;t++) {
//                                        if (c==E4) printf("considering t=%d\n",t);
                                        if (c-(t*8)<0) break;
                                        if (c==E4) printf("adding bm_mask %d\n",c-(t*8));
                                        to_boundary[c][k]|=bm_mask[c-(t*8)];
                                }
Why? t is local in scope. Looks like the compiler is optimizing most of my loop away when it shouldn't be.

--
James
Can you post ASM fragment with int t, and volatile int t?
It looks to me that bug is somewhere else and this is just side effect. But who knows, intel know how to over-optimize and speed up the code :twisted:
Best Regards,
Karlo Balla Jr.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: compiler dependent scores!

Post by bob »

jwes wrote:Since the problem appears to happen at ply 1, you could dump the ply 1 trees with both versions and see where they differ and then go in with your debugger, or even just start stepping with your debugger until you get a wrong number.
Chances are good that "dumping the tree" will change the behavior as now you are introducing a ton of library calls that are going to stomp all over the stack and change the uninitialized values from what they are when no printf's are done...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: getting VERY weird

Post by bob »

Karlo Bala wrote:
jswaff wrote:
Alessandro Scotti wrote:Have you tried to declare t as volatile?
I just did, and this does work:

Code: Select all

                                for (volatile int t=1;t<8;t++) {
//                                        if (c==E4) printf("considering t=%d\n",t);
                                        if (c-(t*8)<0) break;
                                        if (c==E4) printf("adding bm_mask %d\n",c-(t*8));
                                        to_boundary[c][k]|=bm_mask[c-(t*8)];
                                }
Why? t is local in scope. Looks like the compiler is optimizing most of my loop away when it shouldn't be.

--
James
Can you post ASM fragment with int t, and volatile int t?
It looks to me that bug is somewhere else and this is just side effect. But who knows, intel know how to over-optimize and speed up the code :twisted:
I agree

Declaring t as volatile just prevents the compiler from keeping the value of t in a register and re-using it when needed, it now has to load it from memory for each and every reference...
Gerd Isenberg
Posts: 2251
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: getting VERY weird

Post by Gerd Isenberg »

jswaff wrote:Very strange goings on..

I traced the problem back to the bitboard array to_boundary[Square][Direction]. The North, NorthWest, NorthEast directions only have one bit set. They are initialized as follows.

Code: Select all

        // precompute bitboards from each square to boundary
        for (int c=0;c<64;c++) {
                for (int k=0;k<8;k++) {
                        to_boundary[c][k]=0;
                        switch (k) {
                        case N: /* up */
                                if (c==E4) {
                                   printf("building to_boundary[E4][N]...\n");
                                }
                                for (int t=1;t<8;t++) {
           ------->                  if (c==E4) printf("considering t=%d\n",t);
                                        if (c-(t*8)<0) break;
                                        if (c==E4) printf("adding bm_mask %d\n",c-(t*8));
                                        to_boundary[c][k]|=bm_mask[c-(t*8)];
                                }
                                break;
                        case NW: /* upleft */

                       <snipped>
Your code may simply confuse the optimizing compiler ;-)

Such control-structures

Code: Select all

for (dir = 0; dir < 8; dir++)
{
   switch (dir)
   {
   case 0:foo0(...); break;
   case 1:foo1(...); break;
   case 2:foo2(...); break;
   case 3:foo3(...); break;
   case 4:foo4(...); break;
   case 5:foo5(...); break;
   case 6:foo6(...); break;
   case 7:foo7(...); break;
   }
}
should simpler be written as follows:

Code: Select all

foo0(...);
foo1(...);
foo2(...);
foo3(...);
foo4(...);
foo5(...);
foo6(...);
foo7(...);
I suggest something like this:

Code: Select all

U64 nortBound = C64(0x0101010101010100); // a1 north
U64 soutBound = C64(0x0080808080808080); // h8 south
for (int sq=0; sq < 64; sq++)
  to_boundary[sq]   [Nort] = nortBound;
  to_boundary[sq^63][Sout] = soutBound;
  nortBound <<= 1;
  soutBound >>= 1;
}
and similar code but tad trickier for ranks and diagonals.
http://chessprogramming.wikispaces.com/ ... mpty+Board

Gerd
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: getting VERY weird

Post by wgarvin »

bob wrote: I agree

Declaring t as volatile just prevents the compiler from keeping the value of t in a register and re-using it when needed, it now has to load it from memory for each and every reference...
Broken optimizers are not unheard of, especially at higher optimization levels. If adding volatile to t makes the problem go away, then one of two things is probably happening: either the optimizer is screwing up the generated code, or else your code is actually overwriting stack memory in some way, and forcing t to be volatile has changed the stack layout in a way that hides the problem.

I've had bugs similar to this before, where a loop was mysteriously being terminated after 16 iterations instead of running for the proper ~200 iterations. It turned out that I had a char buf[4] which was never supposed to contain more than 3 letters plus a NUL, but the buffer was being filled with 5 or more letters. The resulting memory trashing *overwrote my loop counter on the stack*. It sounds suspiciously like your problem, actually.

Another common occurrence (though it doesn't appear to be the culprit in this case) is programmers breaking language rules by e.g. explicit casting of pointer types. It is possible to do things that have technically undefined behaviour, but work on 90%+ of compilers out there, and then fail in subtle ways on the remaining compilers because of aggressive optimizations that make assumptions that aren't valid because of the pointer casts.
jswaff

A complete and broken program

Post by jswaff »

A complete program demonstrating this problem is below. This problem occurs with Intel's compiler at -O2 (strangely this is working at -O1 where -O1 it didn't in Prophet). This is on a 32 bit machine (the original problem in Prophet was on a 64 bit machine).

With the printf commented out:

Code: Select all

00000000
00000000
00000000
00001000
00000000
00000000
00000000
00000000
With it in:

Code: Select all

considering: 1
considering: 2
considering: 3
considering: 4
considering: 5
00001000
00001000
00001000
00001000
00000000
00000000
00000000
00000000

Code: Select all

#include <stdio.h>

enum SQUARES { A8,B8,C8,D8,E8,F8,G8,H8,
               A7,B7,C7,D7,E7,F7,G7,H7,
               A6,B6,C6,D6,E6,F6,G6,H6,
               A5,B5,C5,D5,E5,F5,G5,H5,
               A4,B4,C4,D4,E4,F4,G4,H4,
               A3,B3,C3,D3,E3,F3,G3,H3,
               A2,B2,C2,D2,E2,F2,G2,H2,
               A1,B1,C1,D1,E1,F1,G1,H1,
               NO_SQUARE };

typedef unsigned long long Bitmap;
Bitmap bm_mask[64];
Bitmap to_boundary[64];


void DrawBitmap(Bitmap bmap) {
        for (int c=0;c<64;c++) {
                if (bmap&bm_mask[c]) printf("1"); else printf("0");
                if ((c%8)==7) printf("\n");
        }
        printf("\n");
}

void Initialize() {

        Bitmap b=1;
        for (int c=0;c<64;c++)
                bm_mask[c]=(b<<c);


        for (int c=0;c<64;c++) {
                to_boundary[c]=0;
                for (int t=1;t<8;t++) {
//                      if (c==E4) printf("considering: %d\n",t);
                        if (c-(t*8)<0) break;
                        to_boundary[c]|=bm_mask[c-(t*8)];
                }
        }

        DrawBitmap(to_boundary[E4]);

}

int main(int argc,char *argv[]) {
        Initialize();
        return 0;
}
jswaff

Re: getting VERY weird

Post by jswaff »

Gerd Isenberg wrote: Your code may simply confuse the optimizing compiler ;-)

Such control-structures

Code: Select all

for (dir = 0; dir < 8; dir++)
{
   switch (dir)
   {
   case 0:foo0(...); break;
   case 1:foo1(...); break;
   case 2:foo2(...); break;
   case 3:foo3(...); break;
   case 4:foo4(...); break;
   case 5:foo5(...); break;
   case 6:foo6(...); break;
   case 7:foo7(...); break;
   }
}
should simpler be written as follows:

Code: Select all

foo0(...);
foo1(...);
foo2(...);
foo3(...);
foo4(...);
foo5(...);
foo6(...);
foo7(...);
Of course. :) I'll fix that silliness, but unfortunately that wasn't the problem.
I suggest something like this:

Code: Select all

U64 nortBound = C64(0x0101010101010100); // a1 north
U64 soutBound = C64(0x0080808080808080); // h8 south
for (int sq=0; sq < 64; sq++)
  to_boundary[sq]   [Nort] = nortBound;
  to_boundary[sq^63][Sout] = soutBound;
  nortBound <<= 1;
  soutBound >>= 1;
}
That is far too hard to read for my taste, especially for something that's only performed on initialization.
and similar code but tad trickier for ranks and diagonals.
http://chessprogramming.wikispaces.com/ ... mpty+Board

Gerd
Gerd Isenberg
Posts: 2251
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: getting VERY weird

Post by Gerd Isenberg »

jswaff wrote:

Code: Select all

U64 nortBound = C64(0x0101010101010100); // a1 north
U64 soutBound = C64(0x0080808080808080); // h8 south
for (int sq=0; sq < 64; sq++)
  to_boundary[sq]   [Nort] = nortBound;
  to_boundary[sq^63][Sout] = soutBound;
  nortBound <<= 1;
  soutBound >>= 1;
}
That is far too hard to read for my taste, especially for something that's only performed on initialization.
Really? I find it much simpler to understand than your inner loop ;-)
May be this becomes simpler:

Code: Select all

U64 nort = C64(0x0101010101010100);
for (int sq=0; sq < 64; sq++, nort <<= 1)
   rayAttacks[sq][Nort] = nort;

U64 sout = C64(0x0080808080808080);
for (int sq=63; sq >= 0; sq--, sout >>= 1) {
   rayAttacks[sq][Sout] = sout;
}
I agree, code size or execution speed is not that critical in initialization stuff. I prefer short code and actually use a generalized kogge-stone getter.

Code: Select all

U64 single = 1;
for (int sq=0; sq < 64; sq++, single <<= 1)
for (int d=0; d < 8; d++)
   rayAttacks[sq][d] = dirAttacks(single, -1, d);
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: A complete and broken program

Post by wgarvin »

jswaff wrote:A complete program demonstrating this problem is below. This problem occurs with Intel's compiler at -O2 (strangely this is working at -O1 where -O1 it didn't in Prophet). This is on a 32 bit machine (the original problem in Prophet was on a 64 bit machine).
That is pretty disturbing. The array accesses all look legal and there's no pointer math or anything. There's no problematic aliasing there either (but check if there's a compiler option to turn off aggressive aliasing optimizations anyway?).

The next step is probably to put a breakpoint in your test program compiled with the wrong code, and look at the disassembly in the debugger to figure out if the code it generated for your loop is just plain wrong.

[Edit: ...in the wrongly-compiled version, does it eliminate the loop variable t entirely? And then get the condition for the loop wrong, causing it to exit on the first iteration instead of when it should?]
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: getting VERY weird

Post by Michael Sherwin »

Have you verified that bm_mask[] is correct at the different levels of optimization?
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through