64-bit and 32-bit exes producing different results


JVMerlino
Posts: 1357
Joined: Wed Mar 08, 2006 10:15 pm
Location: San Francisco, California

64-bit and 32-bit exes producing different results

Post by JVMerlino »

Many thanks in advance to anybody who can provide some guidance on this.

In a nutshell, the 32-bit and 64-bit versions of Myrddin have always given very different results in analysis mode -- not just move count but PV and sometimes even best move. It's only now that I've decided to devote some time to the problem.

I thought it might be the compile setup, since the two versions are compiled on two different machines (but with the same compiler, Visual Studio 2010, and the same compile/link settings). But Jim Ablett's compiles show the same issue, so I'm starting to suspect the code itself, as I seem to recall that Jim does not exclusively use VS for his builds.

Jim pointed me to a thread from a couple of years ago about Stockfish exhibiting the same problem, but that was due to an MS library sort function, which Myrddin does not use. I've searched through all of Myrddin's code many times and cannot find any 64-bit specific code, and I'm just not familiar enough with the MS libraries to guess at which functions might be causing this problem.

Again, any help will be very much appreciated (you'll be mentioned in the release notes!) :D

jm
rbarreira
Posts: 900
Joined: Tue Apr 27, 2010 3:48 pm

Re: 64-bit and 32-bit exes producing different results

Post by rbarreira »

Do you use any external libraries which might change the behavior of your program? For example random number generators.

Failing that, it means it's something internal to your code. In that case, you probably have a bug somewhere (accessing uninitialized memory or an invalid memory location for example).
JVMerlino
Posts: 1357
Joined: Wed Mar 08, 2006 10:15 pm
Location: San Francisco, California

Re: 64-bit and 32-bit exes producing different results

Post by JVMerlino »

rbarreira wrote:Do you use any external libraries which might change the behavior of your program? For example random number generators.

Failing that, it means it's something internal to your code. In that case, you probably have a bug somewhere (accessing uninitialized memory or an invalid memory location for example).
I do not. All of my zobrist hashing values are in a fixed table.

As I type this, Andrew Fan (Firefly) is looking at it. His debug builds show identical behavior, but the release builds do not. Which is even stranger because there is definitely no debug-specific code -- not even asserts. :?

The mystery deepens....

jm
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: 64-bit and 32-bit exes producing different results

Post by Desperado »

@Ricardo

I thought of this too, but the MS compiler will warn you if you run in debug mode and there is an access to uninitialized memory (a variable).
But I'm not sure of the circumstances, so it might be a good place to start.

@John

Another idea is to check implicit casts, or more generally the behaviour of types.
Structure alignment can also differ, I think, which can cause problems
when using the sizeof operator, and that can lead to all kinds of problems.
And many other things of course....

So before thinking longer about the problem, I want to ask how to understand this statement:
...different results in analysis mode -- not just move count but ...
How do you compare move count in analysis mode?
Or the other way around: do you get different node counts, or different results in any form, when you do fixed-depth searches?

Michael
rbarreira
Posts: 900
Joined: Tue Apr 27, 2010 3:48 pm

Re: 64-bit and 32-bit exes producing different results

Post by rbarreira »

If everything else fails, you can always try disabling parts of the program (for example extensions, quiescence search, etc.) and after disabling a lot of stuff, seeing if the node counts match then. If they do, you can start enabling parts one by one until the two versions start behaving differently.

The last part you enabled must contain/trigger the problem, which will give further clues or even make the problem obvious.
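The enable-one-at-a-time procedure described above can be sketched in C roughly as follows. This is only an illustration: `first_divergent_feature`, `match_fn` and `demo_counts_match` are hypothetical names, and the callback stands in for whatever manual step actually compares the node counts of the two builds with a given set of features switched on.

```c
#include <stddef.h>

/* Hypothetical callback: returns nonzero when both builds report the same
   node counts with the features in enabled_mask switched on. */
typedef int (*match_fn)(unsigned enabled_mask, void *ctx);

/* Enable features one at a time, starting from "everything off".
   Returns the index of the first feature whose enabling makes the two
   builds diverge, -1 if they never diverge, and -2 if they already
   differ with every feature disabled. */
int first_divergent_feature(unsigned feature_count, match_fn counts_match, void *ctx)
{
    unsigned mask = 0;
    unsigned i;

    if (!counts_match(mask, ctx))
        return -2;

    for (i = 0; i < feature_count && i < 32; i++) {
        mask |= 1u << i;              /* switch on one more feature */
        if (!counts_match(mask, ctx))
            return (int)i;            /* this one triggers the difference */
    }
    return -1;
}

/* Stand-in predicate for demonstration only: ctx points to a bit mask of
   the "bad" feature(s); results match as long as none of them is enabled. */
int demo_counts_match(unsigned enabled_mask, void *ctx)
{
    unsigned bad_bits = *(const unsigned *)ctx;
    return (enabled_mask & bad_bits) == 0;
}
```

In practice the comparison would be done by running both executables to a fixed depth and comparing their output, so the callback here is just a placeholder for that step.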
Last edited by rbarreira on Tue Jul 26, 2011 11:10 pm, edited 1 time in total.
JVMerlino
Posts: 1357
Joined: Wed Mar 08, 2006 10:15 pm
Location: San Francisco, California

Re: 64-bit and 32-bit exes producing different results

Post by JVMerlino »

Desperado wrote:@Ricardo

I thought of this too, but the MS compiler will warn you if you run in debug mode and there is an access to uninitialized memory (a variable).
But I'm not sure of the circumstances, so it might be a good place to start.

@John

Another idea is to check implicit casts, or more generally the behaviour of types.
Structure alignment can also differ, I think, which can cause problems
when using the sizeof operator, and that can lead to all kinds of problems.
And many other things of course....

So before thinking longer about the problem, I want to ask how to understand this statement:
...different results in analysis mode -- not just move count but ...
How do you compare move count in analysis mode?
Or the other way around: do you get different node counts, or different results in any form, when you do fixed-depth searches?

Michael
By node count difference, I mean that the node count reported for a specific PV output differs, even if all else is the same. Andrew is still poking at it, and he is now saying that:

1) Release and Debug builds behave differently from each other, in both 32-bit and 64-bit.
2) 32-bit and 64-bit Release builds also differ.
3) 32-bit and 64-bit Debug builds are identical.

Investigation continues....

jm
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: 64-bit and 32-bit exes producing different results

Post by Desperado »

Because it is not so easy for me to express myself well in English,
I will just give some short examples of the error types I'm thinking of.

Code:


guess 1: uninitialized memory
====================================

void example(void)
{
 int value;

 value++; ...
}


guess 2: unintended use of memory (stale variable)
====================================

void example(void)
{
 //bishop: clear each bit after use so the loop terminates
 while(tmp) {src=bsf64(tmp); value+=pst[bishop][src]; tmp&=tmp-1;}

 //king
 value += pst[king][src]; // bug: should be pst[king][posKing] instead of src...

}

guess 3: structure padding
=====================================

struct test_t
{ 
 ui08_t a;
 ui32_t b;
 ui16_t c;
 ui08_t d;
};

The sizeof operator will report 12 bytes (not 8) because of alignment padding!

I know this is all guesswork (and there are many more possibilities), but at least the first two examples are
_typical_ causes of debug/release differences, and also of 32/64-bit differences.
So just some ideas to start with.

Now if there is a bug in this category, then I think the only way to track it down
is to enable/disable parts of the code step by step.
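The padding in guess 3 can be checked directly. The sketch below is an illustration, not code from any engine: it uses the `<stdint.h>` fixed-width types in place of the `ui08_t`/`ui16_t`/`ui32_t` typedefs (an assumption), and the struct and function names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Same field order as the test_t example, with <stdint.h> types standing
   in for ui08_t/ui16_t/ui32_t (an assumption for this sketch). */
struct test_padded {
    uint8_t  a;   /* followed by padding so that b is 4-byte aligned */
    uint32_t b;
    uint16_t c;
    uint8_t  d;   /* followed by tail padding */
};

/* Same fields, largest first, so no padding is needed between them. */
struct test_reordered {
    uint32_t b;
    uint16_t c;
    uint8_t  a;
    uint8_t  d;
};

/* Sum of the member sizes: the size the struct would have with no padding. */
size_t test_payload_bytes(void)
{
    return sizeof(uint8_t) + sizeof(uint32_t) + sizeof(uint16_t) + sizeof(uint8_t);
}

/* Bytes the compiler added to struct test_padded for alignment. */
size_t test_padding_bytes(void)
{
    return sizeof(struct test_padded) - test_payload_bytes();
}
```

With the usual 4-byte alignment for `uint32_t`, the first layout occupies 12 bytes and the reordered one 8, so any code that hard-codes the size instead of using `sizeof` will compute the wrong entry count.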

Michael
JVMerlino
Posts: 1357
Joined: Wed Mar 08, 2006 10:15 pm
Location: San Francisco, California

Re: 64-bit and 32-bit exes producing different results

Post by JVMerlino »

I understood you perfectly, so no problem. :) I know that there is no problem with your example #1, and I'm aware of the issue with your example #3, but not entirely sure how it might cause problems.

Andrew determined that if you turn off my hash code completely, the problem goes away. So now I have a place to target my efforts, since my hash code is pretty simple.

Just to make sure, does this code look like it might cause a problem?

Code:

void SaveHash(CHESSMOVE *cmMove, int nDepth, int nEval, BYTE nFlags, int nPly, PosSignature dwSignature)
{
    PosSignature	index = (dwSignature & (dwHashSize - 1));
    HASH_ENTRY		*pentry = HashTable + index;
....
where PosSignature is defined as a DWORD in both 32-bit and 64-bit, and HASH_ENTRY is defined as:

Code:

typedef struct HASH_ENTRY
{
    PosSignature	dwSignature;
    WORD			nAge;
    short			nEval;
    MoveFlagType	moveflag;
    BYTE			nFlags;
    BYTE			nDepth;
    SquareType		from;
    SquareType		to;
} HASH_ENTRY;
and MoveFlagType is an unsigned short and SquareType is an unsigned char. So, without padding, one hash entry is 14 bytes, and it is padded up to 16 bytes?

dwHashSize is the total number of entries in the hash table.
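One way to confirm the 14-versus-16-byte arithmetic in both builds is to let the compiler check it. The sketch below is only an illustration under stated assumptions: the `<stdint.h>` types stand in for the Windows `DWORD`/`WORD`/`BYTE` typedefs, and `hash_entry_is_16_bytes`, `hash_entry_size` and `hash_entry_payload` are hypothetical names that do not come from Myrddin.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the Windows/engine typedefs (assumptions for this sketch). */
typedef uint32_t PosSignature;  /* DWORD */
typedef uint16_t MoveFlagType;  /* unsigned short */
typedef uint8_t  SquareType;    /* unsigned char */

typedef struct HASH_ENTRY
{
    PosSignature dwSignature;
    uint16_t     nAge;      /* WORD */
    int16_t      nEval;     /* short */
    MoveFlagType moveflag;
    uint8_t      nFlags;    /* BYTE */
    uint8_t      nDepth;    /* BYTE */
    SquareType   from;
    SquareType   to;
} HASH_ENTRY;

/* Compile-time check that also works on older C compilers: the array size
   becomes -1 (a compile error) if the entry is not 16 bytes on this target. */
typedef char hash_entry_is_16_bytes[(sizeof(HASH_ENTRY) == 16) ? 1 : -1];

/* Size of one entry as the compiler actually lays it out. */
size_t hash_entry_size(void)
{
    return sizeof(HASH_ENTRY);
}

/* Sum of the member sizes, i.e. the size without any padding. */
size_t hash_entry_payload(void)
{
    return sizeof(PosSignature) + 2 * sizeof(uint16_t) + sizeof(MoveFlagType)
         + 2 * sizeof(uint8_t) + 2 * sizeof(SquareType);
}
```

If the same check compiles in both the 32-bit and the 64-bit configuration, the entry size itself is not what differs between them.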

Many thanks,
jm
rbarreira
Posts: 900
Joined: Tue Apr 27, 2010 3:48 pm

Re: 64-bit and 32-bit exes producing different results

Post by rbarreira »

How do you calculate dwHashSize? Hopefully you're not assuming that sizeof (HASH_ENTRY) is a power of 2, or that it is the same for both versions.
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: 64-bit and 32-bit exes producing different results

Post by Desperado »

Just a quick idea before i go to bed :) .

The problem may be caused by the _&_ operation when
_dwHashSize_ is no longer a power of 2.

My example is padded to 12 bytes (not 8). If I used it without
knowing about the issue, my _dwHashSize_ would not be a power of 2
and the & operation would fail.
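A minimal sketch of the power-of-two concern, assuming the table is sized from a byte budget: the helper names (`is_power_of_two`, `floor_power_of_two`, `hash_entry_count`, `hash_index`) are hypothetical and not taken from Myrddin. The idea is to round the entry count down to a power of two before using `signature & (count - 1)` as the index.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if n has exactly one bit set. */
int is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Largest power of two that is <= n, or 0 when n is 0. */
uint32_t floor_power_of_two(uint32_t n)
{
    uint32_t p = 1;
    if (n == 0)
        return 0;
    while (p <= n / 2)
        p *= 2;
    return p;
}

/* Number of entries that fit in `bytes`, rounded down to a power of two so
   that masking with (count - 1) can reach every slot. */
uint32_t hash_entry_count(size_t bytes, size_t entry_size)
{
    size_t raw;
    if (entry_size == 0)
        return 0;
    raw = bytes / entry_size;
    if (raw > UINT32_MAX)
        raw = UINT32_MAX;
    return floor_power_of_two((uint32_t)raw);
}

/* Slot index for a signature; only valid when entry_count is a power of two. */
uint32_t hash_index(uint32_t signature, uint32_t entry_count)
{
    return signature & (entry_count - 1);
}
```

For instance, a 1 MB budget with 12-byte entries gives 87381 raw entries, which this sketch rounds down to 65536. Masking with 87381 - 1 instead would stay in bounds but could never produce an odd index, so half the table would go unused.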

Michael