long is 32-bit even with a 64-bit version of Windows.bob wrote:Which is OK since pointers are also 32 bits in 32 bit O/S....rbarreira wrote:Better yet, use "long long" or uint64_t from stdint.h.bob wrote:I'd prefer (long) to be 64 bit safe... that is what I use...rbarreira wrote:Casting a pointer to an int looks like an overflow waiting to happen. Or am I missing something?hgm wrote:It is quite normal that hash probes are slow: they need DRAM access, which takes many hundreds of CPU clock cycles. Probing the same entry twice is not much more expensive, though, as the second time it will be in cache. So this is in accordance with what you see.
If your results are reasonable depends on too many things to say. For instance how elaborate your evaluation is. And how often you probe.
Non-alignment of a hash bucket with the cache line can cause extra DRAM accesses, if in one probe you access two cache lines (requiring two dRAM accesses to fetch, possibly triggering prefetch of a third). To know, print the start adress of your hash table, and check if it is devisable by 64 (printf("hash start = %x\n", hashTable);).
To force alignment, use code like:
Code: Select all
hashMem = (Bcket *) malloc(nrOfBuckets + 1, sizeof Bucket); hashTable = (Bucket *) (((int)hashMem + 63) & ~63);
long is still 32-bit in Windows world.
Is a querying the hash tables such a huge bottleneck?
Moderators: hgm, Rebel, chrisw
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: Is a querying the hash tables such a huge bottleneck?
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Is a querying the hash tables such a huge bottleneck?
Can't be unless you use a 32 bit compiler. Long is 64 bits on every 64 bit system I have ever run on, from Cray to present. "long long" is not an ANSI/POSIX standard, it is something gcc added way back to support 64 bit ints on machines that had 64 bit data types even when they had 32 bit address spaces.rbarreira wrote:long is 32-bit even with a 64-bit version of Windows.bob wrote:Which is OK since pointers are also 32 bits in 32 bit O/S....rbarreira wrote:Better yet, use "long long" or uint64_t from stdint.h.bob wrote:I'd prefer (long) to be 64 bit safe... that is what I use...rbarreira wrote:Casting a pointer to an int looks like an overflow waiting to happen. Or am I missing something?hgm wrote:It is quite normal that hash probes are slow: they need DRAM access, which takes many hundreds of CPU clock cycles. Probing the same entry twice is not much more expensive, though, as the second time it will be in cache. So this is in accordance with what you see.
If your results are reasonable depends on too many things to say. For instance how elaborate your evaluation is. And how often you probe.
Non-alignment of a hash bucket with the cache line can cause extra DRAM accesses, if in one probe you access two cache lines (requiring two dRAM accesses to fetch, possibly triggering prefetch of a third). To know, print the start adress of your hash table, and check if it is devisable by 64 (printf("hash start = %x\n", hashTable);).
To force alignment, use code like:
Code: Select all
hashMem = (Bcket *) malloc(nrOfBuckets + 1, sizeof Bucket); hashTable = (Bucket *) (((int)hashMem + 63) & ~63);
long is still 32-bit in Windows world.
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: Is a querying the hash tables such a huge bottleneck?
You are wrong, and you could have realized that with a 10-seconds google search:bob wrote:Can't be unless you use a 32 bit compiler. Long is 64 bits on every 64 bit system I have ever run on, from Cray to present. "long long" is not an ANSI/POSIX standard, it is something gcc added way back to support 64 bit ints on machines that had 64 bit data types even when they had 32 bit address spaces.rbarreira wrote: long is 32-bit even with a 64-bit version of Windows.
http://stackoverflow.com/questions/3845 ... it-windows
http://portabilityblog.com/blog/archive ... flong.html
Also, the C99 standard has the long long type.
-
- Posts: 287
- Joined: Mon Mar 13, 2006 5:23 pm
- Location: Québec
Re: Is a querying the hash tables such a huge bottleneck?
humm, nice hack. I would not have though of that.bob wrote:malloc() 63 bytes more than you need. Then add 63 to the address malloc() returns, and AND that with ~63 (0xffffffc0). Now the address of the first hash entry is on a 64-byte boundary, and each group of 4 will go exactly into one cache line with no overlap.mathmoi wrote:How do you do that?bob wrote:One more question: Do you force the "bucket" to lie on a 64 byte address boundary so that you are only incurring one hash line fill, as opposed to two?
Mathieu Pagé
mathieu@mathieupage.com
mathieu@mathieupage.com
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Is a querying the hash tables such a huge bottleneck?
Crafty uses "typedef BITBOARD long" on systems that work with "HAS_64BITS". I am using both Intel and GCC on my 64 bit linux boxes, and I can _guarantee_ you that a "long" is 64 bits:rbarreira wrote:You are wrong, and you could have realized that with a 10-seconds google search:bob wrote:Can't be unless you use a 32 bit compiler. Long is 64 bits on every 64 bit system I have ever run on, from Cray to present. "long long" is not an ANSI/POSIX standard, it is something gcc added way back to support 64 bit ints on machines that had 64 bit data types even when they had 32 bit address spaces.rbarreira wrote: long is 32-bit even with a 64-bit version of Windows.
http://stackoverflow.com/questions/3845 ... it-windows
http://portabilityblog.com/blog/archive ... flong.html
Also, the C99 standard has the long long type.
using intel C compiler on 64 bit linux:
int main() {
int x;
long y;
printf("sizeof(x)=%d sizeof(y)=%d\n", sizeof(x), sizeof(y));
}
scrappy% ./tst
sizeof(x)=4 sizeof(y)=8
For gcc:
scrappy% ./tst
sizeof(x)=4 sizeof(y)=8
So I am not "wrong" at all. I suppose it is possible that "long long" was added to the POSIX standard at some point since every vendor had added that. But it is not needed as you can clearly see above.
Same code on windows produces the same results for 64 bit Windows Vista running gcc (my home box).. I don't have any 32 bit machines but did in the past, and there "long" was 32 bits as was "int"...
But in the above, the point was not about "long long" but about the simple fact that on 64 bit machines, long is 64 bits, int could be 32 or 64 (was 64 on Cray, is 32 on X86_64, was 64 on most non-Intel 64 bit processors such as SPARC and MIPS)
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: Is a querying the hash tables such a huge bottleneck?
Why are you making test programs on linux when we're talking about Windows?bob wrote:Crafty uses "typedef BITBOARD long" on systems that work with "HAS_64BITS". I am using both Intel and GCC on my 64 bit linux boxes, and I can _guarantee_ you that a "long" is 64 bits:rbarreira wrote:You are wrong, and you could have realized that with a 10-seconds google search:bob wrote:Can't be unless you use a 32 bit compiler. Long is 64 bits on every 64 bit system I have ever run on, from Cray to present. "long long" is not an ANSI/POSIX standard, it is something gcc added way back to support 64 bit ints on machines that had 64 bit data types even when they had 32 bit address spaces.rbarreira wrote: long is 32-bit even with a 64-bit version of Windows.
http://stackoverflow.com/questions/3845 ... it-windows
http://portabilityblog.com/blog/archive ... flong.html
Also, the C99 standard has the long long type.
using intel C compiler on 64 bit linux:
int main() {
int x;
long y;
printf("sizeof(x)=%d sizeof(y)=%d\n", sizeof(x), sizeof(y));
}
scrappy% ./tst
sizeof(x)=4 sizeof(y)=8
For gcc:
scrappy% ./tst
sizeof(x)=4 sizeof(y)=8
So I am not "wrong" at all. I suppose it is possible that "long long" was added to the POSIX standard at some point since every vendor had added that. But it is not needed as you can clearly see above.
Same code on windows produces the same results for 64 bit Windows Vista running gcc (my home box).. I don't have any 32 bit machines but did in the past, and there "long" was 32 bits as was "int"...
But in the above, the point was not about "long long" but about the simple fact that on 64 bit machines, long is 64 bits, int could be 32 or 64 (was 64 on Cray, is 32 on X86_64, was 64 on most non-Intel 64 bit processors such as SPARC and MIPS)
I just looked at the crafty source code and this is what you have:
Code: Select all
# if defined(HAS_64BITS)
typedef unsigned long BITBOARD;
# elif defined(NT_i386)
typedef unsigned __int64 BITBOARD;
# else
typedef unsigned long long BITBOARD;
# endif
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Is a querying the hash tables such a huge bottleneck?
DId you _read_ my last comment? Windows vista 64 bits + 64 bit gcc gives the expected sizeof(long) = 8. So I _did_ test windows. I don't use the microsoft C++ compiler so I have no idea what they do. But anyone that uses sizeof(long)=anything_other_than_8 on a 64 bit architecture is certainly brain-dead. But MS has never let that stop them in the past...rbarreira wrote:Why are you making test programs on linux when we're talking about Windows?bob wrote:Crafty uses "typedef BITBOARD long" on systems that work with "HAS_64BITS". I am using both Intel and GCC on my 64 bit linux boxes, and I can _guarantee_ you that a "long" is 64 bits:rbarreira wrote:You are wrong, and you could have realized that with a 10-seconds google search:bob wrote:Can't be unless you use a 32 bit compiler. Long is 64 bits on every 64 bit system I have ever run on, from Cray to present. "long long" is not an ANSI/POSIX standard, it is something gcc added way back to support 64 bit ints on machines that had 64 bit data types even when they had 32 bit address spaces.rbarreira wrote: long is 32-bit even with a 64-bit version of Windows.
http://stackoverflow.com/questions/3845 ... it-windows
http://portabilityblog.com/blog/archive ... flong.html
Also, the C99 standard has the long long type.
using intel C compiler on 64 bit linux:
int main() {
int x;
long y;
printf("sizeof(x)=%d sizeof(y)=%d\n", sizeof(x), sizeof(y));
}
scrappy% ./tst
sizeof(x)=4 sizeof(y)=8
For gcc:
scrappy% ./tst
sizeof(x)=4 sizeof(y)=8
So I am not "wrong" at all. I suppose it is possible that "long long" was added to the POSIX standard at some point since every vendor had added that. But it is not needed as you can clearly see above.
Same code on windows produces the same results for 64 bit Windows Vista running gcc (my home box).. I don't have any 32 bit machines but did in the past, and there "long" was 32 bits as was "int"...
But in the above, the point was not about "long long" but about the simple fact that on 64 bit machines, long is 64 bits, int could be 32 or 64 (was 64 on Cray, is 32 on X86_64, was 64 on most non-Intel 64 bit processors such as SPARC and MIPS)
I just looked at the crafty source code and this is what you have:
I'm guessing that Visual C++ doesn't define HAS_64BITS and it's ending up in the last typedef.Code: Select all
# if defined(HAS_64BITS) typedef unsigned long BITBOARD; # elif defined(NT_i386) typedef unsigned __int64 BITBOARD; # else typedef unsigned long long BITBOARD; # endif
BTW the crafty code you quoted has been there since 1995 or so. More recent versions use long on HAS_64BITS architectures with Linux. I've never changed the windows stuff since __int64 has been around since 1995 as well...
-
- Posts: 313
- Joined: Wed Mar 08, 2006 8:18 pm
Re: Is a querying the hash tables such a huge bottleneck?
Bob,bob wrote:DId you _read_ my last comment? Windows vista 64 bits + 64 bit gcc gives the expected sizeof(long) = 8. So I _did_ test windows. I don't use the microsoft C++ compiler so I have no idea what they do. But anyone that uses sizeof(long)=anything_other_than_8 on a 64 bit architecture is certainly brain-dead. But MS has never let that stop them in the past...
BTW the crafty code you quoted has been there since 1995 or so. More recent versions use long on HAS_64BITS architectures with Linux. I've never changed the windows stuff since __int64 has been around since 1995 as well...
You might want to read the C standard documents some times. Especially the C99 one which actually talk about 64 bit integers.
short is 16 bits.
int is defined as the natural word size. It is *not* required to be 32 or 64 bits. It can be 16 bits, even. It's up to whatever the compiler writer chooses. Implementation defined. On a 32 bit compiler, it should be 32 bits, of course. On a 64 bit system, 64 bit int's would be the more appropriate size.
long is defined as at least 32 bits. On a 64 bit system, it can be either 32 or 64 bits.
long long is defined as at least 64 bits.
So it is entirely possible for 'int' to be 16 bits and 'long' at 32 bits and 'long long' at 64 bits. Not likely, of course. But possible and standard conforming.
That's why the C99 standard set up the stdint.h header to provide a nice portable way the programmer can be guaranteed of having the right size integers when they need it.
For C89, it doesn't even know about 64 bit integers, and the compiler writer is on his own and the programmer should actually check to make sure. In this case, 'int' is often 32 and 'long' 64 bits. But that violates the C convention / requirement that 'int' is the machine word size.
Of course, we've had this discussion before, and I'm well aware you don't like the C99 standard because not all compilers (<cough> Microsoft <cough>) support it.
But don't go saying that 'long' *must* be 64 bits because it doesn't. And that would even violate tradition and break a large number of programs that expect it at 32 bits. (Yes yes, I know programmer's shouldn't write a program that expects 'int' or 'long' to be a certain size without first checking, but many programs have done so in the past 25 years. Very few people these days actually bother to check to make sure a var type is the right size. They just assume 'int' is 32 bits and 'long' is probably 32 bits too.)
I'm not saying you should change your code because it's wrong or ugly or doesn't work or anything else. I'm just saying that 'long' does *NOT* have to be 64 bits. The C standard is pretty clear about that.
If you want to say that 'long' is 64 bits on the compilers you use and so that's what you program for, then that's a perfectly fine statement.
-
- Posts: 838
- Joined: Thu Jul 05, 2007 5:03 pm
- Location: British Columbia, Canada
Re: Is a querying the hash tables such a huge bottleneck?
I think on 64-bit MS compilers, long remains 32 bits. This is for compatibility with all the retarded code out there that assumes that certain types are certain sizes.
Suggestion: Don't write retarded code that assumes certain types are certain sizes. Make your own typedefs with known size, and just define them to the right things for each platform you compile on. Using the <stdint.h> types (uint64_t etc.) is a good bet for compilers/platforms that have one. But its always nice to be able to roll your own.
The other side of this equation is that whenever you do anything that relies on the size of the result type (example: subtracting two smaller-than-type-X quantities which are meant to represent counters that can wrap around, and then sign extending it to type X) you have to be careful of the integer promotion rules. If int is 32 bits on some platforms and 64 on others, that can affect the result of intermediate calculations. So use explicit casts when it matters.
Suggestion: Don't write retarded code that assumes certain types are certain sizes. Make your own typedefs with known size, and just define them to the right things for each platform you compile on. Using the <stdint.h> types (uint64_t etc.) is a good bet for compilers/platforms that have one. But its always nice to be able to roll your own.
The other side of this equation is that whenever you do anything that relies on the size of the result type (example: subtracting two smaller-than-type-X quantities which are meant to represent counters that can wrap around, and then sign extending it to type X) you have to be careful of the integer promotion rules. If int is 32 bits on some platforms and 64 on others, that can affect the result of intermediate calculations. So use explicit casts when it matters.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Is a querying the hash tables such a huge bottleneck?
I actually have looked at this. Many times. And yes, I am aware that the standard does not say what "long" means, exactly, just that it means >= int in all cases.Carey wrote:Bob,bob wrote:DId you _read_ my last comment? Windows vista 64 bits + 64 bit gcc gives the expected sizeof(long) = 8. So I _did_ test windows. I don't use the microsoft C++ compiler so I have no idea what they do. But anyone that uses sizeof(long)=anything_other_than_8 on a 64 bit architecture is certainly brain-dead. But MS has never let that stop them in the past...
BTW the crafty code you quoted has been there since 1995 or so. More recent versions use long on HAS_64BITS architectures with Linux. I've never changed the windows stuff since __int64 has been around since 1995 as well...
You might want to read the C standard documents some times. Especially the C99 one which actually talk about 64 bit integers.
However, for as long as I can remember, "long" has always been "the longest integer data type the hardware supports." Which is both logical and rational. Cray has always had int = long since it is a 64 bit architecture. With the quirk that pointers were always 32 bits because the a-registers (used for address computations) were 32 bits (eventually, early on they were 24 bits).
But for "long" on true 64 bit systems, I have yet to find a single case where long is < 64 bits, except, apparently some version of the microsoft C compiler (I have not verified which versions, if any, or if all, follow this rather odd practice). I did, long ago, verify that at least for linux, on any 64 bit platform I have tried, long = 64 bits. For X86_64, both gcc and icc follow this. And on my home windows vista 64 box, gcc does the same.
Not much more to say on the subject. Yes, the "standard" is lousy in many ways, such as why is an int always signed by default, while a char may be signed or unsigned at the whim of the compiler writer? "long long" was a non-standard extension that I used in 1995 in the first versions of Crafty when running on a 64 bit sun Sparc as well as on 32 bit X86/linux boxes.
short is 16 bits.
int is defined as the natural word size. It is *not* required to be 32 or 64 bits. It can be 16 bits, even. It's up to whatever the compiler writer chooses. Implementation defined. On a 32 bit compiler, it should be 32 bits, of course. On a 64 bit system, 64 bit int's would be the more appropriate size.
long is defined as at least 32 bits. On a 64 bit system, it can be either 32 or 64 bits.
long long is defined as at least 64 bits.
So it is entirely possible for 'int' to be 16 bits and 'long' at 32 bits and 'long long' at 64 bits. Not likely, of course. But possible and standard conforming.
That's why the C99 standard set up the stdint.h header to provide a nice portable way the programmer can be guaranteed of having the right size integers when they need it.
For C89, it doesn't even know about 64 bit integers, and the compiler writer is on his own and the programmer should actually check to make sure. In this case, 'int' is often 32 and 'long' 64 bits. But that violates the C convention / requirement that 'int' is the machine word size.
Of course, we've had this discussion before, and I'm well aware you don't like the C99 standard because not all compilers (<cough> Microsoft <cough>) support it.
But don't go saying that 'long' *must* be 64 bits because it doesn't.
I do not believe I said "it must be 64 bits". I rather said that it has been 64 bits on _every_ 64 bit architecture and compiler combination I have tried. You can look at the Crafty Makefile to see just how comprehensive that list of architectures and O/S choices is. Apparently there is / was / could-be some MS compiler that is/was brain-dead...
Actually, I think that _is_ what I said.And that would even violate tradition and break a large number of programs that expect it at 32 bits. (Yes yes, I know programmer's shouldn't write a program that expects 'int' or 'long' to be a certain size without first checking, but many programs have done so in the past 25 years. Very few people these days actually bother to check to make sure a var type is the right size. They just assume 'int' is 32 bits and 'long' is probably 32 bits too.)
I'm not saying you should change your code because it's wrong or ugly or doesn't work or anything else. I'm just saying that 'long' does *NOT* have to be 64 bits. The C standard is pretty clear about that.
If you want to say that 'long' is 64 bits on the compilers you use and so that's what you program for, then that's a perfectly fine statement.