Moderator: Ras
sje wrote:
When compiling using g++ 4.x for a 32 bit x86 target, I use the following code for FindFirstZero:
Code:
#if (CTHostCpuX86 && CTArchBits32 && CTAllowAssembly)
static __inline__ unsigned int __ffz32(unsigned int theB32)
{
  /* bsfl finds the lowest set bit of ~theB32, i.e. the lowest zero bit of theB32 */
  __asm__("bsfl %1,%0" : "=r" (theB32) : "r" (~theB32));
  return theB32;
}
#endif
And it works. However, I'm having some difficulty writing a 64 bit version. Any clues for the clueless?

bob wrote:
Use "bsfq/bsrq" (quadword format)...
The only danger is if the argument is zero when you call the function. The result is undefined, and most x86 processors do not change the destination register in that case. The inline64.h file in Crafty has MSB/LSB functions that do exactly this, but they return 64 if no bit is set...

sje wrote:
The zero check is made in the caller.
Questions:
1) Doesn't the formal parameter declaration have to be changed to unsigned long long (64 bit)?
2) But we still want to return an unsigned int (32 bit), so doesn't the return statement have to be modified?
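Both questions can be illustrated with a minimal sketch. This is not code posted by anyone in the thread: the name ffz64, the use of uint64_t, and the fallback to the GCC/Clang builtin __builtin_ctzll are assumptions. The parameter widens to a 64 bit type, while the result is a bit index from 0 to 63 (or 64 for "no zero bit"), so it still fits in an unsigned int and only needs a cast. The all-ones check is done inside the function here only so the example is safe to run on its own.

```c
#include <stdint.h>

/* Hypothetical 64 bit counterpart of __ffz32: index of the lowest zero bit.
   Returns 64 when the value has no zero bit (all ones). */
static inline unsigned int ffz64(uint64_t value)
{
    uint64_t inverted = ~value;      /* zero bits become one bits */
    if (inverted == 0)
        return 64;                   /* bsf is undefined for a zero source */
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t index;
    /* bsfq scans the 64 bit source for its lowest set bit */
    __asm__("bsfq %1,%0" : "=r" (index) : "r" (inverted) : "cc");
    return (unsigned int) index;     /* 0..63 always fits in 32 bits */
#else
    /* Assumption: a GCC or Clang compatible compiler on other targets */
    return (unsigned int) __builtin_ctzll(inverted);
#endif
}
```

Dropping the check and leaving it to the caller, as in the 32 bit version, would be a one-line change.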
bob wrote:
On a 64 bit machine, you don't need long long; "long" will do the trick, since the machine has 64 bit words. But since Microsoft decided many years ago that a 16 bit value was a word, and then that a 32 bit value must be a doubleword, we are now left with quadwords on 64 bit machines...
Here's my MSB inline:
Code:
int static __inline__ MSB(long word)
{
  long dummy, dummy2;
  asm(" bsrq %1, %0" "\n\t"     /* index of highest set bit; ZF is set if word is 0 */
      " jnz 1f" "\n\t"          /* a bit was found: keep the index */
      " movq $64, %0" "\n\t"    /* no bit set: return 64 */
      "1:"
      : "=&r"(dummy), "=&r"(dummy2)
      : "1"((long) (word))
      : "cc");
  return (dummy);
}
BTW, why would you want to return a 32 bit value on a 64 bit architecture? It isn't any faster, and the native registers are 64 bits on the x86-64 processors.
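The same behavior (index of the highest set bit, 64 when no bit is set) can also be sketched without inline assembly. This assumes a GCC or Clang compatible compiler for __builtin_clzll, and the name msb64 is hypothetical. Using uint64_t instead of long avoids relying on long being 64 bits wide, which holds on LP64 systems such as 64 bit Linux but not on 64 bit Windows.

```c
#include <stdint.h>

/* Hypothetical portable variant of MSB(): index of the highest set bit,
   or 64 if the word is zero. */
static inline int msb64(uint64_t word)
{
    if (word == 0)
        return 64;                       /* the builtin is undefined for 0 */
    return 63 - __builtin_clzll(word);   /* leading zeros -> bit index */
}
```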
Dann Corbit wrote:
Code:
/*
 AMD Opteron inline functions for MSB(), LSB() and
 PopCnt(). Note that these are 64 bit functions and they use
 64 bit (quad-word) X86-64 instructions.
*/
int static __inline__ MSB(long word)
{
  long dummy, dummy2;
  asm(" bsrq %1, %0" "\n\t"
      " jnz 1f" "\n\t"
      " movq $64, %0" "\n\t"
      "1:"
      : "=&r"(dummy), "=&r"(dummy2)
      : "1"((long) (word))
      : "cc");
  return (dummy);
}
int static __inline__ LSB(long word)
{
  long dummy, dummy2;
  asm(" bsfq %1, %0" "\n\t"
      " jnz 1f" "\n\t"
      " movq $64, %0" "\n\t"
      "1:"
      : "=&r"(dummy), "=&r"(dummy2)
      : "1"((long) (word))
      : "cc");
  return (dummy);
}
int static __inline__ PopCnt(long word)
{
long dummy, dummy2, dummy3;
asm(" xorq %0, %0" "\n\t" " testq %1, %1" "\n\t" "
"=&r"
(dummy3)
: "1"((long) (word))
: "cc");
return (dummy);
}
I guess that once he sees how you did it, he will know what to do.
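The PopCnt() listing is cut off in this copy of the thread, so the following is not a reconstruction of it. It is only a minimal sketch of one common way to count bits, clearing the lowest set bit on each pass, together with a builtin-based LSB that keeps the "return 64 if no bit is set" convention. The names popcnt64 and lsb64 are hypothetical, and __builtin_ctzll assumes a GCC or Clang compatible compiler.

```c
#include <stdint.h>

/* Hypothetical portable population count: one loop pass per set bit. */
static inline int popcnt64(uint64_t word)
{
    int count = 0;
    while (word != 0) {
        word &= word - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Hypothetical portable LSB: index of the lowest set bit, or 64 for zero. */
static inline int lsb64(uint64_t word)
{
    return word ? __builtin_ctzll(word) : 64;
}
```

The loop runs once per set bit, so it is cheap for sparse bitboards; on hardware with a population count instruction, a compiler builtin is likely the better choice.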