Martin Fierz wrote:
are the windows functions EnterCriticalSection etc. much slower than the assembly code i saw in crafty's source code?

Yes, much slower (at least 3x slower on my P4), because a critical section does a spin-wait first and then makes a slow OS call to block. Crafty's spinning assembly locks are also a tad slower under high contention because they do an interlocked exchange first and then test, rather than testing with a plain read before attempting the exchange. Here's what I'm using in Buzz:
Code:
#ifndef INLINE
#ifdef _MSC_VER
#define INLINE __forceinline
#elif defined(__GNUC__)
#define INLINE __inline__ __attribute__((always_inline))
#else
#define INLINE inline
#endif
#endif
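#if defined(__GNUC__)
typedef volatile int SpinLock[1];
typedef volatile int* const SpinLock_P;
/* Atomically swap Value into *Target and return the previous contents. */
static INLINE int LockedExchange(SpinLock_P Target, const int Value)
{
    int ret = Value;
    /* xchg with a memory operand is implicitly locked on x86.
       *Target is both read and written, so it is a "+m" operand. */
    __asm__ __volatile__
    (
        "xchgl %[ret], %[Target]"
        : [ret] "+r" (ret), [Target] "+m" (*Target)
        :
        : "memory"
    );
    return ret;
}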
#elif defined(_MSC_VER)
typedef volatile long SpinLock[1];
typedef volatile long* const SpinLock_P;
#include <intrin.h>
#pragma intrinsic (_InterlockedExchange)
#define LockedExchange(Target,Value) _InterlockedExchange(Target,Value)
#else
#error Unsupported Compiler
#endif
#define IsLocked(s) ((s)[0])
#define SetLocked(s,boolean) ((s)[0]=(boolean))
#define ResetSpinLock(s) SetLocked(s,0)
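/* Release: unlock with a plain volatile store. */
static INLINE void Release(SpinLock_P s) {SetLocked(s,0);}
/* TryLock: returns nonzero if the lock was acquired. */
static INLINE int TryLock(SpinLock_P s) {return !(IsLocked(s) || LockedExchange(s,1));}
/* Lock: spin on a plain read and only attempt the exchange once the lock looks free. */
static INLINE void Lock(SpinLock_P s) {while(IsLocked(s) || LockedExchange(s,1));}

http://www.intel.com/cd/ids/developer/a ... 333935.htm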
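
For anyone not on x86 or not wanting inline assembly, the same test-then-exchange idea can be sketched with C11 atomics and pthreads. This is only a rough illustration, not what Buzz or Crafty actually use: the names TtasLock, ttas_lock, ttas_unlock, worker and run_counter_demo are made up for the example, and it assumes a compiler with <stdatomic.h> and a POSIX threads library.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names for illustration; not taken from Buzz or Crafty. */
typedef struct { atomic_int locked; } TtasLock;

static void ttas_lock(TtasLock *l)
{
    for (;;) {
        /* Cheap read-only test first, so waiting threads do not
           keep issuing atomic exchanges while the lock is held. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;
        /* The lock looked free: now try the atomic exchange. */
        if (!atomic_exchange_explicit(&l->locked, 1, memory_order_acquire))
            return;
    }
}

static void ttas_unlock(TtasLock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

enum { NUM_THREADS = 4, ITERATIONS = 100000 };

static TtasLock counter_lock;   /* static storage: starts unlocked (zero) */
static long shared_counter;     /* protected by counter_lock */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        ttas_lock(&counter_lock);
        shared_counter++;
        ttas_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts the workers, waits for them, and returns the final counter. */
static long run_counter_demo(void)
{
    pthread_t threads[NUM_THREADS];
    shared_counter = 0;
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, worker, NULL);
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    return shared_counter;
}
```

If the lock works, run_counter_demo() returns NUM_THREADS * ITERATIONS every time; without the lock the increments would race and the total would usually come up short.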
Martin Fierz wrote:
is there anything (paper, website etc) that describes some fundamentals of multi-threaded game playing programs? i found a thread which said that the thesis by valavan manohararajah was a good starting point, so i read it - but i don't think it's a good starting point at all

In my opinion, this is the best starting place:
http://www.netlib.org/utk/lsi/pcwLSI/text/node351.html
Martin Fierz wrote:
no pseudocode for the actual splitting operations, no discussion of sharing stuff etc. is there anything else, preferably on a lower level?

It's slow and boring, but for this I read up online on the Windows API and the POSIX Threads API.
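
To give a feel for the lowest level of the POSIX Threads API, here is a minimal sketch of handing slices of work to threads and collecting the results. It is not a search split from any engine, just the bare create/join pattern; the names Slice, sum_slice, parallel_sum and WORKERS are made up for the example.

```c
#include <pthread.h>
#include <stddef.h>

enum { WORKERS = 4 };

/* Hypothetical work descriptor: each thread gets its own slice,
   so no locking is needed while the threads run. */
typedef struct {
    const int *data;
    size_t begin, end;   /* half-open range [begin, end) */
    long result;         /* written only by the owning thread */
} Slice;

static void *sum_slice(void *arg)
{
    Slice *s = arg;
    long total = 0;
    for (size_t i = s->begin; i < s->end; i++)
        total += s->data[i];
    s->result = total;
    return NULL;
}

/* Splits data[0..n) across WORKERS threads and adds up the partial sums. */
static long parallel_sum(const int *data, size_t n)
{
    pthread_t threads[WORKERS];
    Slice slices[WORKERS];
    long total = 0;

    for (int t = 0; t < WORKERS; t++) {
        slices[t].data = data;
        slices[t].begin = n * t / WORKERS;
        slices[t].end = n * (t + 1) / WORKERS;
        slices[t].result = 0;
        pthread_create(&threads[t], NULL, sum_slice, &slices[t]);
    }
    for (int t = 0; t < WORKERS; t++) {
        /* pthread_join makes the worker's writes visible here. */
        pthread_join(threads[t], NULL);
        total += slices[t].result;
    }
    return total;
}
```

The point of the design is that each thread owns its own Slice, so the only synchronization needed is the join at the end. Sharing state while the threads are still running is where locks like the one above come in.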
Martin Fierz wrote:
cheers
martin

Good luck with your parallel engine.

