ChessUSA.com TalkChess.com
Hosted by Your Move Chess & Games
 

microsecond-accurate timing on Windows
Martin Sedlak



Joined: 26 Nov 2010
Posts: 701

PostPosted: Mon May 28, 2012 10:20 am    Post subject: microsecond-accurate timing on Windows Reply to topic Reply with quote

I have been thinking recently about time measurement on Windows. The options I know of:
GetTickCount: has a granularity of ~15 msec, which is bad.
timeGetTime: depends on the last period set with timeBeginPeriod. The docs say it is system-global, so if another app calls timeBeginPeriod(40) while your program is running, you immediately get worse granularity than GetTickCount. It is also said that this affects thread scheduling, so better keep hands off it.
QueryPerformanceCounter/QueryPerformanceFrequency: ultra high resolution; if I remember correctly, these calls used to take longer than GetTickCount. Not always available.
rdtsc instruction: very fast, but there may be problems with multiple cores and dynamic changes in CPU frequency; nonportable.
Are there better alternatives I missed?

Now what if I want to measure in microseconds instead of the units QPF provides (usually close to the CPU clock frequency)?
Here's my solution using QPC/QPF. Note that it's fairly slow on my 32-bit machine: getMicrosec() itself takes about 1-2 microseconds, which is already a lot.
I left out some implementation details, but I guess it's self-explanatory. Posting it in case it's useful to someone:

Code:

static volatile signed char init = 0;
static i64 freq;
static u32 shortFreq;
static i64 lastTick;
static i64 remainder = 0;
static i64 emul = 0;
static Mutex usMutex;
static Mutex gMutex;
static i32 lastTC;

i64 getMicrosec()
{
   if ( init == -1 )
   {
      LARGE_INTEGER tmp;
      QueryPerformanceCounter( &tmp );
      i64 cur = (i64)tmp.QuadPart;
      {
         MutexLock _( usMutex );
         i64 delta = cur - lastTick + remainder;
         i64 us = delta * 1000000 / shortFreq;
         emul += us;
         remainder = delta - us * freq / 1000000;
         lastTick = cur;
         return emul;
      }
   }
   if ( init == 1 )
   {
      LARGE_INTEGER tmp;
      QueryPerformanceCounter( &tmp );
      i64 cur = (i64)tmp.QuadPart;
      {
         MutexLock _( usMutex );
         i64 delta = cur - lastTick + remainder;
         i64 us = delta * 1000000 / freq;
         emul += us;
         remainder = delta - us * freq / 1000000;
         lastTick = cur;
         return emul;
      }
   }
   if ( init == 2 )
   {
      MutexLock _( usMutex );
      i32 cur = (i32)GetTickCount();
      i32 delta = cur - lastTC;
      i64 tmp = (i64)delta * 1000;
      lastTC = cur;
      return emul += tmp;
   }
   
   // initialize

   MutexLock _( gMutex );

   LARGE_INTEGER frq;
   if ( QueryPerformanceFrequency( &frq ) == FALSE )
   {
      // not available => use millisecond emulation
      init = 2;
      lastTC = GetTickCount();
   } else {
      freq = (i64)frq.QuadPart;
      LARGE_INTEGER tmp;
      QueryPerformanceCounter( &tmp );
      lastTick = (i64)tmp.QuadPart;

      if ( freq <= 0xffffffffU )
      {
         shortFreq = (u32)freq;
         init = -1;
      }
      else
      {
         init = 1;
      }
   }
   return getMicrosec();
}

i32 getMillisec()
{
       return (i32)(getMicrosec()/1000 & 0xffffffffU);
}


Basically this behaves like GetTickCount, except that it has microsecond resolution and a 64-bit output counter. getMillisec() is the 1-millisecond version.
If the performance counter is not available, the code falls back to emulation using GetTickCount().
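
Editor's sketch, not part of the original post: as far as I can tell, the remainder above is derived from an already truncated microsecond value, so a fraction of a tick can end up counted again on the next call. One way to avoid that is to carry the exact remainder of the division instead. The struct name TickToMicros and its members are hypothetical, and the sketch assumes the frequency is below roughly 1.8e13 ticks per second so that the multiplication cannot overflow 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the post): converts raw counter ticks to
// microseconds while carrying the exact division remainder between calls.
struct TickToMicros
{
    uint64_t freq;      // ticks per second, e.g. from QueryPerformanceFrequency
    uint64_t lastTick;  // counter value seen at the previous update
    uint64_t frac;      // leftover numerator from the last division, always < freq
    uint64_t micros;    // accumulated microseconds

    TickToMicros(uint64_t frequency, uint64_t startTick)
        : freq(frequency), lastTick(startTick), frac(0), micros(0)
    {
        assert(freq > 0);
    }

    uint64_t update(uint64_t curTick)
    {
        uint64_t delta = curTick - lastTick;  // unsigned, so a counter wrap is harmless
        lastTick = curTick;

        // Handle whole seconds first so the multiplication below stays small.
        micros += (delta / freq) * 1000000u;
        delta %= freq;

        // Invariant: micros * freq + frac == totalTicks * 1000000
        uint64_t num = delta * 1000000u + frac;
        micros += num / freq;
        frac = num % freq;
        return micros;
    }
};
```

With this form, calling update() very often or very rarely gives the same accumulated value for the same final tick count, because nothing is discarded between calls. Thread safety is left out here; a mutex around update() would be needed as in the posted code.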
Martin Sedlak



Joined: 26 Nov 2010
Posts: 701

PostPosted: Mon May 28, 2012 12:56 pm    Post subject: Re: microsecond-accurate timing on Windows Reply to topic Reply with quote

Final version: improved getMicrosec() stability, wraparound tested (both pos=>neg and neg=>pos in 2's complement).
Fixed getMillisec(), wraparound tested.

Code:

static volatile signed char init = 0;
static u64 freq;
static u32 shortFreq;
static u64 lastTick;
static u64 remainder = 0;
static u64 emul = 0;   //-1000000 + 0*0x7fffffffffffffffLL;
static Mutex usMutex;
static Mutex gMutex;
static u32 lastTC;

i64 getMicrosec()
{
   if ( init == -1 )
   {
      LARGE_INTEGER tmp;
      QueryPerformanceCounter( &tmp );
      u64 cur = (u64)tmp.QuadPart;
      {
         MutexLock _( usMutex );
         u64 delta = cur - lastTick + remainder;
         if ( delta >= shortFreq )            // enhance stability if interval is longer than one sec
         {
            emul += (delta/shortFreq) * 1000000U;
            delta %= shortFreq;
         }
         u64 us = delta * 1000000U / shortFreq;
         emul += us;
         remainder = delta - us * freq / 1000000U;
         lastTick = cur;
         return (i64)emul;
      }
   }
   if ( init == 1 )
   {
      LARGE_INTEGER tmp;
      QueryPerformanceCounter( &tmp );
      u64 cur = (u64)tmp.QuadPart;
      {
         MutexLock _( usMutex );
         u64 delta = cur - lastTick + remainder;
         if ( delta >= freq )               // enhance stability if interval is longer than one sec
         {
            emul += (delta/freq) * 1000000U;
            delta %= freq;
         }
         u64 us = delta * 1000000U / freq;
         emul += us;
         remainder = delta - us * freq / 1000000U;
         lastTick = cur;
         return (i64)emul;
      }
   }
   if ( init == 2 )
   {
      MutexLock _( usMutex );
      u32 cur = (u32)GetTickCount();
      u32 delta = cur - lastTC;
      u64 tmp = (u64)delta * 1000;
      lastTC = cur;
      return (i64)(emul += tmp);
   }
   
   // initialize
   {
      MutexLock _( gMutex );

      LARGE_INTEGER frq;
      if ( QueryPerformanceFrequency( &frq ) == FALSE )
      {
         // not available => use millisecond emulation
         init = 2;
         lastTC = (u32)GetTickCount();
      } else {
         freq = (u64)frq.QuadPart;
         LARGE_INTEGER tmp;
         QueryPerformanceCounter( &tmp );
         lastTick = (u64)tmp.QuadPart;

         if ( freq <= 0xffffffffU )
         {
            shortFreq = (u32)freq;
            init = -1;
         }
         else
         {
            init = 1;
         }
      }
   }
   return getMicrosec();
}


Code:

static Mutex counterMutex;
static u64 counterUs = 0;
static u32 counterMs = 0;
static int counterInit = 0;

i32 getMillisec()
{
   u64 us = (u64)getMicrosec();
   MutexLock _( counterMutex );
   if ( !counterInit )
   {
      counterInit = 1;
      counterUs = us;
      return counterMs;
   }
   u64 delta = us - counterUs;
   if ( delta /= 1000 )
   {
      counterMs += (u32)(delta & 0xffffffff);
      counterUs += delta * 1000;
   }
   return (i32)counterMs;
}
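
Editor's sketch, not part of the original post: the wraparound behaviour mentioned above relies on unsigned subtraction being modulo 2^32, which gives the right elapsed time across a single counter wrap. The function names elapsedMs and wrapExample are hypothetical and only illustrate that property.

```cpp
#include <cassert>
#include <cstdint>

// Elapsed milliseconds between two readings of a 32-bit tick counter.
// Unsigned subtraction wraps modulo 2^32, so the result stays correct
// as long as the counter wrapped at most once between the readings.
inline uint32_t elapsedMs(uint32_t earlier, uint32_t later)
{
    return (uint32_t)(later - earlier);
}

// Small usage example: a reading taken just before the wrap and one just after.
inline void wrapExample()
{
    uint32_t before = 0xFFFFFFF0u;  // 16 ms before the counter wraps
    uint32_t after  = 0x00000010u;  // 16 ms after the wrap
    assert(elapsedMs(before, after) == 32u);
}
```

Intervals of 2^32 ms or more (about 49.7 days) cannot be told apart from shorter ones with a 32-bit counter, which is one reason to keep the 64-bit microsecond counter as the primary source.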
Vincent Diepeveen



Joined: 09 Mar 2006
Posts: 1738
Location: The Netherlands

PostPosted: Mon May 28, 2012 2:44 pm    Post subject: Re: microsecond-accurate timing on Windows Reply to topic Reply with quote

from what i understand GetTickCount() in kernel of windows is a simple register move. so it gives time in milliseconds but should be accurate at several nanoseconds, as the latency between the register move and when you receive it is really little. The explanation for that is a plausible one.

With an atomic clock attached you can measure the real accuracy.

In general most systems ugh out if you do too many timing calls per second. Especially big supercomputers which have special processors for time (clock processors), so be careful when measuring.

So when i want to have a CPU spin another round for a few microseconds, what i do is at startup measure how long X spins take with X rather big. Then you divide that back to a bunch of microseconds. You'll deal with some overflows so need 64 bits math to get it right; yet it'll do the job more than ok. Using GetTickCount() for this is more than ok.
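
Editor's sketch, not part of the original post: the calibration idea described above could look roughly like this. It swaps in std::chrono::steady_clock for GetTickCount() so that it is portable, and the function names and the default trial size are assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Busy loop; the volatile sink keeps the compiler from removing it.
inline void spin(uint64_t iterations)
{
    volatile uint64_t sink = 0;
    for (uint64_t i = 0; i < iterations; ++i)
        sink = sink + 1;
}

// Times one large batch of spins and returns spins per microsecond (at least 1).
inline uint64_t calibrateSpinsPerMicrosecond(uint64_t trialSpins = 20000000)
{
    assert(trialSpins > 0);
    auto t0 = std::chrono::steady_clock::now();
    spin(trialSpins);
    auto t1 = std::chrono::steady_clock::now();
    uint64_t us = (uint64_t)std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
    if (us == 0)
        us = 1;  // guard against division by zero on a very fast run
    uint64_t perUs = trialSpins / us;
    return perUs ? perUs : 1;
}

// Spins for approximately the requested number of microseconds.
inline void spinMicroseconds(uint64_t micros, uint64_t spinsPerUs)
{
    spin(micros * spinsPerUs);
}
```

The result is only as good as the calibration run: if another process takes the CPU during calibration, or the clock speed changes afterwards, the delays will be off accordingly.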

Vincent
Martin Sedlak



Joined: 26 Nov 2010
Posts: 701

PostPosted: Mon May 28, 2012 3:22 pm    Post subject: Re: microsecond-accurate timing on Windows Reply to topic Reply with quote

diep wrote:
from what i understand GetTickCount() in kernel of windows is a simple register move. so it gives time in milliseconds but should be accurate at several nanoseconds, as the latency between the register move and when you receive it is really little. The explanation for that is a plausible one.

With an atomic clock attached you can measure the real accuracy.

Vincent


I'm not sure how it's implemented. Probably it uses some counter which gets incremented on hardware timer interrupt, who knows.
But the problem with GetTickCount is that it has a resolution/granularity of ~16 milliseconds which makes it useless for most practical purposes. Actually IMO resolution worse than 5 msecs is useless for any realtime application.

After lots of googling I figured out that QPC is not stable on all systems. Lots of people reported instability on some systems (multicore or powersaving), which would mean it uses TSC anyway which of course won't work under these conditions. So I'm definitely dropping QueryPerformanceCounter and switching to timeGetTime. I believe it has 1 msec accuracy on most Windows systems today anyway.
Vincent Diepeveen



Joined: 09 Mar 2006
Posts: 1738
Location: The Netherlands

PostPosted: Mon May 28, 2012 3:30 pm    Post subject: Re: microsecond-accurate timing on Windows Reply to topic Reply with quote

mar wrote:
diep wrote:
from what i understand GetTickCount() in kernel of windows is a simple register move. so it gives time in milliseconds but should be accurate at several nanoseconds, as the latency between the register move and when you receive it is really little. The explanation for that is a plausible one.

With an atomic clock attached you can measure the real accuracy.

Vincent


I'm not sure how it's implemented. Probably it uses some counter which gets incremented on hardware timer interrupt, who knows.
But the problem with GetTickCount is that it has a resolution/granularity of ~16 milliseconds which makes it useless for most practical purposes. Actually IMO resolution worse than 5 msecs is useless for any realtime application.

After lots of googling I figured out that QPC is not stable on all systems. Lots of people reported instability on some systems (multicore or powersaving), which would mean it uses TSC anyway which of course won't work under these conditions. So I'm definitely dropping QueryPerformanceCounter and switching to timeGetTime. I believe it has 1 msec accuracy on most Windows systems today anyway.


Let me write it again. It has granularity internal of within a nanosecond that gets converted. So if it says something has been eating 3 milliseconds, it can be 3.000 or 3.999 but not 4 milliseconds and definitely not 2.1 milliseconds.

So i'm also using this to measure how much system time Diep eats. And if i add up the times, it never has an error of more than 1 millisecond, so the information given there to me seems to be correct that is a register move.
Martin Sedlak



Joined: 26 Nov 2010
Posts: 701

PostPosted: Mon May 28, 2012 3:36 pm    Post subject: Re: microsecond-accurate timing on Windows Reply to topic Reply with quote

diep wrote:
So when i want to have a CPU spin another round for a few microseconds, what i do is at startup measure how long X spins take with X rather big.

Yes that's probably the only way to do delays at microsecond and less scale.
What happens if another process demands resources during the calibration? Could happen (in theory).

But what I want is to measure time interval.
Vincent Diepeveen



Joined: 09 Mar 2006
Posts: 1738
Location: The Netherlands

PostPosted: Mon May 28, 2012 3:39 pm    Post subject: Re: microsecond-accurate timing on Windows Reply to topic Reply with quote

mar wrote:
diep wrote:
So when i want to have a CPU spin another round for a few microseconds, what i do is at startup measure how long X spins take with X rather big.

Yes that's probably the only way to do delays at microsecond and less scale.
What happens if another process demands resources during the calibration? Could happen (in theory).

But what I want is to measure time interval.


where do you need this for?

you do realize there is special ways to debug software code using the information the cpu has? That is nanosecond accurate, same trick like kernel is using for GetTickCount()

however many cpu's nowadays throttle and turboboost and whatever, so i always prefer measurements of several seconds of whatever.
Martin Sedlak



Joined: 26 Nov 2010
Posts: 701

PostPosted: Mon May 28, 2012 5:10 pm    Post subject: Re: microsecond-accurate timing on Windows Reply to topic Reply with quote

diep wrote:
Let me write it again. It has granularity internal of within a nanosecond that gets converted. So if it says something has been eating 3 milliseconds, it can be 3.000 or 3.999 but not 4 milliseconds and definitely not 2.1 milliseconds.

It depends on the system you use, though I doubt you get 1 ms granularity with GetTickCount().

Here's what I get on XP using GetTickCount():
delta = 0 ms
delta = 16 ms
delta = 31 ms
delta = 47 ms
delta = 63 ms
...

and using timeGetTime():

delta = 0 ms
delta = 1 ms
delta = 2 ms
delta = 3 ms
delta = 4 ms
...

In other words, anything that takes less than 16 msec, measured using GetTickCount, will be reported as either 0 ms or 16 ms.
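
Editor's sketch, not part of the original post: the GetTickCount() listing above is consistent with a counter that only advances on a 64 Hz timer tick (every 15.625 ms) and reports the value rounded to the nearest millisecond. Both the tick period and the rounding are assumptions chosen to reproduce the listed values, and the function names are hypothetical. The simulation shows why a 5 ms interval reads as either 0 ms or 16 ms.

```cpp
#include <cassert>
#include <cstdint>

// Simulated coarse clock: advances only every 15625 us (assumed 64 Hz tick)
// and reports the tick time rounded to the nearest millisecond.
inline uint32_t coarseClockMs(uint64_t trueMicros)
{
    const uint64_t tickUs = 15625;
    uint64_t ticks = trueMicros / tickUs;              // whole timer ticks elapsed
    return (uint32_t)((ticks * tickUs + 500) / 1000);  // round to nearest ms
}

// An interval as it would be measured through the coarse clock.
inline uint32_t measuredMs(uint64_t startMicros, uint64_t endMicros)
{
    assert(endMicros >= startMicros);
    return coarseClockMs(endMicros) - coarseClockMs(startMicros);
}
```

For example, measuredMs(1000, 6000) yields 0 because no tick falls inside the interval, while measuredMs(13000, 18000) yields 16 because one tick does, even though both intervals are 5 ms long.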
Martin Sedlak



Joined: 26 Nov 2010
Posts: 701

PostPosted: Mon May 28, 2012 5:20 pm    Post subject: Re: microsecond-accurate timing on Windows Reply to topic Reply with quote

diep wrote:

where do you need this for?


I have an OGL app and need to measure time between frames to update logic. I get framerates around 150 so 16 ms granularity is not enough for me.

diep wrote:

you do realize there is special ways to debug software code using the information the cpu has? That is nanosecond accurate, same trick like kernel is using for GetTickCount()

Yes, you probably mean timestamp counter on pentium and later. There are also hardware debug registers to break at a certain execution address or I/O port access, memory R/W access and so on. Already a 386 could do that.

diep wrote:

however many cpu's nowadays throttle and turboboost and whatever, so i always prefer measurements of several seconds of whatever.

I agree.
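
Editor's sketch, not part of the original post: for the frame-timing use case described in this post, a minimal per-frame delta timer might look like the following. It assumes a C++11 compiler and uses std::chrono::steady_clock rather than any of the Windows calls discussed in the thread; the class name FrameTimer is hypothetical.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: measures the time between successive frames.
class FrameTimer
{
public:
    typedef std::chrono::steady_clock Clock;

    explicit FrameTimer(Clock::time_point start = Clock::now()) : last_(start) {}

    // Returns the seconds elapsed since the previous call (or since construction).
    // The time point can be passed in explicitly, which makes the class testable.
    double tick(Clock::time_point now = Clock::now())
    {
        assert(now >= last_);  // steady_clock never goes backwards
        std::chrono::duration<double> dt = now - last_;
        last_ = now;
        return dt.count();
    }

private:
    Clock::time_point last_;
};
```

In a render loop one would call tick() once per frame and feed the result to the logic update. At around 150 frames per second the expected delta is roughly 6.7 ms, so the resolution of the underlying clock on the target system still matters.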
Vincent Diepeveen



Joined: 09 Mar 2006
Posts: 1738
Location: The Netherlands

PostPosted: Mon May 28, 2012 6:23 pm    Post subject: Re: microsecond-accurate timing on Windows Reply to topic Reply with quote

mar wrote:
diep wrote:

where do you need this for?


I have an OGL app and need to measure time between frames to update logic. I get framerates around 150 so 16 ms granularity is not enough for me.

diep wrote:

you do realize there is special ways to debug software code using the information the cpu has? That is nanosecond accurate, same trick like kernel is using for GetTickCount()

Yes, you probably mean timestamp counter on pentium and later. There are also hardware debug registers to break at a certain execution address or I/O port access, memory R/W access and so on. Already a 386 could do that.

diep wrote:

however many cpu's nowadays throttle and turboboost and whatever, so i always prefer measurements of several seconds of whatever.

I agree.


Maybe ask in graphics group if you want such turbo framerate in OpenGL.

Hopefully your users are all on some new sort of AICAR drugs, the new undetectable form of EPO, gonna be a problem in London. Your users really need some superdrug like that if they want a chance of keeping up with a 150 frames per second game.
All times are GMT
Page 1 of 5

 