microsecond-accurate timing on Windows

Discussion of chess software programming and technical issues.

Moderator: Ras

diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: microsecond-accurate timing on Windows

Post by diep »

mar wrote:
diep wrote: Let me write it again. It has an internal granularity of under a nanosecond, which then gets converted. So if it says something has been eating 3 milliseconds, it can be anywhere from 3.000 to 3.999, but not 4 milliseconds and definitely not 2.1 milliseconds.
It depends on the system you use, though I doubt you get 1 ms granularity with GetTickCount().

Here's what I get on XP using GetTickCount():
delta = 0 ms
delta = 16 ms
delta = 31 ms
delta = 47 ms
delta = 63 ms
...

and using timeGetTime():

delta = 0 ms
delta = 1 ms
delta = 2 ms
delta = 3 ms
delta = 4 ms
...

In other words, anything that takes less than 16 msec, measured using GetTickCount(), will report as either 0 ms or 16 ms.
Yes, I think this is not the correct function call for what you want to achieve.

You are probably measuring the wrong thing: run-queue latency. In theory it fires at 100 Hz, so the minimum delay would be 10 milliseconds. In reality, with 2 CPUs here, that kernel is a lot slower, of course. We see that if we do a Sleep(0) it's soon 31-47 milliseconds, not 10 milliseconds at all :)

The kernel is obviously running at 16 ms granularity, as you call it. It's just antique shjt.

However, we get zero latency from the timer call itself, of course. Just don't flood it; the kernel might idle you or push you away, or whatever, I don't know. Is XP still supported by Microsoft, by the way?

But to put it precisely: each time the kernel wakes up and checks the run queue, it updates the counter that GetTickCount() reads. Is that a fair way to say it?

So the call is very cheap and the timing is very accurate, just not at the microsecond level you want it to be. So you surely need a different function call for what you want :)
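
For actual microseconds on Windows, the usual route is QueryPerformanceCounter() paired with QueryPerformanceFrequency(); a minimal sketch (the Sleep(5) is just a stand-in for whatever gets measured):

Code: Select all

#include <windows.h>
#include <stdio.h>

int main(void) {
  LARGE_INTEGER freq, t1, t2;
  long long usec;

  QueryPerformanceFrequency(&freq);   /* ticks per second, fixed at boot */

  QueryPerformanceCounter(&t1);
  Sleep(5);                           /* something to measure */
  QueryPerformanceCounter(&t2);

  /* multiply before dividing to keep the precision */
  usec = (t2.QuadPart - t1.QuadPart) * 1000000LL / freq.QuadPart;
  printf("elapsed = %lld microseconds\n", usec);
  return 0;
}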

If you do two measurements shortly after each other, without flooding the kernel, you get real timing. See here: zero latency between each pair of calls:

Code: Select all

#include <windows.h>
#include <stdio.h>

int main(void) {
  unsigned int Tstart = (unsigned int)GetTickCount(), Tprev = Tstart;
  int i = 0;

  while( i < 50 ) {
    unsigned int t1, t2, delta, absdelta;

    t1       = (unsigned int)GetTickCount();
    t2       = (unsigned int)GetTickCount();
    delta    = t2 - t1;        /* latency between two back-to-back reads */
    absdelta = t1 - Tstart;    /* time elapsed since program start */

    printf("t2-t1 = %u  absolute time +%u = %u\n",
     delta, t1 - Tprev, absdelta);
    Sleep(0); /* yield the rest of our timeslice so we don't flood the
                 kernel with requests, or it might idle us for a while */
    i++;
    Tprev = t1;
  }
  return 0;
}
/* Output:

C:\test>ticktock
t2-t1 = 0 absolute time +0 = 0
t2-t1 = 0 absolute time +46 = 46
t2-t1 = 0 absolute time +32 = 78
t2-t1 = 0 absolute time +46 = 124
t2-t1 = 0 absolute time +32 = 156
t2-t1 = 0 absolute time +46 = 202
t2-t1 = 0 absolute time +32 = 234
t2-t1 = 0 absolute time +46 = 280
t2-t1 = 0 absolute time +32 = 312
t2-t1 = 0 absolute time +31 = 343
t2-t1 = 0 absolute time +47 = 390
t2-t1 = 0 absolute time +31 = 421
t2-t1 = 0 absolute time +31 = 452
t2-t1 = 0 absolute time +47 = 499
t2-t1 = 0 absolute time +31 = 530
t2-t1 = 0 absolute time +31 = 561
t2-t1 = 0 absolute time +47 = 608
t2-t1 = 0 absolute time +31 = 639
t2-t1 = 0 absolute time +31 = 670
t2-t1 = 0 absolute time +47 = 717
t2-t1 = 0 absolute time +31 = 748
t2-t1 = 0 absolute time +32 = 780
t2-t1 = 0 absolute time +46 = 826
t2-t1 = 0 absolute time +32 = 858
t2-t1 = 0 absolute time +31 = 889
t2-t1 = 0 absolute time +47 = 936
t2-t1 = 0 absolute time +31 = 967
t2-t1 = 0 absolute time +31 = 998
t2-t1 = 0 absolute time +47 = 1045
t2-t1 = 0 absolute time +31 = 1076
t2-t1 = 0 absolute time +31 = 1107
t2-t1 = 0 absolute time +47 = 1154
t2-t1 = 0 absolute time +31 = 1185
t2-t1 = 0 absolute time +31 = 1216
t2-t1 = 0 absolute time +32 = 1248
t2-t1 = 0 absolute time +46 = 1294
t2-t1 = 0 absolute time +32 = 1326
t2-t1 = 0 absolute time +31 = 1357
t2-t1 = 0 absolute time +47 = 1404
t2-t1 = 0 absolute time +31 = 1435
t2-t1 = 0 absolute time +31 = 1466
t2-t1 = 0 absolute time +47 = 1513
t2-t1 = 0 absolute time +31 = 1544
t2-t1 = 0 absolute time +31 = 1575
t2-t1 = 0 absolute time +47 = 1622
t2-t1 = 0 absolute time +31 = 1653
t2-t1 = 0 absolute time +31 = 1684
t2-t1 = 0 absolute time +47 = 1731
t2-t1 = 0 absolute time +31 = 1762
t2-t1 = 0 absolute time +32 = 1794

*/
Last edited by diep on Mon May 28, 2012 9:14 pm, edited 1 time in total.
mar
Posts: 2659
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: microsecond-accurate timing on Windows

Post by mar »

diep wrote: Maybe ask in a graphics group if you want such a turbo framerate in OpenGL.

Hopefully your users are all on some new sort of AICAR drug, the new undetectable form of EPO that's going to be a problem in London. Your users really need a superdrug like that if they want to stand a chance of keeping up with a 150 frames per second game.
LOL :) No need for drugs. It depends on how fast stuff moves, not on the framerate.

The problem is that with vsync on, say the monitor does 75 Hz, but with 16 ms granularity I measure 63 Hz. One frame then measures as 0 ms and another as 32 ms (75 Hz is 13.3 ms per frame, which a 15-16 ms clock simply can't resolve), so I get a nonuniformly distributed logic update rate of about 32 Hz, which certainly doesn't look smooth. But I agree that we're a bit off topic here. timeGetTime() did the trick. Thanks for your answers anyway.
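
For the record, a minimal sketch of that kind of timeGetTime()-based frame timing (update_logic() and render_frame() are hypothetical placeholders; link with winmm.lib):

Code: Select all

#include <windows.h>
#include <mmsystem.h>   /* timeGetTime/timeBeginPeriod */

void game_loop(void) {
  unsigned int prev, now;
  int frame;
  float dt;

  timeBeginPeriod(1);              /* ~1 ms timeGetTime() resolution */
  prev = (unsigned int)timeGetTime();

  for( frame = 0; frame < 1000; frame++ ) {
    now  = (unsigned int)timeGetTime();
    dt   = (now - prev) * 0.001f;  /* seconds since the previous frame */
    prev = now;

    /* update_logic(dt);  -- hypothetical: advance game state by dt */
    /* render_frame();    -- hypothetical: draw, then swap (vsync)  */
  }
  timeEndPeriod(1);
}

int main(void) { game_loop(); return 0; }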
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: microsecond-accurate timing on Windows

Post by diep »

mar wrote: [snip]

The problem is that with vsync on, say the monitor does 75 Hz, but with 16 ms granularity I measure 63 Hz. One frame then measures as 0 ms and another as 32 ms, so I get a nonuniformly distributed logic update rate of about 32 Hz, which certainly doesn't look smooth.
Yes, but in graphics they use another trick for this: render everything out of the user's sight, then put the entire screen onto the visible display at once. Then the eyes have no problem with a torn screen.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: microsecond-accurate timing on Windows

Post by diep »

mar wrote: [snip]

timeGetTime() did the trick. Thanks for your answers anyway.
Are you sure about timeGetTime()?

It behaves 100% the same as GetTickCount() here.
Look:

Code: Select all

#include <windows.h>
#include <mmsystem.h>  /* timeBeginPeriod/timeGetTime; link with winmm.lib */
#include <stdio.h>

int main(void) {
  unsigned long long nattempts = 0ULL;
  unsigned int x, i = 0;

  /* ask the system for 1 ms timer resolution */
  if( timeBeginPeriod(1) == TIMERR_NOERROR )
    printf("set timer\n");
  else
    printf("error setting timer\n");

  x = (unsigned int)timeGetTime();   /* or GetTickCount() */
  while( i < 50 ) {
    unsigned int newx = (unsigned int)timeGetTime();
    nattempts++;
    if( newx != x ) {                /* report each time the tick steps */
      unsigned int delta = newx - x;
      printf(" delta = %u totalreads = %llu\n",
       delta, nattempts);
      x = newx;
      i++;
    }
  }

  if( timeEndPeriod(1) == TIMERR_NOERROR ) printf("ended timer\n");

  return 0;
}
/*

C:\test>ticktock
set timer
delta = 1 totalreads = 6731
delta = 35 totalreads = 6732
delta = 36 totalreads = 6733
delta = 36 totalreads = 6734
delta = 36 totalreads = 6735
delta = 36 totalreads = 6736
delta = 37 totalreads = 6737
delta = 36 totalreads = 6738
delta = 36 totalreads = 6739
delta = 36 totalreads = 6740
delta = 36 totalreads = 6741
delta = 36 totalreads = 6742
delta = 38 totalreads = 6743
delta = 36 totalreads = 6744
delta = 36 totalreads = 6745
delta = 36 totalreads = 6746
delta = 36 totalreads = 6747
delta = 37 totalreads = 6748
delta = 36 totalreads = 6749
delta = 36 totalreads = 6750
delta = 36 totalreads = 6751
delta = 36 totalreads = 6752
delta = 36 totalreads = 6753
delta = 37 totalreads = 6754
delta = 35 totalreads = 6755
delta = 36 totalreads = 6756
delta = 36 totalreads = 6757
delta = 36 totalreads = 6758
delta = 36 totalreads = 6759
delta = 37 totalreads = 6760
delta = 36 totalreads = 6761
delta = 36 totalreads = 6762
delta = 36 totalreads = 6763
delta = 36 totalreads = 6764
delta = 36 totalreads = 6765
delta = 36 totalreads = 6766
delta = 36 totalreads = 6767
delta = 36 totalreads = 6768
delta = 36 totalreads = 6769
delta = 36 totalreads = 6770
delta = 36 totalreads = 6771
delta = 37 totalreads = 6772
delta = 36 totalreads = 6773
delta = 36 totalreads = 6774
delta = 36 totalreads = 6775
delta = 36 totalreads = 6776
delta = 36 totalreads = 6777
delta = 36 totalreads = 6778
delta = 36 totalreads = 6779
delta = 36 totalreads = 6780
ended timer

C:\test>

*/
mar
Posts: 2659
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: microsecond-accurate timing on Windows

Post by mar »

diep wrote: Are you sure about timeGetTime()?

It behaves 100% the same as GetTickCount() here.
Look: [snip code]
That looks strange indeed. I see you even removed the Sleep.
May I ask what hardware and OS you are using?

What bothers me are the first two lines of your output
diep wrote: delta = 1 totalreads = 6731
delta = 35 totalreads = 6732
The first line looks OK; after that it seems the system was flooded and couldn't handle it, because from then on it's 35 msec per read. So a possible solution would be to try a higher number than 1; perhaps 5 might work. I have read that timeBeginPeriod() affects the task scheduler.
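
Something like this, a sketch of the suggested variant (note that the value passed to timeEndPeriod() must match the one given to timeBeginPeriod()):

Code: Select all

/* ask for 5 ms resolution instead of 1 ms */
if( timeBeginPeriod(5) == TIMERR_NOERROR ) {
  /* ... same measuring loop as above ... */
  timeEndPeriod(5);   /* must be called with the same value */
}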

Output of your code on my machine:

set timer
delta = 15 totalreads = 552126
delta = 16 totalreads = 574114
delta = 1 totalreads = 621069
delta = 1 totalreads = 673059
delta = 1 totalreads = 724305
delta = 1 totalreads = 775589
delta = 1 totalreads = 827283
delta = 1 totalreads = 878988
delta = 1 totalreads = 930644
delta = 1 totalreads = 982389
delta = 1 totalreads = 1034043
delta = 1 totalreads = 1085748
delta = 1 totalreads = 1188198
delta = 1 totalreads = 1239628
delta = 1 totalreads = 1291261
delta = 1 totalreads = 1342950
delta = 1 totalreads = 1394656
delta = 1 totalreads = 1446298
delta = 1 totalreads = 1497668
delta = 1 totalreads = 1549352
delta = 1 totalreads = 1601062
delta = 1 totalreads = 1652708
delta = 1 totalreads = 1704386
delta = 1 totalreads = 1754979
delta = 1 totalreads = 1806627
delta = 1 totalreads = 1858327
delta = 1 totalreads = 1910033
delta = 1 totalreads = 1961680
delta = 1 totalreads = 2013385
delta = 1 totalreads = 2065071
delta = 1 totalreads = 2114332
delta = 1 totalreads = 2165828
delta = 1 totalreads = 2217913
delta = 1 totalreads = 2269112
delta = 1 totalreads = 2320699
delta = 1 totalreads = 2366978
delta = 1 totalreads = 2419002
delta = 1 totalreads = 2471064
delta = 1 totalreads = 2523128
delta = 1 totalreads = 2574522
delta = 1 totalreads = 2626588
delta = 1 totalreads = 2678655
delta = 1 totalreads = 2730723
delta = 1 totalreads = 2782369
delta = 1 totalreads = 2833792
delta = 1 totalreads = 2884875
delta = 1 totalreads = 2936605
delta = 1 totalreads = 2987862
delta = 1 totalreads = 3039198
delta = 1 totalreads = 3090884
ended timer

It means it takes some time before timeBeginPeriod() takes effect. Shouldn't be an issue, though.
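
If that startup lag ever matters, one workaround (a sketch, not verified across Windows versions) is to give the scheduler a moment to pick up the new resolution before measuring:

Code: Select all

timeBeginPeriod(1);
Sleep(32);   /* roughly two of the old 15-16 ms ticks, so the 1 ms
                resolution should be in effect before we measure */
/* ... measurements ... */
timeEndPeriod(1);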
mar
Posts: 2659
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: microsecond-accurate timing on Windows

Post by mar »

diep wrote: Yes, but in graphics they use another trick for this: render everything out of the user's sight, then put the entire screen onto the visible display at once. Then the eyes have no problem with a torn screen.
Yes, that's double or triple buffering. And if vertical synchronization is on, no tearing occurs when flipping (one of) the backbuffers. The only problem occurs if you can't render the frame in time. Say the monitor refresh rate is 60 Hz: if you can supply frames only 59 times a second, then with vsync enabled your framerate suddenly drops to half, because you keep missing a vsync. And while 30 fps is perfectly playable, it's not smooth.
So to improve performance, vsync is sometimes turned off. But then the tearing effect you described appears, because you flip the buffers somewhere in the middle of the refresh. So without vsync one gets 59 fps instead of 30, but the eyes will hurt.
I once almost fried a GPU when I was getting over 1000 fps without vsync; fortunately I heard it squealing, so an Alt+F4 followed immediately.
Since then I have added a framerate limiter.
So vsync not only avoids tearing but also keeps the GPU from being fried :)
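
A minimal sketch of such a limiter (TARGET_FPS and the render_frame() placeholder are hypothetical; assumes timeBeginPeriod(1) so Sleep() granularity is about 1 ms):

Code: Select all

#include <windows.h>
#include <mmsystem.h>   /* timeGetTime/timeBeginPeriod; link with winmm.lib */

#define TARGET_FPS 100  /* hypothetical cap */

int main(void) {
  const unsigned int frame_ms = 1000 / TARGET_FPS;
  unsigned int next, now;
  int frame;

  timeBeginPeriod(1);   /* ~1 ms Sleep()/timeGetTime() granularity */
  next = (unsigned int)timeGetTime();

  for( frame = 0; frame < 500; frame++ ) {
    /* render_frame();  -- hypothetical: the real per-frame work */

    next += frame_ms;
    now = (unsigned int)timeGetTime();
    if( (int)(next - now) > 0 )
      Sleep(next - now);   /* idle out the rest of the frame budget */
    else
      next = now;          /* running late: don't accumulate sleep debt */
  }
  timeEndPeriod(1);
  return 0;
}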
Aleks Peshkov
Posts: 895
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia
Full name: Aleks Peshkov

Re: microsecond-accurate timing on Windows

Post by Aleks Peshkov »

Windows timers are not meant for video processing. Display lag.
mar
Posts: 2659
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: microsecond-accurate timing on Windows

Post by mar »

Aleks Peshkov wrote: Windows timers are not meant for video processing. Display lag.
You need to measure time between frames to update game logic. LCD display lag is totally irrelevant here.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: microsecond-accurate timing on Windows

Post by diep »

Hi, I tested it on Windows Vista Ultimate, installed on a dual Opteron dual-core 2.4 GHz, which is currently my development machine, though I intend to switch to one of the newer Xeon systems I have here.

So the most important thing: it's a 2-CPU system. Whether it's the Opteron or one of the many Xeon systems of the cluster here is not so relevant. Same problem.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: microsecond-accurate timing on Windows

Post by diep »

mar wrote: [snip]
It means it takes some time before timeBeginPeriod() takes effect. Shouldn't be an issue, though.
Yes, I already suspected you would see this on your XP machine. Remember: XP is no longer supported. To some extent it is still DOS.

So whatever you do based upon measurements on your box won't work on anyone else's machine, except when they're on a single CPU and XP.

Don't program like that :)

I have seen so much software that worked 'ok' on a single-CPU machine and bugs out on 2-CPU machines. That's not only with Vista; I saw it with NT 4.0 and I saw it with Win2000 as well :)

Basically all my PCs have multiple CPUs, of course, as that's always faster for chess. My new system is an 8-node cluster of Xeon L5420s. That's 2 CPUs in each node, 64 cores in total.

You can pick up such nodes on eBay for dirt-cheap prices, which is what I did: $200 or so a machine. No $200 machine can rival their speed...
(so building a 64-core cluster is CHEAP if you know how to program)

Behaviour is *always* different from a single CPU, though in this case I don't know whether single-CPU Vista shows the same behaviour. Maybe ask someone here who has a single-CPU system and Vista/Win7...

Making mass-market software simply is a lot harder than custom-built software for clients, because mass-market software has to run everywhere :)

Oh, as for the operating systems: if you dive deeper into it, both the Linux and the Windows kernels are SO OUTDATED. Basically they are single-CPU designs. Just total antique shjt.