perft and nps speed

Engin
Posts: 918
Joined: Mon Jan 05, 2009 7:40 pm
Location: Germany
Full name: Engin Üstün

perft and nps speed

Post by Engin »

When I run perft on my engine, it reaches nearly 12 million nps!

But why is the normal search so much slower?

Which functions cause such a slowdown?

So I wrote a "search perft" that does nearly the same things as the search, but without null move, hash, pruning, internal iterative deepening, or qsearch.

It gets the same node counts, so it is correct, and it is still fast at nearly 2 million nps.

That is 2x faster than the normal search.

I see that Crafty reaches about 2 million nps with only 1 CPU, while my engine manages 1 million nps at most.

Should I try magic bitboards?

I heard that Crafty has been using magic bitboards since version 23.0.

Or are there other tricks to make bitboards 2x faster?
rvida
Posts: 481
Joined: Thu Apr 16, 2009 12:00 pm
Location: Slovakia, EU

Re: perft and nps speed

Post by rvida »

Engin wrote:
I see that Crafty reaches about 2 million nps with only 1 CPU, while my engine manages 1 million nps at most.

Should I try magic bitboards?

I heard that Crafty has been using magic bitboards since version 23.0.
IIRC Robert mentioned that he saw no performance increase from implementing magic bitboards in Crafty; they are just more elegant and easier to use (no need to maintain rotated bitboards in make() and unmake()).

Hash table lookup is a good example of a potential performance bottleneck: a probe is almost always a cache miss, and often a TLB miss as well, both of which are very costly. If you are using a 4-entry hash cluster, make sure that all 4 entries are in one cache line. Also, not probing in QS helps NPS, at the cost of somewhat larger trees.
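
The cache-line advice above can be sketched in C roughly as follows. This is only an illustration, not the layout of Crafty or any other engine: the names TTEntry, TTCluster, tt_alloc and tt_probe, the field widths, and the 64-byte line size are all assumptions, and aligned_alloc requires a C11 library.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical 16-byte entry; field names and widths are illustrative. */
typedef struct {
    uint64_t key;    /* full hash key of the position */
    uint16_t move;   /* best move, packed */
    int16_t  score;  /* search score */
    int16_t  eval;   /* static evaluation */
    uint8_t  depth;  /* remaining depth the score was searched to */
    uint8_t  flags;  /* bound type, age, etc. */
} TTEntry;

/* Four entries, aligned to an ASSUMED 64-byte cache line. */
typedef struct {
    alignas(64) TTEntry entry[4];
} TTCluster;

/* Fail the build if the layout ever stops fitting one line. */
static_assert(sizeof(TTEntry) == 16, "entry must be 16 bytes");
static_assert(sizeof(TTCluster) == 64, "cluster must fill one cache line");

/* Allocate a zeroed table whose clusters start on 64-byte boundaries. */
TTCluster *tt_alloc(size_t clusters)
{
    TTCluster *table = aligned_alloc(64, clusters * sizeof(TTCluster));
    if (table != NULL)
        memset(table, 0, clusters * sizeof(TTCluster));
    return table;
}

/* Probe all four slots of one cluster; they share a single cache line,
   so at most one line has to be fetched from memory per probe.
   Note: in this sketch key 0 is indistinguishable from an empty slot. */
TTEntry *tt_probe(TTCluster *table, size_t clusters, uint64_t key)
{
    TTCluster *c = &table[key % clusters];
    for (int i = 0; i < 4; i++) {
        if (c->entry[i].key == key)
            return &c->entry[i];
    }
    return NULL;
}
```

The static_assert lines are the point of the sketch: if a field is added and the entry grows past 16 bytes, the cluster would silently straddle two lines, so it is safer to have the compiler reject that.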
Engin
Posts: 918
Joined: Mon Jan 05, 2009 7:40 pm
Location: Germany
Full name: Engin Üstün

Re: perft and nps speed

Post by Engin »

I found that the eval takes most of the time in my sperft!

I removed some code from the eval and it got faster!

So the problem should be in the evaluation?
Engin
Posts: 918
Joined: Mon Jan 05, 2009 7:40 pm
Location: Germany
Full name: Engin Üstün

Re: perft and nps speed

Post by Engin »

rvida wrote:
Engin wrote:
I see that Crafty reaches about 2 million nps with only 1 CPU, while my engine manages 1 million nps at most.

Should I try magic bitboards?

I heard that Crafty has been using magic bitboards since version 23.0.
IIRC Robert mentioned that he saw no performance increase from implementing magic bitboards in Crafty; they are just more elegant and easier to use (no need to maintain rotated bitboards in make() and unmake()).

Hash table lookup is a good example of a potential performance bottleneck: a probe is almost always a cache miss, and often a TLB miss as well, both of which are very costly. If you are using a 4-entry hash cluster, make sure that all 4 entries are in one cache line. Also, not probing in QS helps NPS, at the cost of somewhat larger trees.
Yes, it's true: if I do not use the hash in quiesce, I get more speed.

But why does the Stockfish code do a hash lookup in quiesce?
tvrzsky
Posts: 128
Joined: Sat Sep 23, 2006 7:10 pm
Location: Prague

Re: perft and nps speed

Post by tvrzsky »

Engin wrote:I found that the eval takes most of the time in my sperft!

I removed some code from the eval and it got faster!

So the problem should be in the evaluation?
For me the evaluation is definitely the most time-consuming part of the search, varying from (I do not remember the exact numbers) about 25% (in wild positions with a lot of material imbalance, where I do not do a complete evaluation) to 70% (in quiet ones). So you should look for ways to evaluate as little as possible.
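
One common way to "not do a complete evaluation" is a lazy eval cutoff. The sketch below is a hypothetical C illustration, not the code of any engine in this thread: Position, expensive_terms, LAZY_MARGIN and the counter full_eval_calls are made-up names, and the margin of 300 centipawns is an assumption. The shortcut is only safe if the expensive terms can never add up to more than the margin.

```c
/* ASSUMED margin in centipawns; must bound the sum of the slow terms. */
#define LAZY_MARGIN 300

/* Toy position: the two fields stand in for real evaluation work. */
typedef struct {
    int material;    /* cheap part: material + piece-square score,
                        from the side to move's point of view */
    int positional;  /* stand-in for slow terms (mobility, king safety...) */
} Position;

/* Counts how often the slow path runs, for measurement only. */
int full_eval_calls = 0;

/* Placeholder for the expensive part of the evaluation. */
int expensive_terms(const Position *p)
{
    full_eval_calls++;
    return p->positional;
}

/* Lazy evaluation: if the cheap score is already so far outside the
   (alpha, beta) window that the slow terms cannot bring it back,
   return it immediately and skip the expensive work. */
int evaluate(const Position *p, int alpha, int beta)
{
    int score = p->material;

    if (score - LAZY_MARGIN >= beta)   /* fails high regardless */
        return score;
    if (score + LAZY_MARGIN <= alpha)  /* fails low regardless */
        return score;

    return score + expensive_terms(p);
}
```

With a large material imbalance the cheap score usually lands far outside the window, so the slow terms are skipped; in quiet positions it rarely does, which fits the 25% versus 70% split described above.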
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: perft and nps speed

Post by bob »

Engin wrote:When I run perft on my engine, it reaches nearly 12 million nps!

But why is the normal search so much slower?

Which functions cause such a slowdown?

So I wrote a "search perft" that does nearly the same things as the search, but without null move, hash, pruning, internal iterative deepening, or qsearch.

It gets the same node counts, so it is correct, and it is still fast at nearly 2 million nps.

That is 2x faster than the normal search.

I see that Crafty reaches about 2 million nps with only 1 CPU, while my engine manages 1 million nps at most.

Should I try magic bitboards?

I heard that Crafty has been using magic bitboards since version 23.0.

Or are there other tricks to make bitboards 2x faster?
Crafty has been using them for quite a while, 2 years or so if I recall, so prior to 23.0. But what is really important is that you use a good profiler to see where you are spending your time. There is little to be gained by guessing what is slowing you down and trying to speed that up. Learn how to use your profiler, find out exactly where your time is going, and then speed up the biggest time-user...
Engin
Posts: 918
Joined: Mon Jan 05, 2009 7:40 pm
Location: Germany
Full name: Engin Üstün

Re: perft and nps speed

Post by Engin »

Sure, it is the eval that costs the most speed,
and I use lazy eval.

Now my Tornado runs at about 1000 kn/s with 1 CPU :)

If I can reach over 2000 kn/s, like Crafty's performance, then I am happy...

I know that speed is not everything for strength, but I think having both speed and chess knowledge is better.

Next I will measure how much the hash tables slow things down; then maybe I can make that faster.

I do not return the hash score, I am only testing the save/lookup speed.
Engin
Posts: 918
Joined: Mon Jan 05, 2009 7:40 pm
Location: Germany
Full name: Engin Üstün

Re: perft and nps speed

Post by Engin »

Hi Robert!

Many thanks for your hints :)

I don't have a profiler; I compare the speed every time I change something,
and of course I play some games to see the gain in strength.

I really respect the speed Crafty has with only 1 CPU; I am afraid of what it does with more threads :(.

But in some test matches Tornado beats Crafty :) in blitz games with 1 CPU.

I think Tornado is now only about 40 points behind Crafty.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: perft and nps speed

Post by bob »

Engin wrote:Hi Robert!

Many thanks for your hints :)

I don't have a profiler; I compare the speed every time I change something,
and of course I play some games to see the gain in strength.

I really respect the speed Crafty has with only 1 CPU; I am afraid of what it does with more threads :(.

But in some test matches Tornado beats Crafty :) in blitz games with 1 CPU.

I think Tornado is now only about 40 points behind Crafty.
If you are going to worry about speed, you _must_ have a profiler. Otherwise, how can you possibly tell where to start with your optimization efforts? You have to start on the piece of code that is burning the most CPU cycles. How can you identify that code without a profiler? I've not seen a C compiler that doesn't have this option included. But you can always download gcc or the windows version of it and use it for profiling, even if you don't want to use it for the "production compile". It is an essential tool for optimizing code to run fast.
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: perft and nps speed

Post by Dann Corbit »

Engin wrote:Hi Robert!

Many thanks for your hints :)

I don't have a profiler; I compare the speed every time I change something,
and of course I play some games to see the gain in strength.

I really respect the speed Crafty has with only 1 CPU; I am afraid of what it does with more threads :(.

But in some test matches Tornado beats Crafty :) in blitz games with 1 CPU.

I think Tornado is now only about 40 points behind Crafty.
You could use gprof (free) to find the hot spots. I guess that they won't change very much from compiler to compiler.
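
For anyone who has not used gprof before, a rough sketch of the workflow with gcc is below. The file name engine.c and the functions cheap_step, costly_step and run_workload are hypothetical stand-ins for real engine code, and the commands in the comment assume a gcc toolchain with gprof installed. Call run_workload() from your own main() to try it.

```c
/*
 * Typical gprof workflow (file names are placeholders):
 *
 *   gcc -O2 -pg engine.c -o engine    # build with profiling hooks
 *   ./engine                          # normal run; writes gmon.out on exit
 *   gprof ./engine gmon.out > profile.txt
 *
 * At -O2 small functions may be inlined and vanish from the report;
 * adding -fno-inline for the profiling build is one way around that.
 */
#include <stdint.h>

/* Stand-in for a cheap routine, e.g. generating one move. */
uint64_t cheap_step(uint64_t x)
{
    return x * 6364136223846793005ULL + 1442695040888963407ULL;
}

/* Stand-in for an expensive routine, e.g. a full evaluation. */
uint64_t costly_step(uint64_t x)
{
    for (int i = 0; i < 64; i++) {
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
    }
    return x;
}

/* Toy "search": calls both routines once per iteration. In the flat
   profile, costly_step should account for most of the time. */
uint64_t run_workload(uint64_t iterations)
{
    uint64_t acc = 1;
    for (uint64_t i = 0; i < iterations; i++) {
        acc = cheap_step(acc);
        acc ^= costly_step(acc);
    }
    return acc;
}
```

The flat profile at the top of profile.txt lists functions sorted by their share of the run time, which is the "biggest time-user" list bob is asking for.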