More about NPS...

Kempelen
Posts: 620
Joined: Fri Feb 08, 2008 10:44 am
Location: Madrid - Spain

More about NPS...

Post by Kempelen »

This is a quote from Dr. Hyatt in another thread. The topic was NPS.

"It is very difficult to compare. Crafty, for example, on a single core-2 CPU at 2.0ghz runs about 2.5M nodes per second, but it can go up significantly in the right kinds of positions. It depends on how clever you were with data structures, how much (or how little) memory traffic you produce, how complex or simple your evaluation is, etc..."

I program my engine in C and know a fair amount about programming, code optimization and so on, but I don't know what makes a data structure good (in terms of speed), or how to generate low memory traffic. This sounds to me like machine-specific optimization. Are there rules of thumb for writing and using good structures and generating low traffic?
Fermin Serrano
Author of 'Rodin' engine
http://sites.google.com/site/clonfsp/
ldesnogu

Re: More about NPS...

Post by ldesnogu »

As far as "generating low traffic" goes, you need a clear understanding of how a cache works. Once you understand that, you can move on to TLB issues. And when you switch to multi-threading, you'll have to learn about cache line sharing and other things related to the various coherency protocols.

These things are not purely machine-dependent (though some "details", such as cache line size and cache size, obviously are).
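To make the cache line sharing point a bit more concrete for the multi-threaded case, here is a minimal C11 sketch of per-thread node counters padded so that no two threads write to the same line (the problem being avoided is usually called false sharing). It assumes 64-byte lines, and `PaddedCounter`, `MAX_THREADS`, `counters` and `total_nodes` are hypothetical names for illustration, not taken from any engine.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignas (C11) */
#include <stdint.h>

#define CACHE_LINE  64  /* assumption: 64-byte lines; adjust for your CPU */
#define MAX_THREADS 8   /* hypothetical thread limit */

/* Packed layout: only 8 bytes per counter, so an array of these puts
 * several threads' counters in the same cache line. Every increment by
 * one thread then invalidates that line in the other cores' caches. */
typedef struct {
    uint64_t nodes;
} PackedCounter;

/* Padded layout: alignas raises the struct's alignment, and therefore
 * its size, to one full line, so each thread writes to its own line. */
typedef struct {
    alignas(CACHE_LINE) uint64_t nodes;
} PaddedCounter;

static_assert(sizeof(PaddedCounter) == CACHE_LINE,
              "each padded counter should occupy exactly one line");

/* One slot per search thread; thread i only ever writes counters[i]. */
static PaddedCounter counters[MAX_THREADS];

/* Sum the per-thread counters, e.g. when reporting NPS. */
static uint64_t total_nodes(const PaddedCounter *c, int n)
{
    uint64_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += c[i].nodes;
    return sum;
}
```

The cost is memory: 64 bytes per counter instead of 8. For a handful of threads that is negligible, but the same padding applied to a large table would waste cache, which is the opposite of what you want.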

As Terje Mathisen has said for many years: "almost all programming can be viewed as an exercise in caching" :)
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: More about NPS...

Post by bob »

Kempelen wrote: This is a quote from Dr. Hyatt in another thread. The topic was NPS.

"It is very difficult to compare. Crafty, for example, on a single core-2 CPU at 2.0ghz runs about 2.5M nodes per second, but it can go up significantly in the right kinds of positions. It depends on how clever you were with data structures, how much (or how little) memory traffic you produce, how complex or simple your evaluation is, etc..."

I program my engine in C and know a fair amount about programming, code optimization and so on, but I don't know what makes a data structure good (in terms of speed), or how to generate low memory traffic. This sounds to me like machine-specific optimization. Are there rules of thumb for writing and using good structures and generating low traffic?
The rule of thumb is "temporal and spatial locality." If two variables are used close together in time (temporal locality), they should be located close together in memory (spatial locality). When you reference the first and the cache fetches the block of memory that contains it, the same fetch also brings in the second, because the cache fetches a whole line at a time: 64 bytes on most Intel/AMD processors, 128 on the PIV, or 256 on some architectures other than Intel/AMD. Accessing the second variable then has no measurable latency, since it has effectively been prefetched.
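As a rough illustration of that advice in C, here is a minimal sketch that places the fields used together at every node at the front of a struct, so a single line fill brings them all in. The `SearchState` type, its fields, and the `same_cache_line` helper are made up for the example (they are not from Crafty or any other engine), and the 64-byte line size is an assumption to adjust for your target CPU.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>

#define CACHE_LINE 64  /* assumption: 64-byte lines; adjust for your CPU */

/* Hypothetical per-position state. Fields read at almost every node
 * ("hot") are declared first and next to each other; rarely used
 * ("cold") fields follow, so they do not push hot data onto extra lines. */
typedef struct {
    /* hot: 8 + 8 + 16 + 4 + 4 = 40 bytes */
    uint64_t hash_key;
    uint64_t occupied;
    uint64_t pieces[2];
    int32_t  material;
    int32_t  side_to_move;
    /* cold: touched only when reporting or debugging */
    char     name[128];
    uint64_t nodes_reported;
} SearchState;

/* Returns 1 if the byte range [first_off, end_off) lies within a single
 * cache line, assuming the object itself starts on a line boundary. */
static int same_cache_line(size_t first_off, size_t end_off)
{
    return first_off / CACHE_LINE == (end_off - 1) / CACHE_LINE;
}

/* Fail the build if someone later grows the hot part past one line. */
static_assert(offsetof(SearchState, side_to_move) + sizeof(int32_t)
                  <= CACHE_LINE,
              "hot fields should fit in one cache line");
```

The check only reflects real cache lines if the object starts on a line boundary, for example when it is allocated with `aligned_alloc(64, ...)`. If the hot and cold fields were interleaved instead, the hot set could straddle two or three lines and every node would pay for the extra fetches.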