Accessing memory


Moderator: Ras

Kempelen
Posts: 620
Joined: Fri Feb 08, 2008 10:44 am
Location: Madrid - Spain

Accessing memory

Post by Kempelen »

Hi,
I have read that global variables are slow to access because the processor doesn't store them in the L2 cache. I have been thinking about it, but I don't know exactly how to deal with this issue. Most web pages on optimization techniques that I have seen only give tips for saving cycles in code, but none talk about accessing memory efficiently. I think this problem applies not only to global variables, but to all data.
How can I address this problem? What are faster ways to access data?
thx.
Best regards,
FS
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Accessing memory

Post by hgm »

It seems you have been reading nonsense. Every memory access is cached in L2.
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: Accessing memory

Post by wgarvin »

Kempelen wrote:Hi,
I have read that global variables are slow to access because the processor doesn't store them in the L2 cache. I have been thinking about it, but I don't know exactly how to deal with this issue. Most web pages on optimization techniques that I have seen only give tips for saving cycles in code, but none talk about accessing memory efficiently. I think this problem applies not only to global variables, but to all data.
How can I address this problem? What are faster ways to access data?
thx.
Best regards,
FS
(1) If you know the address you want to read from well in advance of actually needing the results of the read, you can issue a "prefetch" instruction, which is just like a read, except nothing bad happens if it fails (e.g. if you try to prefetch from an invalid address, it will silently do nothing instead of crashing your program). For x86, you can use a compiler intrinsic like "_mm_prefetch" to generate such a prefetch.

(2) I think accessing large amounts of data sequentially will trigger automatic prefetching on x86-based chips.

(3) If the addresses you need to read from are not known until right when you want to do the read, you can ask yourself if they are likely to hit the cache or miss. If misses are likely, you could try to issue a prefetch instruction and then do some other computations before doing the actual read (thus "hiding the latency" of the miss while it is being serviced, i.e. you are doing something useful during that time instead of waiting and doing nothing).

In more detail... L1 d-cache is probably 32 KB or 64 KB; L2 cache is probably 512 KB or 1 MB. So if you are doing hundreds of accesses to a very small table (1-4 KB) there is a good chance the whole table will be in the L1 cache by the time you're done. If you're doing thousands of accesses to a 10 KB table, same thing. If you're doing thousands of accesses to a 200 KB table, it will not all fit in the L1 cache but it will probably be mostly or entirely in the L2 cache by the time you're done.
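If you want to see the effect of table size on your own machine, something like the rough sketch below might help. It does a fixed number of random reads into a table of a given size and reports the elapsed time. The names (probe_table, next_rand) are made up for this example, clock() is a coarse timer, and independent random reads can overlap in the memory system, so treat the numbers as a ballpark comparison between sizes rather than a precise latency measurement.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Small xorshift64 generator so the access pattern is reproducible. */
static uint64_t next_rand(uint64_t *state) {
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/* Do `accesses` random reads in a table of `bytes` bytes.
 * Returns a checksum of the values read (so the compiler cannot drop the
 * loads) and stores the elapsed CPU time in *seconds if it is non-NULL. */
static uint64_t probe_table(size_t bytes, size_t accesses, double *seconds) {
    size_t n = bytes / sizeof(uint64_t);
    assert(n > 0);

    uint64_t *table = malloc(n * sizeof(uint64_t));
    if (table == NULL) {
        if (seconds) *seconds = 0.0;
        return 0;
    }
    /* Touch every element once so the pages are really mapped. */
    for (size_t i = 0; i < n; i++)
        table[i] = i;

    uint64_t seed = 88172645463325252ULL;
    uint64_t sum = 0;
    clock_t t0 = clock();
    for (size_t i = 0; i < accesses; i++)
        sum += table[next_rand(&seed) % n];
    clock_t t1 = clock();

    free(table);
    if (seconds) *seconds = (double)(t1 - t0) / CLOCKS_PER_SEC;
    return sum;
}
```

Calling it with the same number of accesses for, say, 4 KB, 200 KB and 100 MB tables and comparing the times should show roughly where your L1 and L2 sizes stop helping.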

On the other hand, if you're doing lots of random accesses to a *large* table (such as a 100+ MB transposition table), you're basically going to get an L2 cache miss 99% of the time. Knowing that, you could change your TT access to "prefetch the address of the entry; do some other stuff; now actually read the entry". As I recall Gerd has done this in his engine, moving some move generation computations in there. You'd have to experiment to find the right amount of work to do between the prefetch and the read, but it is probably a couple hundred cycles' worth?
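A minimal sketch of the "prefetch; do other stuff; read" idea for a transposition table could look like the code below. This is not Gerd's code or any particular engine's. The TTEntry layout, the power-of-two indexing and all the function names are assumptions made up for illustration. On x86 it uses _mm_prefetch, and on other GCC/Clang targets it falls back to __builtin_prefetch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>
#define TT_PREFETCH(p) _mm_prefetch((const char *)(p), _MM_HINT_T0)
#elif defined(__GNUC__)
#define TT_PREFETCH(p) __builtin_prefetch((p))
#else
#define TT_PREFETCH(p) ((void)0)
#endif

/* Hypothetical entry layout, only for this example. */
typedef struct {
    uint64_t key;
    int16_t  score;
    uint8_t  depth;
} TTEntry;

typedef struct {
    TTEntry *entries;
    size_t   mask;      /* entry count - 1; the count is a power of two */
} TTable;

static int tt_init(TTable *tt, size_t count) {
    assert(count != 0 && (count & (count - 1)) == 0);
    tt->entries = calloc(count, sizeof(TTEntry));
    tt->mask = count - 1;
    return tt->entries != NULL;
}

/* Start pulling the entry's cache line in; does not read the entry itself. */
static void tt_prefetch(const TTable *tt, uint64_t key) {
    TT_PREFETCH(&tt->entries[key & tt->mask]);
}

/* The actual read: returns the entry if the stored key matches, else NULL. */
static const TTEntry *tt_probe(const TTable *tt, uint64_t key) {
    const TTEntry *e = &tt->entries[key & tt->mask];
    return e->key == key ? e : NULL;
}

static void tt_store(TTable *tt, uint64_t key, int16_t score, uint8_t depth) {
    TTEntry *e = &tt->entries[key & tt->mask];
    e->key = key;
    e->score = score;
    e->depth = depth;
}

/* Hypothetical use inside a search:
 *     tt_prefetch(&tt, child_key);     as soon as the hash key is known
 *     ... generate moves, update state, other useful work ...
 *     const TTEntry *e = tt_probe(&tt, child_key);
 */
```

How much work to put between tt_prefetch and tt_probe is something you would have to measure. If there is too little, you still wait for the miss. If there is too much, the line may get evicted again before you use it.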
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Accessing memory

Post by bob »

Kempelen wrote:Hi,
I have read that global variables are slow to access because the processor doesn't store them in the L2 cache. I have been thinking about it, but I don't know exactly how to deal with this issue. Most web pages on optimization techniques that I have seen only give tips for saving cycles in code, but none talk about accessing memory efficiently. I think this problem applies not only to global variables, but to all data.
How can I address this problem? What are faster ways to access data?
thx.
Best regards,
FS
That's incorrect information. No idea where it came from but it is wrong. Global or local variables go through L1/L2 cache...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Accessing memory

Post by bob »

hgm wrote:It seems you have been reading nonsense. Every memory access is cached in L2.
Well, not quite "every" as caching can be selectively disabled... Has to be for devices that use memory-mapped I/O in fact. But for global/local variables his info is definitely bad...
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Accessing memory

Post by hgm »

Sure. But I do not consider memory-mapped devices 'memory', and even if we did, the OS would not give you access to them. The CPU has memory type range registers (MTRRs) that allow the OS to disable caching even on selected DRAM areas, but I would be surprised if it ever did so for any user-mapped memory.