This idea popped into my ahead as I was waking up this morning. I got to wondering if stack memory alignment is an issue for chess programs?
BTW, I looked up this type of topic on the web and this seems to be an area of concern that includes methods that have been patented.
Stack Memory Alignment an Issue?
Moderator: Ras
-
- Posts: 348
- Joined: Sat Feb 27, 2010 12:21 am
Re: Stack Memory Alignment an Issue?
rjgibert wrote:This idea popped into my ahead as I was waking up this morning. I got to wondering if stack memory alignment is an issue for chess programs?
BTW, I looked up this type of topic on the web and this seems to be an area of concern that includes methods that have been patented.
It matters, depending on CPU. Note that your operating system may inject random run to run offsets that make it hard to measure the effect at all.

-
- Posts: 838
- Joined: Thu Jul 05, 2007 5:03 pm
- Location: British Columbia, Canada
Re: Stack Memory Alignment an Issue?
Not sure why it would do that (perhaps for some DEP/randomization/anti-stack-smashing reason?) You could counteract the effect by manually aligning your stack in your own function that gets called from main and runs your whole engine inside of it. E.g. you could align it on a 128-byte or 4096-byte boundary, if you wanted to. That code will be non-portable (probably needing inline assembly, etc.) but you could make it a part of your portability layer for each platform. If you spawn new threads, you might want to align their stacks as well.marcelk wrote:It matters, depending on CPU. Note that your operating system may inject random run to run offsets that make it hard to measure the effect at all.
To do this, you'd write a function in assembly for each platform, that takes two args: one void* and a function pointer void (*foo)(void*). (and maybe a third to specify the alignment you want?) It would align the stack as desired, then it would call the function foo with the void* argument. Now you can use this in your main and in each worker thread, to align them to a known boundary before they start doing stuff. Actually I just thought of a fourth parameter you might want: offset within the aligned block.
I'm not sure how you'd determine the optimal values for alignment size and offset, it would probably depend on where your engine's frequently-accessed data structures are such as board state, certain lookup tables, etc. You could compile a bunch of executables with different values and test them to see which is fastest, but it would probably have to be the last thing you do before a release.
It all sounds like quite a lot of work for a pretty minor gain though.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Stack Memory Alignment an Issue?
I'm not even sure your measurements are valid. Each time you run an executable, it is loaded into a different set of real pages in memory, which will then map into cache differently, with different aliasing issues. Too many physical pages matching in the low order N bits will map to the same cache set, and cause thrashing. next time you might get a better balance of memory to cache and run a little faster.marcelk wrote:rjgibert wrote:This idea popped into my ahead as I was waking up this morning. I got to wondering if stack memory alignment is an issue for chess programs?
BTW, I looked up this type of topic on the web and this seems to be an area of concern that includes methods that have been patented.
It matters, depending on CPU. Note that your operating system may inject random run to run offsets that make it hard to measure the effect at all.
[/img]
-
- Posts: 348
- Joined: Sat Feb 27, 2010 12:21 am
Re: Stack Memory Alignment an Issue?
Every point is the best of 10 measurements and the shape of the graph is repeatable. The measurement is valid. I would not have provided it otherwise.bob wrote:
I'm not even sure your measurements are valid. Each time you run an executable, it is loaded into a different set of real pages in memory, which will then map into cache differently, with different aliasing issues. Too many physical pages matching in the low order N bits will map to the same cache set, and cause thrashing. next time you might get a better balance of memory to cache and run a little faster.
-
- Posts: 778
- Joined: Sat Jul 01, 2006 7:11 am
Re: Stack Memory Alignment an Issue?
I think the scale distorts the size of the effect. If I am reading the graph correctly, the speed difference is about 1%.marcelk wrote:rjgibert wrote:This idea popped into my ahead as I was waking up this morning. I got to wondering if stack memory alignment is an issue for chess programs?
BTW, I looked up this type of topic on the web and this seems to be an area of concern that includes methods that have been patented.
It matters, depending on CPU. Note that your operating system may inject random run to run offsets that make it hard to measure the effect at all.
[/img]
-
- Posts: 348
- Joined: Sat Feb 27, 2010 12:21 am
Re: Stack Memory Alignment an Issue?
You could switch randomization off instead of counter-acting, if you have access to the kernel.wgarvin wrote:Not sure why it would do that (perhaps for some DEP/randomization/anti-stack-smashing reason?) You could counteract the effect by manually aligning your stack in your own function that gets called from main and runs your whole engine inside of it.marcelk wrote:It matters, depending on CPU. Note that your operating system may inject random run to run offsets that make it hard to measure the effect at all.
Brute-force works well. Just one run overnight with all combinations of interest.I'm not sure how you'd determine the optimal values for alignment size and offset,
It is relatively simple compared to other aspects of engine tuning, but the gain is low. On the other hand, several such things combined may yield another elo point. As said, not all CPU types have a similar response function so YMMV.It all sounds like quite a lot of work for a pretty minor gain though.
-
- Posts: 348
- Joined: Sat Feb 27, 2010 12:21 am
Re: Stack Memory Alignment an Issue?
1% is above my threshold for accepting speed improvements.jwes wrote:If I am reading the graph correctly, the speed difference is about 1%.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Stack Memory Alignment an Issue?
Sounds better. I didn't see the "average of 10 runs for each" mentioned.marcelk wrote:Every point is the best of 10 measurements and the shape of the graph is repeatable. The measurement is valid. I would not have provided it otherwise.bob wrote:
I'm not even sure your measurements are valid. Each time you run an executable, it is loaded into a different set of real pages in memory, which will then map into cache differently, with different aliasing issues. Too many physical pages matching in the low order N bits will map to the same cache set, and cause thrashing. next time you might get a better balance of memory to cache and run a little faster.