Don wrote:
bob wrote:
I don't see how it simplifies the split point code at all. All it does is get rid of Unmake() for a gain, but adds the copy during the make process for a loss. But affecting the split point code? I don't see it. You certainly have to replicate everything for each thread...
The first time I started using copy-make it was for the idea of simplifying the code, getting rid of global variables and avoiding the complexities and extra conditional branches involved in unmaking a move. It's often, but not always, the case that simplifying the code comes with many advantages including performance.
I usually try to determine using logic and my own intuition what should happen if I do things a certain way, but I have found that my intuition is a very poor guide. So I am very careful not to make any assumptions, no matter how obvious they seem to me or how much they fly in the face of logic. Of course one still has to use reason as a guide.
So my first gut feeling on this issue was that it was a horrible idea. At MIT I was convinced to try it after being given a lecture on cache sizes: copying a few hundred bytes on modern CPUs is nothing compared to things like mispredicted branches, cache misses and other considerations. So when I stopped putting undue weight on the speed of copying state, the only thing left was win, win, win. But of course the only thing that really mattered to me was whether it really worked or not.
There are only two potential downsides to copy-make: possible cache blowout issues and the speed of copying. The upsides to copy-make are:
1. Much reduced logic (which means fewer mispredicted branches).
2. Great simplification of the code (which can be somewhat mitigated with good design).
3. Easier to write parallel code.
4. No dealing with unmake and its complexities.
5. Fewer data structures needed to support unmaking a move.
None of these things are major - with good design I don't think a program needs to have copy-make and can still be bug free and easy to work with - but I am a firm believer in keeping things as simple as possible (but no simpler.) Sometimes simple gets in the way and you need a little more complexity, but I have concluded that this is not one of those cases.
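To make the contrast concrete, here is a minimal copy-make search skeleton. The `State` struct and `search` function are hypothetical stand-ins I made up for illustration, not Komodo's actual code; the point is simply that each child node gets its own copy of the state, so nothing ever needs to be undone.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, minimal position state -- not Komodo's real layout.
struct State {
    std::uint64_t occupied; // stand-in for bitboards, hash key, castling rights, ...
    int           ply;
    char          padding[192 - sizeof(std::uint64_t) - sizeof(int)];
};
static_assert(sizeof(State) == 192, "state padded to a 192-byte example size");

// Copy-make: the child gets its own copy of the parent state, so there is
// no Unmake() call and no undo information to maintain.
int search(const State& parent, int depth) {
    if (depth == 0) return parent.ply;  // stand-in for an eval call
    State child = parent;               // the copy in copy-make
    child.ply += 1;                     // "make" mutates only the copy
    // ... a real engine would loop over moves here; one call as illustration
    return search(child, depth - 1);
    // no unmake needed on the way back: 'parent' was never modified
}
```

Because the parent is passed by const reference and never touched, each thread at a split point can simply take its own copy and recurse, which is the parallel-code simplification mentioned above.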
I once measured copy-make and I think the program spends 2% of the time there now. (At one time Doch spent about 3% of the time, but it's not clear to me how accurate these measurements are.) That does not make Komodo 2% slower than it has to be, because making and unmaking a move would also have overhead. I don't really know if make/unmake would give me a speedup, but the upper bound on the potential speedup is 2 percent and my estimate of the true difference is zero. It's possible that copy-make is faster, but I'm not claiming that. I don't really know.
So I did an experiment in order to test the theory that copying state is blowing out the cache and slowing the program to a crawl - which seems to be the big fear that is being put out there. Before getting to that, I want to mention that most good chess programs use a number of very large data structures, and that storing a few position states is a trivial amount of space in comparison.
Komodo has a position state that is 208 bytes. It used to be 192 bytes for alignment purposes but it has grown - I can get it back down to 192 easily enough, but that is beside the point.
So the experiment is to see how much slower the program gets if I double the size of the position state. I added a character array in the middle of the state struct definition that is also 208 bytes large. I did not even bother to align it in any special way, but I doubt that matters in this case as it is on a 4 byte boundary anyway. Then I doubled the size of that array to get a state 3x larger than is normal in Komodo.
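The padding experiment can be sketched like this. The field names and the two 104-byte halves are made up for illustration; only the doubled total size matches the description above.

```cpp
#include <cstddef>

// Hypothetical layout: 'blowout' is the experimental padding array dropped
// into the middle of the state.  Komodo's real fields are of course
// different; only the sizes follow the experiment described above.
struct PositionState2x {
    char first_half[104];   // stand-in for the original fields before the pad
    char blowout[208];      // experimental padding: doubles the struct size
    char second_half[104];  // stand-in for the remaining original fields
};

static_assert(sizeof(PositionState2x) == 2 * 208, "state size doubled to 416 bytes");
```

Tripling is the same idea with a 416-byte padding array instead of a 208-byte one.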
I ran 3 timings (50 positions to 13 ply) and took the median value as the time.
When I doubled the state size, the program slowed down by 0.7 percent - less than 1 percent, but non-trivial. When I tripled the state size it slowed down by 1.4 percent compared to the regular version. So the state size does appear to matter and is worthy of consideration. Note that by doubling and tripling the state size I suffer the additional cache overhead as well as the state copy time.
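For anyone who wants to isolate the raw copy cost from the cache effects, here is one way to time it. This is my own sketch, not the timing method used above, and it is GCC/Clang specific because of the empty asm barrier that keeps the compiler from optimizing the copies away.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>

// Rough micro-benchmark of the raw copy cost for 1x/2x/3x state sizes
// (e.g. 208/416/624 bytes).  GCC/Clang only, due to the asm barrier.
template <std::size_t N>
double copy_time_ns(int iters) {
    static char src[N], dst[N];
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) {
        std::memcpy(dst, src, N);
        asm volatile("" ::: "memory");  // compiler barrier: don't elide the copy
    }
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(t1 - t0).count() / iters;
}
// e.g. compare copy_time_ns<208>(1000000) against copy_time_ns<624>(1000000)
```

Note this measures only the memcpy itself out of context; it cannot show the cache pressure a bigger state puts on the rest of the search, which is exactly why the full-search timing above is the more meaningful number.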
It's difficult to draw any conclusions from this because I don't know how much unmake and the associated logic and data structures to support it would slow things down. I would point out that make is also a little more complicated without copy-make - so the question is whether make/unmake is a major win over copy-make. I seriously doubt that it is. I would guess that with a relatively small state size it's a non-issue. If I thought it was more than 2 percent I would go through the pain of changing the program but I doubt it's a slowdown at all.
I know this is not proof of anything, but Komodo manages to be a very strong program using this scheme. As far as I know it's the only program that does this, so perhaps I am missing something big that would propel us over the level of Houdini if I changed it.
It would not be too difficult to convert my program - I could keep everything the same but have a special routine that makes and unmakes a move to a target state. However, guess what? Now I would have to add more code and global variables in order to maintain the information needed to undo a move and do repetition testing. In other words you STILL have to maintain state, it's just more complicated.
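A sketch of what such a "make to a target state" routine looks like (hypothetical fields and names, my illustration rather than Komodo's code). Notice that it is really still copy-make with the copy moved inside the make routine, and the caller still has to keep the parent around for repetition tests.

```cpp
#include <cstdint>

// Hypothetical minimal state; the real fields don't matter here.
struct State { std::uint64_t key; int ply; };

// "Make a move to a target state": the child position is written into
// 'dst' instead of mutating 'src' in place.  Structurally this is still
// copy-make -- the parent survives untouched, which is exactly what
// repetition testing needs.
void make_move(const State& src, State& dst, std::uint64_t move_key) {
    dst = src;            // the copy hasn't gone away, it just moved
    dst.key ^= move_key;  // stand-in for an incremental hash-key update
    dst.ply += 1;
}
```

With in-place make/unmake, by contrast, the undo information (captured piece, old castling rights, old hash key, and so on) has to be saved somewhere before the move and restored afterwards - the extra state-keeping described above.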