Repetition detection structure.

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Repetition detection structure.

Post by bob »

hgm wrote:It seems the main difference then is that you go all the way back to the root all the time to switch split points, while I just let them migrate to the first split point they encounter.

My idea was that the search consists almost entirely of refutation trees, where every all-node is a possible split point. So typically there will be a split point at every even or odd level. As soon as you run out of unsearched moves in the split point you are currently in, you just move two levels closer to the root, or two levels deeper into the tree, and you are likely to find another one. So why back up all the way to the root, only to return to nearly the same spot in the tree?
The problem is that you are violating the alpha/beta principle. The current node has not been completed, so there is no score to back up yet; still, you would back up two plies and start searching a node with no idea of what alpha/beta should be until everything below the current node has been finished. The tree size blows up, the speedup drops to near zero, and it just doesn't work that way. Using the hash table is simply a poor way of doing a parallel search (it has already been tried many times and found wanting), particularly with more than two processors...
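Bob's objection can be pictured with a toy negamax (a sketch only; the array-based tree layout and all names here are invented for illustration, not taken from any engine). The bound each sibling is searched with comes from the completed scores of the siblings before it, so until a node finishes there is no score to back up and no correct bound to hand to a node two plies away:

```c
#include <assert.h>

/* Toy negamax over a depth-2 binary tree whose leaves are stored in an
   array (node idx at depth d has children idx*2 and idx*2+1).  The key
   point: alpha is tightened only by children that have *finished*, and
   a cutoff is only justified by a finished child's score. */
static int negamax(const int *leaves, int idx, int depth, int alpha, int beta)
{
    if (depth == 0)
        return leaves[idx];
    for (int child = 0; child < 2; child++) {
        int s = -negamax(leaves, idx * 2 + child, depth - 1, -beta, -alpha);
        if (s > alpha)
            alpha = s;      /* only a completed child can raise alpha */
        if (alpha >= beta)
            break;          /* cutoff justified by a completed child */
    }
    return alpha;
}
```

A processor that abandons this node before the loop completes has neither a score to return nor a trustworthy alpha for a node two plies up, which is exactly the blow-up being described.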
User avatar
Zach Wegner
Posts: 1922
Joined: Thu Mar 09, 2006 12:51 am
Location: Earth

Re: Repetition detection structure.

Post by Zach Wegner »

hgm wrote:It seems the main difference then is that you go all the way back to the root all the time to switch split points, while I just let them migrate to the first split point they encounter.

My idea was that the search consists almost entirely of refutation trees, where every all-node is a possible split point. So typically there will be a split point at every even or odd level. As soon as you run out of unsearched moves in the split point you are currently in, you just move two levels closer to the root, or two levels deeper into the tree, and you are likely to find another one. So why back up all the way to the root, only to return to nearly the same spot in the tree?
But the overhead in such an approach will greatly increase. I have maybe ~20 splits per second on a dual (though this varies a lot, it's always double digits), and the number of cycles spent making and unmaking is simply nothing.

The main problem plaguing parallel search is overhead: searching unnecessary nodes. It seems your approach, and really all shared-hash-table approaches, will suffer badly from this, though yours much less than the others. Because there is no real synchronization, as there is in my search, you can never be sure you aren't searching duplicate branches or returning values from incompletely searched nodes. Bob is right about this: you cannot just "back up" in a tree and keep searching. You have to have a master processor that cannot return without searching _everything_. But all the other processors can really do anything, as long as the values they put in the hash are accurate.

One more drawback of your approach: it relies heavily on shared memory. Mine does too ATM, but I would like to eventually move to distributed environments. Your approach is very fine-grained, and many processors imply that there will be more and more duplicated work.

On a side note, Vincent said recently that if he could get 500 50 MHz cpus at a reasonable price, he would prefer it over a typical PC, just to improve his speedup. I would love such an opportunity, and luckily, I think we will see CPUs with such numbers of cores on one die in the not-too-distant future.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Repetition detection structure.

Post by bob »

Uri Blass wrote:
bob wrote:
mathmoi wrote:Hi,

I examined Crafty and Fruit code and found that they both use a simple array (accessed like a stack) to store the hash keys for the repetition detection code.

I understand that most of the time only a couple of positions need to be verified (back to the last irreversible move), but as we approach the 50-move limit we will need to scan most of the array at every node.

Is it possible to use something else, like a small hash table or even a binary tree?

I guess what I'm asking is: are there other ways to do repetition detection, or does everyone use a simple array?

MP
A hash table is doable, but there is an efficiency issue. When you enter a node you have to add the position, and when you leave the node you have to remove it. There are other sticky issues with SMP search: the repetition hash table has to be a reasonable size to prevent overwriting (which would be fatal), and it has to somehow handle an SMP search, which makes duplicating it expensive...

Repetition-list scanning is not a big issue once you profile an engine and see how little time it actually spends in there, particularly since you don't even probe it in q-search...
The last part is correct only for a q-search that includes captures alone, but many programmers also include checks and replies to check in the q-search.
I still do not plan to work on more efficient repetition detection
(at least not in the near future), and positions where there are many moves with no conversion are not the majority of cases in games.

Uri
OK. Although I generally call that something other than q-search. In Cray Blitz, the search was divided into three parts: the normal first N plies; then the next 4 plies, where checks and check escapes were searched along with captures, promotions, and a few special-case moves like capturing next to the king; and then a normal q-search that just did captures. So you could say we did checks in the q-search, or not, depending on your definition of q-search.

That notwithstanding, it would be wasteful to have a generalized Quiesce() function that does both captures and checks, with the checks somehow limited to the first few plies, because it would either need a test to skip the repetition check once you go beyond the check-depth limit, or else it would do repetition checks after each capture, which is pure waste. In CB (and in early Crafty versions, which copied this approach) there were three different procedures: Search(), then Extend(), and finally Quiesce(). Most would consider Extend() (captures + checks + check escapes) and Quiesce() (captures only) together to be the q-search. I was being a bit more specific there...
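The three-tier layering described above can be pictured as a dispatch on remaining depth (a sketch; the enum, names, and thresholds are invented for illustration). Bob's point is that keeping Search()/Extend()/Quiesce() as separate functions compiles this decision away instead of re-testing it at every node:

```c
#include <assert.h>

/* Depth <= 0 means past the nominal horizon.  In the Cray Blitz scheme
   described above, the first extend_plies past the horizon still search
   checks and check escapes (and so still need repetition checks); beyond
   that, only captures are tried.  All names here are illustrative. */
typedef enum { FULL_WIDTH, EXTEND_TIER, QUIESCE_TIER } Tier;

static Tier tier_for(int depth, int extend_plies)
{
    if (depth > 0)
        return FULL_WIDTH;      /* normal first N plies: Search() */
    if (depth > -extend_plies)
        return EXTEND_TIER;     /* captures + checks + evasions: Extend() */
    return QUIESCE_TIER;        /* captures only: Quiesce() */
}
```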
User avatar
hgm
Posts: 27811
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Repetition detection structure.

Post by hgm »

Well, this forum has proved not to be a good place to discuss such things. Someone asked a question, and I answered it. As far as I am concerned, everything you interject is just noise. By now it should be clear to everyone that it is all beyond you, so there isn't really any need to add anything to what I have already said. I know how to add and divide. You can just muddle on in the swamp of your own misconceptions, but I have better things to do. Good luck!
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Repetition detection structure.

Post by bob »

hgm wrote:Well, this forum has proved not to be a good place to discuss such things. Someone asked a question, and I answered it. As far as I am concerned, everything you interject is just noise. By now it should be clear to everyone that it is all beyond you, so there isn't really any need to add anything to what I have already said. I know how to add and divide. You can just muddle on in the swamp of your own misconceptions, but I have better things to do. Good luck!
Yep, you are always right, and the hundreds of books on software engineering and optimization guides all have it dead wrong...

just keep on doing whatever it is you do in your own private world... where the usual laws of good software development practices do not work...

BTW, here is my original "noise", and I stand by _every_ word as written, and the data as presented...

==========================================================
here is what you have to beat:

  %    cumulative    self                   self    total
 time    seconds   seconds        calls   s/call  s/call  name
 0.51    205.48      1.07     113029036    0.00    0.00   RepetitionCheck

I played a few pretty quick games and picked the first one that ended by the 50-move rule, which is likely the worst case for repetition-list scanning.

RepetitionCheck() accounted for 0.51% of the total CPU time used (about half a percent), which is pretty insignificant...

here is everything from most expensive down to RepetitionCheck():
Code:

  %    cumulative    self                   self    total
 time    seconds   seconds        calls   s/call  s/call  name
16.93     35.40     35.40     499704723    0.00    0.00   Evaluate
 8.60     53.38     17.98     563432081    0.00    0.00   MakeMove
 8.24     70.61     17.24     802244362    0.00    0.00   EvaluatePassedPawns
 8.12     87.59     16.98        115810    0.00    0.00   Search
 7.62    103.52     15.93    1006239194    0.00    0.00   Attacked
 6.11    116.30     12.79     371224516    0.00    0.00   Quiesce
 6.10    129.06     12.76     339036879    0.00    0.00   Swap
 4.73    138.96      9.90     116047645    0.00    0.00   GenerateCaptures
 4.57    148.52      9.56     563432081    0.00    0.00   UnmakeMove
 3.75    156.35      7.84     112735079    0.00    0.00   HashProbe
 2.74    162.09      5.74     413850292    0.00    0.00   NextMove
 2.73    167.80      5.72      49404108    0.00    0.00   EvaluateMobility
 2.66    173.36      5.56     370399537    0.00    0.00   AttacksTo
 2.48    178.54      5.18     421418977    0.00    0.00   SearchControl
 2.25    183.24      4.70      31303371    0.00    0.00   GenerateCheckEvasions
 2.13    187.69      4.45     125674880    0.00    0.00   EvaluateRooks
 1.76    191.36      3.67       6584992    0.00    0.00   EvaluatePawns
 1.51    194.52      3.17      12631503    0.00    0.00   GenerateNonCaptures
 0.97    196.55      2.03     125674880    0.00    0.00   EvaluateKnights
 0.88    198.39      1.84      36017477    0.00    0.00   NextEvasion
 0.86    200.20      1.81     125674880    0.00    0.00   EvaluateKings
 0.83    201.93      1.73      96027327    0.00    0.00   HashStore
 0.61    203.21      1.28     115866513    0.00    0.00   PinnedOnKing
 0.58    204.42      1.21           375    0.00    0.00   InitializeHashTables
 0.51    205.48      1.07     113029036    0.00    0.00   RepetitionCheck



So while it is possible to beat this, the savings are going to be microscopic...

And this is simpler to write/debug...
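For reference, the list-scan approach being defended here fits in a few lines (a sketch; the names are illustrative, not Crafty's actual code). Only positions since the last irreversible move can repeat, and only those with the same side to move, so the scan steps back in strides of two:

```c
#include <stdint.h>
#include <assert.h>

/* keys[] holds the Zobrist hash of every position on the current game
   and search path; "last" indexes the current position and "first" the
   position reached by the last irreversible move.  Returns 1 if the
   current position occurred before in that window. */
int is_repetition(const uint64_t *keys, int first, int last)
{
    for (int i = last - 2; i >= first; i -= 2)  /* same side to move only */
        if (keys[i] == keys[last])
            return 1;
    return 0;
}
```

Maintaining the list is just as cheap: push the new key on MakeMove, pop it on UnmakeMove, and reset "first" whenever a capture or pawn move occurs.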
User avatar
hgm
Posts: 27811
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Repetition detection structure.

Post by hgm »

... And if it takes you one minute each to cut in half the time of thirty routines that each took only 0.5%, you will have gained 7.5% in one hour. But beware, according to Bob (or, more accurately, according to the 100 books he memorized) it would be better to spend a full day on reducing the time of the routine that takes 30% to 25%. Because gaining 5% speed in a day is much better than gaining 7.5% in half an hour's work. As 30% >> 0.5%, you see...

Well done, Bob! :lol:
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: Repetition detection structure.

Post by wgarvin »

hgm wrote:... And if it takes you one minute each to cut in half the time of thirty routines that each took only 0.5%, you will have gained 7.5% in one hour. But beware, according to Bob (or, more accurately, according to the 100 books he memorized) it would be better to spend a full day on reducing the time of the routine that takes 30% to 25%. Because gaining 5% speed in a day is much better than gaining 7.5% in half an hour's work. As 30% >> 0.5%, you see...

Well done, Bob! :lol:
If you can do that many useful optimizations in an hour, then gaining 7.5% that way might make sense.

On the other hand, if there's a routine taking 4.6% of the total, maybe it's worth more attention than a routine taking only 0.5%?

There are some unfortunate classes of software where *everything* takes 0.5-1.0%, and then we have no choice but to try to optimize them all one by one, even if each one takes weeks rather than minutes ("uniformly slow code"). That doesn't appear to be the case here, though. Optimizing the top 10 things in Bob's list is probably a better use of the time.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Repetition detection structure.

Post by bob »

hgm wrote:... And if it takes you one minute each to cut in half the time of thirty routines that each took only 0.5%, you will have gained 7.5% in one hour. But beware, according to Bob (or, more accurately, according to the 100 books he memorized) it would be better to spend a full day on reducing the time of the routine that takes 30% to 25%. Because gaining 5% speed in a day is much better than gaining 7.5% in half an hour's work. As 30% >> 0.5%, you see...

Well done, Bob! :lol:
Right. And if a frog had pockets, he would carry a gun and not worry about snakes in the swamp either.

Get off the hyperbole of "one minute to cut the time of 30 different modules" and come back to reality, where that is not what we are talking about. The repetition list is the simplest way to detect draws: it adds maybe 5 lines of code to an engine. The repetition hash approach is far more than 5 lines, and is more complex to add and debug. It isn't going to happen in one minute, or even in 60.

So get real, and make sensible arguments, not just posting crap that is useless to someone trying to make progress with their engine...

Fortunately, no book author would claim you could recover half of that 0.5% with one minute of effort. And for the case that started this thread, I would be happy to challenge you to add it to Crafty in a total of one minute. I have already done it once, and found the current approach to be _faster_ in addition to being simpler, particularly in light of parallel search issues.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Repetition detection structure.

Post by bob »

wgarvin wrote:
hgm wrote:... And if it takes you one minute each to cut in half the time of thirty routines that each took only 0.5%, you will have gained 7.5% in one hour. But beware, according to Bob (or, more accurately, according to the 100 books he memorized) it would be better to spend a full day on reducing the time of the routine that takes 30% to 25%. Because gaining 5% speed in a day is much better than gaining 7.5% in half an hour's work. As 30% >> 0.5%, you see...

Well done, Bob! :lol:
If you can do that many useful optimizations in an hour, then gaining 7.5% that way might make sense.

On the other hand, if there's a routine taking 4.6% of the total, maybe it's worth more attention than a routine taking only 0.5%?

There are some unfortunate classes of software where *everything* takes 0.5-1.0%, and then we have no choice but to try to optimize them all one by one, even if each one takes weeks rather than minutes ("uniformly slow code"). That doesn't appear to be the case here, though. Optimizing the top 10 things in Bob's list is probably a better use of the time.
You will never make that point to someone that can't see the forest for all those damned trees in the way...
User avatar
hgm
Posts: 27811
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Repetition detection structure.

Post by hgm »

bob wrote:Fortunately, no book author would claim you could recover half of that 0.5% with one minute of effort.
But I can! Even more. By replacing

Code: Select all

for (i = 0; i < nrReversibleMoves; i++)
by

Code: Select all

for (i = hashKey & 255; i < repeatTabSize; i++)
That is the case we were discussing here; it won't take me more than half a minute, and it will cut the average number of iterations in the loop by 80% or more...
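Filled out, the scheme hinted at by that loop might look like the following (a sketch under assumptions: a table kept sparse enough that every probe chain ends in an empty slot; repeatTabSize and the 255 mask follow the snippet above, everything else is invented). A position's key is added when a node is entered and must be removed when it is left:

```c
#include <stdint.h>
#include <assert.h>

#define repeatTabSize 512              /* well above 256 + max path length */
static uint64_t repeatTab[repeatTabSize];  /* 0 marks an empty slot */

/* Probe linearly from hashKey & 255: a matching key means repetition,
   an empty slot means none.  Because the table is sparse, the scan is
   a handful of iterations instead of the whole reversible-move list. */
static int rep_probe(uint64_t hashKey)
{
    for (int i = hashKey & 255; i < repeatTabSize && repeatTab[i]; i++)
        if (repeatTab[i] == hashKey)
            return 1;
    return 0;
}

/* Called on MakeMove; the table is sized so the chain always finds
   an empty slot before running off the end. */
static void rep_add(uint64_t hashKey)
{
    int i = hashKey & 255;
    while (repeatTab[i])
        i++;
    repeatTab[i] = hashKey;
}
```

Removal on UnmakeMove is the awkward part of such a table: deleting from a linear-probing chain requires repairing the entries after the hole, and a shared table must somehow cope with an SMP search, which is the extra complexity Bob's posts refer to.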