Repetition detection structure.

Uri Blass · Post by **Uri Blass** » Tue Feb 26, 2008 6:59 pm

hgm wrote:
bob wrote:Fortunately, no book author would think you could get 1/2 of .5% with one minute of effort.
But I can! Even more. By replacing
Code: Select all
for&#40;i=0; i<nrReversibleMoves; i++)
by
Code: Select all
for&#40;i=hashKey&255; i<repeatTabSize; i++)
That is the case we were discussing here, and that won't take me more than half a minute, and it will cut the average number of iterations in the loop by 80% or more...

I do not have repeatTabSize in my program and I simply return draw in first repetition(I have another function to tell me if I can claim a draw and movei claim a draw only when there is a 3 time repetition).

For the record here is my function to detect repetition and other draws(I have one functions for detecting also draw because of other reasons like insufficient material or fifty move rule)

Code: Select all

static int reps&#40;)
&#123;
	int backward;
	if &#40;numpieces<=3&&valuepieces==3&#41;
		return 1;
	if &#40;eval_dat&#91;ply&#93;.isendgame==DRAW&#41;
		if &#40;ply>=1&#41;
			return 1;
	if &#40;fifty<=3&#41;
		return 0;
	if &#40;fifty>=fiftyvalue&#41;
		return 1;
	
	backward=4;
	while (&#40;backward<=fifty&#41;&&&#40;zobkey&#91;hply-backward&#93;!=zobkey&#91;hply&#93;))
			backward+=2;
		if &#40;backward<=fifty&#41;
			return 1;
		else return 0;

&#125;

bob · Post by **bob** » Tue Feb 26, 2008 7:31 pm

hgm wrote:
bob wrote:Fortunately, no book author would think you could get 1/2 of .5% with one minute of effort.
But I can! Even more. By replacing
Code: Select all
for&#40;i=0; i<nrReversibleMoves; i++)
by
Code: Select all
for&#40;i=hashKey&255; i<repeatTabSize; i++)
That is the case we were discussing here, and that won't take me more than half a minute, and it will cut the average number of iterations in the loop by 80% or more...

No it wasn't what we were discussing. We were discussing changing from a simple repetition list to a hash-table implementation of the repetition list. That is absolutely not a 30 second change. It is almost certainly not even a one hour change as it is not done the same way at all.

So, again, can we get back to reality? The original poster asked if there was something more efficient than the repetition list. I replied that since my repetition list only took 0.51% of the total execution time (and I provided profile data that anyone could reproduce if they don't think it is correct), that doing significant work to barely (if at all) improve that 0.51% is simply not worthwhile.

If you believe your approach is significantly faster, something that can certainly be tested easily enough, and that your approach can be added to a program in a couple of minutes or less, then you might have a viable point. But I have done the hash approach in the past, and it was not a 1-2 minute programming change. And then there are still the SMP issues that make the list approach more attractive. And since SMP should be in everyone's future, it certainly makes sense to use the approaches that are most compatible with that longer-term goal...

Finally, your example above is pretty well meaningless. Because (a) you assume that the repetition counter gets pretty large, which it does not except in very rare endings; (b) your loop is shorter, but you have to do it twice, because you have to increment the counter and/or add an entry at the top of Search() and then you have to decrement (and possibly remove) the entry at the bottom of the search. So if your loop is, on average 1/4 the size of mine, you have to do it twice, which makes it 1/2 the size of mine. And then you are doing more work in those where you add/delete entries, which takes it even closer to mine. 80% is a reasonable estimate, meaning if I spend 5 seconds doing this, your approach might do it in 4 seconds. I don't consider that particularly effective use of time when there are other places I could spend that same amount of time and get a bigger return in speedup..

hgm · Post by **hgm** » Tue Feb 26, 2008 7:47 pm

bob wrote:No it wasn't what we were discussing. We were discussing changing from a simple repetition list to a hash-table implementation of the repetition list. That is absolutely not a 30 second change. It is almost certainly not even a one hour change as it is not done the same way at all.

But the above change does make it a hash table. You don't understand C-code?

I don't really understand your remarks about 'doing it twice'. If you implement your repetion list as a stack, you will have to pop the element too, on UnMake. Clearing the entry or decrementing the repeat stack pointer is the same work (involving one memory load and one memory store).

During the repeat test your loop tests for a key-match and you have to test for loop end. I can test for a key match or a key zero. That is also exactly the same amount of work.

bob · Post by **bob** » Tue Feb 26, 2008 8:20 pm

hgm wrote:
bob wrote:No it wasn't what we were discussing. We were discussing changing from a simple repetition list to a hash-table implementation of the repetition list. That is absolutely not a 30 second change. It is almost certainly not even a one hour change as it is not done the same way at all.
But the above change does make it a hash table. You don't understand C-code?

Not quite. How did you get the data _into_ the hash table? How do you remove it later? How do you update/decrement the counter you find there? And what is your loop limit to handle collisions? So that one line change just _broke_ any sane program, because that is not all that is required... And again it doesn't consider the SMP issues..

I don't really understand your remarks about 'doing it twice'. If you implement your repetion list as a stack, you will have to pop the element too, on UnMake. Clearing the entry or decrementing the repeat stack pointer is the same work (involving one memory load and one memory store).

Why would I need to do that? I index the list by "ply". When I back up to the previous ply, I always do ply=ply-1. The next time I step down a ply, it is incremented. Which causes me to overwrite the entry in the list. I never have to remove it at all, as I never check any element beyond "ply"...

During the repeat test your loop tests for a key-match and you have to test for loop end. I can test for a key match or a key zero. That is also exactly the same amount of work.

Except you can't just check one entry or you will miss many repetitions, you have to run down a chain of unknown length. And you have to add the entry and remove it where I don't need to remove mine at all. So you get a double hit on adding and removing, where I just add. And I don't have to worry about the case where that entry is already occupied and chain off of it, I just store it where it goes.

BTW, here is some interesting data. Since I was curious about the very small time spent in RepetitionCheck()) I added two counters to Crafty, one is incremented every time RepetitionCheck() is called, one is incremented every time a repetition check loop is executed. I had crafty play two games against itself without a book, here are the final numbers from those two games:

Game 1 went 72 moves, played at 5 minutes for 60 moves, then 5 minutes per 60 moves, etc...

87899526 RepetitionCheck() calls, 98850190 total loops, average = 1

Game 2 went 85 moves, same time control

126436017 RepetitionCheck() calls, 160747325 total loops, average = 1

The above is integer math. On average, over two much longer than usual games, it averages to about 1.25 loop iterations per RepetitionCheck() call. I am sure it is higher in the latter part of those games, but that is irrelevant to the issue of optimization, as I care about overall performance. Since hashing can't go below 1 iteration period, it looks to me that any savings is going to be tiny, as my 1.25 iterations are _very_ simple iterations, with no need to remove an entry ever, I just have to add them to the end as I advance thru the tree.

As I said, when I tried this, I saw no speed advantage and the added complexity of having a shared repetition table, or the added cost of copying the entire thing on each split made me go back to the repetition list approach.

I am playing a longer game to see if things change, but at move 62, the average is still right at 1, I will report the results later when the two games finish, but they will take a while as each game will take an hour if they get to 30 moves, or two hours if they get to 60... For the first 35 moves, the average was zero (about .75 if not using integer divide). By move 65 it is up to 1.1. No idea where it will end up but it is going to take a _lot_ to get it to average over 2.0

bob · Post by **bob** » Tue Feb 26, 2008 8:32 pm

Here's the final data from the first game, but I am stopping the test here as there is no significant difference between the fast and slow games, overall:

1046961631 RepetitionCheck() calls, 830707699 total loops, average = 0

Still about .8 loop iterations per call. As to how can this happen? if the repetition counter gets reset, the next two plies will get zero iterations since there can't be any repetitions at all...

Alessandro Damiani · Post by **Alessandro Damiani** » Tue Feb 26, 2008 8:59 pm

Thanks for this information Bob. I use a hashtable for years but now I wonder if it really pays off.

bob · Post by **bob** » Tue Feb 26, 2008 9:50 pm

Alessandro Damiani wrote:Thanks for this information Bob. I use a hashtable for years but now I wonder if it really pays off.

it is not clear that it helps or hurts. But gathering the data always introduces new insight into an issue. I was surprised by the loop count average... but it is what it is...

Note that given the right position the counter obviously has to go much higher, but over the entire course of the game, the average is what is important when trying to decide what to optimize and what to leave alone...

hgm · Post by **hgm** » Tue Feb 26, 2008 11:18 pm

bob wrote:Except you can't just check one entry or you will miss many repetitions, you have to run down a chain of unknown length.

Not at all. Positions enter the repeatTable first-in, first-out fashion. So there will never be any holes in the chains. First zero I encounter means it is a miss. And the sought entry is then stored there during the MakeMove.

So you use the 'ply' variable as a stack pointer. Well, you still have to decrement it. My programs son't use a ply variable. They have no need for it... (Except during debugging, when it is enabled through a compiler switch.)

As for the SMP aspect: the way I plan to do SMP does not care how I implement the repeat table, as I won't pass it around.

bob · Post by **bob** » Wed Feb 27, 2008 8:43 am

hgm wrote:
bob wrote:Except you can't just check one entry or you will miss many repetitions, you have to run down a chain of unknown length.
Not at all. Positions enter the repeatTable first-in, first-out fashion. So there will never be any holes in the chains. First zero I encounter means it is a miss. And the sought entry is then stored there during the MakeMove.

So you use the 'ply' variable as a stack pointer. Well, you still have to decrement it. My programs son't use a ply variable. They have no need for it... (Except during debugging, when it is enabled through a compiler switch.)

As for the SMP aspect: the way I plan to do SMP does not care how I implement the repeat table, as I won't pass it around.

So you won't handle repetitions in the parallel searches? Or you will, and allow repetitions from one search that are based on stuff searched in another independent search? Either way is a sure path to disaster...

As far as your store approach, that was exactly what I was talking about. My repetition check appears to average looping over exactly one position Hashing can do no better, and as a result, I find it difficult to imagine any other advantage it would give compared to the two lines of code I need...

hgm · Post by **hgm** » Wed Feb 27, 2008 9:24 am

I already explained how I would handle SMP. So why should I explain it gain. Just read it again? And this time, first look up in one of the many books you can cite what 'split point' means... So that any comments you would feel compelled to make can be to the point, in stead of irrelevant rants.

So in your tree there is on the average only about one move in the table? Well, that still would be a factor 2, as you would have to look at the entry to conclude the miss, and then do a test to conclude that you are done. I conclude that I am done in 99% of the cases immediately, as I hash to an empty table entry.

So it seems that things still as they were from the very beginning: writing hashKey&255 for the initializer of the for-loop in stead of 0 doesn't take me more than half a minute, and the fully-functional hash code I get from this then offers a speeds improvement of about a factor 2...

Oh, and in your scheme, wouldn't you have to keep an administration of where to start or end the linear search, where the last irreversible move was done (or how many ply ago)? If so many of your moves is irreversible, you have to do that a lot... And of course you would have to undo that on UnMake (or copy discard). and that would either involve branches (which can be mispredicted), or dummy operations that you also did on reversible moves.

Repetition detection structure.

Re: Repetition detection structure.

Re: Repetition detection structure.

Re: Repetition detection structure.

Re: Repetition detection structure.

Re: Repetition detection structure.

Re: Repetition detection structure.

Re: Repetition detection structure.

Re: Repetition detection structure.

Re: Repetition detection structure.

Re: Repetition detection structure.