Suggestion for hash tables and analysis.

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Ok, Maybe I didn't make it clear in previous posts...

Post by bob »

Zenmastur wrote:
bob wrote:Easy. One idea is a king triangulation where you do nothing but change the side to move. If this were no worry, no one would find it necessary to include side to move in the hash signature.
A transposition starts in one position and ends in another position. The path between the two positions is what changes. If there is an odd number of plies between the starting positions then it can't be a transposition because a different side is on the move. Likewise, if there is an odd number of plies between the ending positions then the ending positions can't be identical for the same reason. If you are allowing transpositions that don't start from the same position then its length has little meaning.

I think these ideas would be obvious even to the most casual observer.

So, I don't know what you're measuring but it doesn't seem to be transpositions.

In any case, I assume that these numbers were supposed to support your claim that it's
bob wrote:I can transpose near a tip, with a position near the root, just as easily as I can transpose between branches near the tip.
I don't think they do. In fact, they seem to support this idea:
Zenmastur wrote:I'm not sure you can transpose just as easily from root to tip. Long transpositions, while clearly possible, seem harder.
Regards,

Zen
What about these cases:

Kings start on e1 and e8 white to move

Ke2 Ke7 Kd1 Ke8 Ke1 black to move

It is TRIVIAL to change side to move without changing the position... As I said, I am not sure you have studied this particular aspect as much as needed. This kind of thing is COMMON in fine 70. Why do you think Kb2 draws while Kb1 wins? In one you reach a position where black is in zugzwang with black to move; in the other you reach the same position with white to move, putting white in zugzwang. A common theme in king and pawn endings.
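To make the side-to-move point concrete, here is a minimal Zobrist hashing sketch (hypothetical names, not the actual Crafty code): the signature XORs in one random number per piece/square plus a separate random number for the side to move, so the triangulated position above hashes differently before and after the lost tempo.

Code:

#include <stdint.h>

uint64_t zobrist[12][64]; /* one random per piece type per square, filled at startup */
uint64_t zobrist_stm;     /* extra random number for the side to move */

/* hypothetical board representation: piece[sq] in 0..11, or -1 if empty */
uint64_t hash_position(const int piece[64], int white_to_move) {
    uint64_t key = 0;
    for (int sq = 0; sq < 64; sq++)
        if (piece[sq] >= 0)
            key ^= zobrist[piece[sq]][sq];
    if (white_to_move)
        key ^= zobrist_stm; /* without this term, Ke2 Ke7 Kd1 Ke8 Ke1
                               would collide with the starting position */
    return key;
}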

What I am measuring is 100% certain to be transpositions. But you haven't thought this through yet... once you do, you will see that same position different side on move is easy, and from there you can do same position, same side to move, different number of moves between the transpositions. Either side can toss tempi at will.

My measurement was simply to take EVERY hash hit (ones where draft was acceptable, >= depth remaining) and subtract depth remaining from draft, which gives the difference in remaining plies. And then there is LMR, which can reduce by 1, 2, or 3 plies for the same position, giving even more variability in draft... There is a lot going on in the search, and a lot of ways to alter the depth (plies) that you might not think about when thinking of "human chess".

I'd suspect those 1 ply differences were the effect of reductions, which for me would increase with each successive iteration...

Remember that depth and draft have little to do with path length nowadays...
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: Ok, Maybe I didn't make it clear in previous posts...

Post by Zenmastur »

bob wrote:
Zenmastur wrote:
bob wrote:Easy. One idea is a king triangulation where you do nothing but change the side to move. If this were no worry, no one would find it necessary to include side to move in the hash signature.
A transposition starts in one position and ends in another position. The path between the two positions is what changes. If there is an odd number of plies between the starting positions then it can't be a transposition because a different side is on the move. Likewise, if there is an odd number of plies between the ending positions then the ending positions can't be identical for the same reason. If you are allowing transpositions that don't start from the same position then its length has little meaning.

I think these ideas would be obvious even to the most casual observer.

So, I don't know what you're measuring but it doesn't seem to be transpositions.

In any case, I assume that these numbers were supposed to support your claim that it's
bob wrote:I can transpose near a tip, with a position near the root, just as easily as I can transpose between branches near the tip.
I don't think they do. In fact, they seem to support this idea:
Zenmastur wrote:I'm not sure you can transpose just as easily from root to tip. Long transpositions, while clearly possible, seem harder.
Regards,

Zen
What about these cases:

Kings start on e1 and e8 white to move

Ke2 Ke7 Kd1 Ke8 Ke1 black to move

It is TRIVIAL to change side to move without changing the position... As I said, I am not sure you have studied this particular aspect as much as needed. This kind of thing is COMMON in fine 70. Why do you think Kb2 draws while Kb1 wins? In one you reach a position where black is in zugzwang with black to move; in the other you reach the same position with white to move, putting white in zugzwang. A common theme in king and pawn endings.
Until a pawn is captured, that is the only type of transposition possible. So, I guess you could call it common. I have given it sufficient thought too.
bob wrote:What I am measuring is 100% certain to be transpositions. But you haven't thought this through yet... once you do, you will see that same position different side on move is easy, and from there you can do same position, same side to move, different number of moves between the transpositions. Either side can toss tempi at will.
I don't think you have thought it through properly. I have no doubt you are finding transpositions. That I don't think is in question. The part I have a problem with is the supposed starting position(s) for the transpositions found. You are reporting that they have different starting positions. In fact, according to your table they MUST have a different side on the move. This isn't possible. Which means the reported differences are obviously wrong.
bob wrote:My measurement was simply to take EVERY hash hit (ones where draft was acceptable, >= depth remaining) and subtract depth remaining from draft, which gives the difference in remaining plies. And then there is LMR, which can reduce by 1, 2, or 3 plies for the same position, giving even more variability in draft... There is a lot going on in the search, and a lot of ways to alter the depth (plies) that you might not think about when thinking of "human chess".

I'd suspect those 1 ply differences were the effect of reductions, which for me would increase with each successive iteration...
Whatever the reason for the wrongly reported differences in depth, they are still wrong and the data can't be used to draw any conclusions. If the data were accurate, the only conclusion that could be drawn from it would not be favorable to the point you are apparently trying to make. In fact the data seems to support my position and not yours.

But all of this is a sideline to the main topic, which is whether a split TT will help a highly over-subscribed TT.

You think it will cause the tree to blow up, I think it will cause the tree to shrink by a factor of several times.

Regards,

Zen
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Ok, Maybe I didn't make it clear in previous posts...

Post by bob »

Zenmastur wrote:
bob wrote:
Zenmastur wrote:
bob wrote:Easy. One idea is a king triangulation where you do nothing but change the side to move. If this were no worry, no one would find it necessary to include side to move in the hash signature.
A transposition starts in one position and ends in another position. The path between the two positions is what changes. If there is an odd number of plies between the starting positions then it can't be a transposition because a different side is on the move. Likewise, if there is an odd number of plies between the ending positions then the ending positions can't be identical for the same reason. If you are allowing transpositions that don't start from the same position then its length has little meaning.

I think these ideas would be obvious even to the most casual observer.

So, I don't know what you're measuring but it doesn't seem to be transpositions.

In any case, I assume that these numbers were supposed to support your claim that it's
bob wrote:I can transpose near a tip, with a position near the root, just as easily as I can transpose between branches near the tip.
I don't think they do. In fact, they seem to support this idea:
Zenmastur wrote:I'm not sure you can transpose just as easily from root to tip. Long transpositions, while clearly possible, seem harder.
Regards,

Zen
What about these cases:

Kings start on e1 and e8 white to move

Ke2 Ke7 Kd1 Ke8 Ke1 black to move

It is TRIVIAL to change side to move without changing the position... As I said, I am not sure you have studied this particular aspect as much as needed. This kind of thing is COMMON in fine 70. Why do you think Kb2 draws while Kb1 wins? In one you reach a position where black is in zugzwang with black to move; in the other you reach the same position with white to move, putting white in zugzwang. A common theme in king and pawn endings.
Until a pawn is captured, that is the only type of transposition possible. So, I guess you could call it common. I have given it sufficient thought too.
I don't understand why you keep interjecting that pawn capture comment. If you have a path that is 20 moves long, you can play that single pawn capture anywhere in that path to transpose the order of moves yet reach the same final position. So captures, pushes or other moves have no effect whatsoever on transpositions.

Again, this is NOT about repetition detection where captures cut repetitions off stone cold at that point.

bob wrote:What I am measuring is 100% certain to be transpositions. But you haven't thought this through yet... once you do, you will see that same position different side on move is easy, and from there you can do same position, same side to move, different number of moves between the transpositions. Either side can toss tempi at will.
I don't think you have thought it through properly. I have no doubt you are finding transpositions. That I don't think is in question. The part I have a problem with is the supposed starting position(s) for the transpositions found. You are reporting that they have different starting positions. In fact, according to your table they MUST have a different side on the move. This isn't possible. Which means the reported differences are obviously wrong.
bob wrote:My measurement was simply to take EVERY hash hit (ones where draft was acceptable, >= depth remaining) and subtract depth remaining from draft, which gives the difference in remaining plies. And then there is LMR, which can reduce by 1, 2, or 3 plies for the same position, giving even more variability in draft... There is a lot going on in the search, and a lot of ways to alter the depth (plies) that you might not think about when thinking of "human chess".

I'd suspect those 1 ply differences were the effect of reductions, which for me would increase with each successive iteration...
Whatever the reason for the wrongly reported differences in depth, they are still wrong and the data can't be used to draw any conclusions. If the data were accurate, the only conclusion that could be drawn from it would not be favorable to the point you are apparently trying to make. In fact the data seems to support my position and not yours.

But all of this is a sideline to the main topic, which is whether a split TT will help a highly over-subscribed TT.

You think it will cause the tree to blow up, I think it will cause the tree to shrink by a factor of several times.

Regards,

Zen
I am not reporting starting move numbers. All I am reporting is the difference between the depth where the entry was stored, and the depth where that entry was successfully probed. That has nothing to do with specific move numbers or side to move or anything other than draft(stored) - draft(hit) and that can't be negative or the entry would not have been used (draft too shallow).
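As a sketch, simplified and with hypothetical names rather than the actual Crafty code, the measurement is just:

Code:

#include <stdint.h>

typedef struct {
    uint64_t key;
    int draft;   /* depth of the subtree below this entry when it was stored */
    int score;
} HashEntry;

extern HashEntry *table;    /* allocated elsewhere; power-of-two size */
extern uint64_t table_mask; /* table size - 1 */

#define MAX_DIFF 64
unsigned long diff_histogram[MAX_DIFF];

int hash_probe(uint64_t key, int depth_remaining, int *score) {
    HashEntry *e = &table[key & table_mask];
    if (e->key == key && e->draft >= depth_remaining) { /* usable hit */
        int diff = e->draft - depth_remaining;          /* never negative */
        if (diff < MAX_DIFF)
            diff_histogram[diff]++;
        *score = e->score;
        return 1;
    }
    return 0; /* miss, or draft too shallow to use */
}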

As far as "highly oversubscribed" goes, how about let's establishing a precise definition. Any way you want, something of the flavor of total positions written (not overwritten) compared to table size?

Just for starters, here's a middle game position run from Crafty, giving total nodes searched, total hash stores done, and total positions overwritten. The hash size was 16GB, which is pretty typical nowadays.

searched=2,454,539,743 stores=112,548,860 overwrites=73,818,068

Searched 2.4 billion nodes, did 0.11B stores, and overwrote 0.073B existing entries with the same key but different info.

To even reach the theoretical 100% usage here would need on the order of 24 billion nodes searched. And that is hardly "highly oversubscribed".

For this middle game position, it seems the qsearch nodes outnumber the normal nodes about 20:1, or that about 5% of the total nodes searched actually get stored in the trans/ref table, period. I do checks in the q-search and such, but NO hash probing anywhere out there.

Given 1 billion entries, that would mean a search of about 20B nodes is needed to even approximate "filling the table" once. On the hardware I typically use, I am hitting around 100M nodes per second. 200 seconds would reach that point, but not even approximating any sort of "highly oversubscribed" condition.
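Spelled out, that arithmetic looks like this (a sketch with the same round numbers assumed above):

Code:

#include <stdio.h>

int main(void) {
    double entries   = 1e9;   /* 16GB table at 16 bytes per entry */
    double store_pct = 0.05;  /* ~5% of nodes stored (the 20:1 qsearch ratio) */
    double nps       = 100e6;
    double nodes   = entries / store_pct; /* 2e10 nodes to fill the table once */
    double seconds = nodes / nps;         /* 200 seconds */
    printf("%.0f nodes, %.0f seconds\n", nodes, seconds);
    return 0;
}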

So before we continue, what does "highly oversubscribed" mean to you? Perhaps 16gb of hash with a 24 hour search might do the trick???

But whatever you do, the part of the search in the last 5 plies or so is the majority of the search, so a small table out there is NOT going to be helpful. Might as well just not store the last 5 plies and save the overhead. I can certainly measure what that will do to search performance...
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: Ok, Maybe I didn't make it clear in previous posts...

Post by Zenmastur »

bob wrote:
Zenmastur wrote:
bob wrote:
Zenmastur wrote:
bob wrote:Easy. One idea is a king triangulation where you do nothing but change the side to move. If this were no worry, no one would find it necessary to include side to move in the hash signature.
A transposition starts in one position and ends in another position. The path between the two positions is what changes. If there is an odd number of plies between the starting positions then it can't be a transposition because a different side is on the move. Likewise, if there is an odd number of plies between the ending positions then the ending positions can't be identical for the same reason. If you are allowing transpositions that don't start from the same position then its length has little meaning.

I think these ideas would be obvious even to the most casual observer.

So, I don't know what you're measuring but it doesn't seem to be transpositions.

In any case, I assume that these numbers were supposed to support your claim that it's
bob wrote:I can transpose near a tip, with a position near the root, just as easily as I can transpose between branches near the tip.
I don't think they do. In fact, they seem to support this idea:
Zenmastur wrote:I'm not sure you can transpose just as easily from root to tip. Long transpositions, while clearly possible, seem harder.
Regards,

Zen
What about these cases:

Kings start on e1 and e8 white to move

Ke2 Ke7 Kd1 Ke8 Ke1 black to move

It is TRIVIAL to change side to move without changing the position... As I said, I am not sure you have studied this particular aspect as much as needed. This kind of thing is COMMON in fine 70. Why do you think Kb2 draws while Kb1 wins? In one you reach a position where black is in zugzwang with black to move; in the other you reach the same position with white to move, putting white in zugzwang. A common theme in king and pawn endings.
Until a pawn is captured, that is the only type of transposition possible. So, I guess you could call it common. I have given it sufficient thought too.
I don't understand why you keep interjecting that pawn capture comment. If you have a path that is 20 moves long, you can play that single pawn capture anywhere in that path to transpose the order of moves yet reach the same final position. So captures, pushes or other moves have no effect whatsoever on transpositions.

Again, this is NOT about repetition detection where captures cut repetitions off stone cold at that point.
The pawn capture only relates to fine#70 not the general case. In fine#70 the only transpositions that can occur are king triangulations, UNTIL a pawn is captured. I'm not saying anything about transpositions in general. Your data is from fine#70 up to ply 26, so my comment is about transpositions in fine#70 up to ply 26.
bob wrote:What I am measuring is 100% certain to be transpositions. But you haven't thought this through yet... once you do, you will see that same position different side on move is easy, and from there you can do same position, same side to move, different number of moves between the transpositions. Either side can toss tempi at will.
bob wrote:I am not reporting starting move numbers. All I am reporting is the difference between the depth where the entry was stored, and the depth where that entry was successfully probed.

That has nothing to do with specific move numbers or side to move or anything other than draft(stored) - draft(hit) and that can't be negative or the entry would not have been used (draft too shallow).
Think about these statements for a minute. If you find a transposition, then the ending position MUST be the same or it's not a transposition. This means the same side was on the move. You are reporting an odd difference between the depths of the current line of play and the line of play that led to the position being placed in the transposition table. Therefore, one of the lines of play had to have white on the move and the other had to have black on the move. ALL transpositions must start from an identical position at some point in the past. Therefore the depth differences you're reporting can't be right! Why they're not right, I don't know, but in the end, I don't care, because I don't think it's central to this discussion!
bob wrote:As far as "highly oversubscribed" goes, how about let's establishing a precise definition. Any way you want, something of the flavor of total positions written (not overwritten) compared to table size?
I thought I already did that. The precise definition I'm using for “highly over-subscribed” is:

Unique writes to the TT >= n * ln(n) + n/2

Where “n” is the number of entries in the TT. Anything less than this number of unique writes but greater than “n” is “over-subscribed” and anything less than “n” is under-subscribed.
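In code, the threshold works out as follows. A quick sketch; the n * ln(n) term is the standard coupon-collector estimate for the number of random writes needed before every slot of an always-replace table has most likely been written at least once.

Code:

#include <math.h>
#include <stdio.h>

/* "highly over-subscribed" per the definition above:
   unique writes >= n*ln(n) + n/2, where n = entries in the TT */
double oversub_threshold(double n) {
    return n * log(n) + n / 2.0;
}

int main(void) {
    printf("16M entries: %.3g unique writes\n", oversub_threshold(16777216.0));
    printf(" 1B entries: %.3g unique writes\n", oversub_threshold(1e9));
    return 0; /* prints roughly 2.87e+08 and 2.12e+10 */
}

The 16M-entry figure matches the ~287 million number used further down the thread.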
bob wrote:Just for starters, here's a middle game position run from Crafty, giving total nodes searched, total hash stores done, total positions overwritten. The hash size was 16gb which is pretty typical nowadays.

searched=2,454,539,743 stores=112,548,860 overwrites=73,818,068

Searched 2.4 billion nodes, did 0.11B stores, and overwrote 0.073B existing entries with the same key but different info.

To even reach the theoretical 100% usage here would need on the order of 24 billion nodes searched. And that is hardly "highly oversubscribed".
According to the formula given this isn't a “highly over-subscribed” TT. In fact, assuming 16-byte TT entries for a total of 1B entries means the TT is under-subscribed. Crafty would need about 500B nodes to highly over-subscribe this size of TT. Stockfish would only need about 25B nodes. The difference is probing in the q-search. While SF may start degrading “earlier”, it does it in a much more graceful and gradual manner. When Crafty reaches the tipping point for TT effectiveness there is a dramatic change in tree size and time required to reach a given depth.
bob wrote:For this middle game position, it seems the qsearch nodes outnumber the normal nodes about 20:1, or that about 5% of the total nodes searched actually get stored in the trans/ref table, period. I do checks in the q-search and such, but NO hash probing anywhere out there.

Given 1 billion entries, that would mean a search of about 20B nodes is needed to even approximate "filling the table" once. On the hardware I typically use, I am hitting around 100M nodes per second. 200 seconds would reach that point, but not even approximating any sort of "highly oversubscribed" condition.

So before we continue, what does "highly oversubscribed" mean to you? Perhaps 16gb of hash with a 24 hour search might do the trick???
At 100Mnps, about 85 minutes. I did extensive tests with Crafty with small but highly over-subscribed TTs. When the number of nodes reported exceeds ~250 times the TT size, the tree begins to explode. It immediately doubles or triples the tree size and it gets worse from there. This corresponds to an over-subscription factor of about 12.5 to 1. It's hard to pin it down better than this because the node count reported doesn't correspond to the probe count, much less the number of unique writes.

In any case, the tree continued to grow, as compared to a control with a much larger TT. Before it became too tedious to continue, the difference exceeded a factor of 37 to 1. At that point I stopped testing the smallest TT because it was taking too much time and I had other tests I wanted to run. My original intent was to run this test until ply 45 had been reached. But it was obvious that Crafty wasn't going to get to that depth within a reasonable period of time, so I stopped it at ply 28 for the small TT and ply 33 for the large and intermediate size TTs. I did the same for SF, but the small TT got to 43 plies, the intermediate to 44, and the large one to 50 plies.

bob wrote:But whatever you do, the part of the search in the last 5 plies or so is the majority of the search, so a small table out there is NOT going to be helpful. Might as well just not store the last 5 plies and save the overhead. I can certainly measure what that will do to search performance...
Well... a couple of thoughts.

5 plies is too many. You keep mentioning this number. I've determined there is little point to splitting it at 5 plies. With a 16GB L2 TT, directing 5 plies at the L1 TT would support search sizes in excess of 75 trillion unique nodes. That seems a little excessive for most uses, although I must admit that I just completed a 30-trillion node search of a position, so it's possible that this could be useful.

I also thought about not storing them at all, but I think that would be very bad for time to depth. You seem to think that whatever is stored in the L1 TT will be completely worthless. I don't!

I think it will catch a large number of transpositions. Most transpositions are short. Three- and four-ply transpositions are the most plentiful. I plan on catching most of these, even if they cross the L1-L2 TT boundary. But these aren't the most important ones to catch, as I have already pointed out.

By far the most important ones occur entirely within the L2 TT boundary. I plan on finding virtually ALL of these. A highly over-subscribed TT will miss huge numbers of these, as witnessed by the fact that Crafty's tree blows up by a factor of 10 or more when it runs into this condition.

One last comment: I suspect that if you put a "local" 4K- or 8K-entry TT in Crafty for use in the q-search only, you would see a significant drop in time to depth in general. You said that 95% of the nodes in a middle game position come from the q-search. So why not make a small TT to catch the transpositions there?

I looked at doing this for Crafty and concluded that it won't solve the over-subscribed TT problem. But Crafty doesn't seem to suffer this fate until later than most engines, due to a lack of probing in the q-search. And as fast as Crafty is in raw NPS, it lags behind other engines in time to depth because it doesn't probe in the q-search. So it would seem that the solution to this problem would be to place a small local TT to catch all the temporally local transpositions. This should improve time to depth by quite a bit, I would think.

Everyone seems to overlook the fact that small TTs work. The reason they work is that most transpositions are temporally local (i.e. a position is more likely to be seen again within a small number of nodes than after a very large number of nodes have passed). If transpositions weren't highly temporally local, small TTs would be worthless.
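A minimal sketch of the sort of table I have in mind (hypothetical names; __thread is the GCC-style thread-local qualifier): a tiny, direct-mapped, always-replace table per thread, probed and stored only inside the quiescence search.

Code:

#include <stdint.h>

#define QTT_SIZE 8192 /* power of two, one table per thread */

typedef struct {
    uint64_t key;
    int16_t  score;
    uint8_t  bound; /* EXACT / LOWER / UPPER */
} QEntry;

static __thread QEntry qtt[QTT_SIZE]; /* thread-local, so no locking needed */

int qtt_probe(uint64_t key, int *score, int *bound) {
    QEntry *e = &qtt[key & (QTT_SIZE - 1)];
    if (e->key != key)
        return 0;
    *score = e->score;
    *bound = e->bound;
    return 1;
}

void qtt_store(uint64_t key, int score, int bound) {
    QEntry *e = &qtt[key & (QTT_SIZE - 1)]; /* always-replace policy */
    e->key   = key;
    e->score = (int16_t)score;
    e->bound = (uint8_t)bound;
}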

Regards,

Zen
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Ok, Maybe I didn't make it clear in previous posts...

Post by bob »

Zenmastur wrote:
bob wrote:
Zenmastur wrote:
bob wrote:
Zenmastur wrote:
bob wrote:Easy. One idea is a king triangulation where you do nothing but change the side to move. If this were no worry, no one would find it necessary to include side to move in the hash signature.
A transposition starts in one position and ends in another position. The path between the two positions is what changes. If there is an odd number of plies between the starting positions then it can't be a transposition because a different side is on the move. Likewise, if there is an odd number of plies between the ending positions then the ending positions can't be identical for the same reason. If you are allowing transpositions that don't start from the same position then its length has little meaning.

I think these ideas would be obvious even to the most casual observer.

So, I don't know what you're measuring but it doesn't seem to be transpositions.

In any case, I assume that these numbers were supposed to support your claim that it's
bob wrote:I can transpose near a tip, with a position near the root, just as easily as I can transpose between branches near the tip.
I don't think they do. In fact, they seem to support this idea:
Zenmastur wrote:I'm not sure you can transpose just as easily from root to tip. Long transpositions, while clearly possible, seem harder.
Regards,

Zen
What about these cases:

Kings start on e1 and e8 white to move

Ke2 Ke7 Kd1 Ke8 Ke1 black to move

It is TRIVIAL to change side to move without changing the position... As I said, I am not sure you have studied this particular aspect as much as needed. This kind of thing is COMMON in fine 70. Why do you think Kb2 draws while Kb1 wins? In one you reach a position where black is in zugzwang with black to move; in the other you reach the same position with white to move, putting white in zugzwang. A common theme in king and pawn endings.
Until a pawn is captured, that is the only type of transposition possible. So, I guess you could call it common. I have given it sufficient thought too.
I don't understand why you keep interjecting that pawn capture comment. If you have a path that is 20 moves long, you can play that single pawn capture anywhere in that path to transpose the order of moves yet reach the same final position. So captures, pushes or other moves have no effect whatsoever on transpositions.

Again, this is NOT about repetition detection where captures cut repetitions off stone cold at that point.
The pawn capture only relates to fine#70 not the general case. In fine#70 the only transpositions that can occur are king triangulations, UNTIL a pawn is captured. I'm not saying anything about transpositions in general. Your data is from fine#70 up to ply 26, so my comment is about transpositions in fine#70 up to ply 26.
bob wrote:What I am measuring is 100% certain to be transpositions. But you haven't thought this through yet... once you do, you will see that same position different side on move is easy, and from there you can do same position, same side to move, different number of moves between the transpositions. Either side can toss tempi at will.
bob wrote:I am not reporting starting move numbers. All I am reporting is the difference between the depth where the entry was stored, and the depth where that entry was successfully probed.

That has nothing to do with specific move numbers or side to move or anything other than draft(stored) - draft(hit) and that can't be negative or the entry would not have been used (draft too shallow).
Think about these statements for a minute. If you find a transposition, then the ending position MUST be the same or it's not a transposition. This means the same side was on the move. You are reporting an odd difference between the depths of the current line of play and the line of play that led to the position being placed in the transposition table. Therefore, one of the lines of play had to have white on the move and the other had to have black on the move. ALL transpositions must start from an identical position at some point in the past. Therefore the depth differences you're reporting can't be right! Why they're not right, I don't know, but in the end, I don't care, because I don't think it's central to this discussion!
That is simply incorrect. The difference in draft between two instances where a position was stored and where it was retrieved has NOTHING to do with side to move, which must always match. But a single-ply reduction before either the store or the retrieval will change the distance by exactly 1 in terms of depth remaining to the end of the search, NOT the distance between the two nodes where the table is accessed.

When I store a position P at ply=10, wtm=1, draft=20 (we are doing a 30 ply search at the point where P is stored), I can then later match P with a remaining depth of 20, 19, 18, 17, ... giving differences of 0, 1, 2, ...



bob wrote:As far as "highly oversubscribed" goes, how about let's establishing a precise definition. Any way you want, something of the flavor of total positions written (not overwritten) compared to table size?
I thought I already did that. The precise definition I'm using for “highly over-subscribed” is:

Unique writes to the TT >= n * ln(n) + n/2

Where “n” is the number of entries in the TT. Anything less than this number of unique writes but greater than “n” is “over-subscribed” and anything less than “n” is under-subscribed.
bob wrote:Just for starters, here's a middle game position run from Crafty, giving total nodes searched, total hash stores done, total positions overwritten. The hash size was 16gb which is pretty typical nowadays.

searched=2,454,539,743 stores=112,548,860 overwrites=73,818,068

Searched 2.4 billion nodes, did 0.11B stores, and overwrote 0.073B existing entries with the same key but different info.

To even reach the theoretical 100% usage here would need on the order of 24 billion nodes searched. And that is hardly "highly oversubscribed".
According to the formula given this isn't a “highly over-subscribed” TT. In fact, assuming 16-byte TT entries for a total of 1B entries means the TT is under-subscribed. Crafty would need about 500B nodes to highly over-subscribe this size of TT. Stockfish would only need about 25B nodes. The difference is probing in the q-search. While SF may start degrading “earlier”, it does it in a much more graceful and gradual manner. When Crafty reaches the tipping point for TT effectiveness there is a dramatic change in tree size and time required to reach a given depth.
bob wrote:For this middle game position, it seems the qsearch nodes outnumber the normal nodes about 20:1, or that about 5% of the total nodes searched actually get stored in the trans/ref table, period. I do checks in the q-search and such, but NO hash probing anywhere out there.

Given 1 billion entries, that would mean a search of about 20B nodes is needed to even approximate "filling the table" once. On the hardware I typically use, I am hitting around 100M nodes per second. 200 seconds would reach that point, but not even approximating any sort of "highly oversubscribed" condition.

So before we continue, what does "highly oversubscribed" mean to you? Perhaps 16gb of hash with a 24 hour search might do the trick???
At 100Mnps, about 85 minutes. I did extensive tests with Crafty with small but highly over-subscribed TTs. When the number of nodes reported exceeds ~250 times the TT size, the tree begins to explode. It immediately doubles or triples the tree size and it gets worse from there. This corresponds to an over-subscription factor of about 12.5 to 1. It's hard to pin it down better than this because the node count reported doesn't correspond to the probe count, much less the number of unique writes.

In any case, the tree continued to grow, as compared to a control with a much larger TT. Before it became too tedious to continue, the difference exceeded a factor of 37 to 1. At that point I stopped testing the smallest TT because it was taking too much time and I had other tests I wanted to run. My original intent was to run this test until ply 45 had been reached. But it was obvious that Crafty wasn't going to get to that depth within a reasonable period of time, so I stopped it at ply 28 for the small TT and ply 33 for the large and intermediate size TTs. I did the same for SF, but the small TT got to 43 plies, the intermediate to 44, and the large one to 50 plies.

bob wrote:But whatever you do, the part of the search in the last 5 plies or so is the majority of the search, so a small table out there is NOT going to be helpful. Might as well just not store the last 5 plies and save the overhead. I can certainly measure what that will do to search performance...
Well... a couple of thoughts.

5 plies is too many. You keep mentioning this number. I've determined there is little point to splitting it at 5 plies. With a 16GB L2 TT, directing 5 plies at the L1 TT would support search sizes in excess of 75 trillion unique nodes. That seems a little excessive for most uses, although I must admit that I just completed a 30-trillion node search of a position, so it's possible that this could be useful.

I also thought about not storing them at all, but I think that would be very bad for time to depth. You seem to think that whatever is stored in the L1 TT will be completely worthless. I don't!

I think it will catch a large number of transpositions. Most transpositions are short. Three- and four-ply transpositions are the most plentiful. I plan on catching most of these, even if they cross the L1-L2 TT boundary. But these aren't the most important ones to catch, as I have already pointed out.

By far the most important ones occur entirely within the L2 TT boundary. I plan on finding virtually ALL of these. A highly over-subscribed TT will miss huge numbers of these, as witnessed by the fact that Crafty's tree blows up by a factor of 10 or more when it runs into this condition.

One last comment: I suspect that if you put a "local" 4K- or 8K-entry TT in Crafty for use in the q-search only, you would see a significant drop in time to depth in general. You said that 95% of the nodes in a middle game position come from the q-search. So why not make a small TT to catch the transpositions there?

I looked at doing this for Crafty and concluded that it won't solve the over-subscribed TT problem. But Crafty doesn't seem to suffer this fate until later than most engines, due to a lack of probing in the q-search. And as fast as Crafty is in raw NPS, it lags behind other engines in time to depth because it doesn't probe in the q-search. So it would seem that the solution to this problem would be to place a small local TT to catch all the temporally local transpositions. This should improve time to depth by quite a bit, I would think.

Everyone seems to overlook the fact that small TTs work. The reason they work is that most transpositions are temporally local (i.e. a position is more likely to be seen again within a small number of nodes than after a very large number of nodes have passed). If transpositions weren't highly temporally local, small TTs would be worthless.

Regards,

Zen
I do not see any examples of your "dramatic tipping point" however. I only ran one test, but let fine70 run for an hour at 100M nodes per second with a 1GB hash table, and it loped along at about the same branching factor forever. However, it does find a forced mate in 30-something, which might or might not help.

Yes it will degrade over time, but a decent replacement policy makes that degradation pretty gradual in testing I have done. And I am not interested in wrecking the performance of normal searches where the table is not even fully utilized, to try to fix a problem where the table is simply way too small. Too small is too small, period, and no matter what one does, the table ends up too small and the search is not as efficient as it could be.

While small TTs work, and since most do an "always store" replacement to address this anyway, I don't see the benefit of an artificial approach that tries to preserve the part of the table you say is not so useful (deep draft entries). Yet we know they ARE worthwhile, otherwise a small table would work forever in fine70, regardless of the depth.

I NEVER like to "build a wall" anywhere, as it always ends up located in the wrong place. Building a wall between these two TTs is exactly the idea I don't like. You can only get it in exactly the right place with luck, for exactly the right position. For the rest it will be sub-optimal and for most it will be worse than just one large table.
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: Ok, Maybe I didn't make it clear in previous posts...

Post by Zenmastur »

bob wrote:
Zenmastur wrote:Think about these statements for a minute. If you find a transposition, then the ending position MUST be the same or it's not a transposition. This means the same side was on the move. You are reporting an odd difference between the depths of the current line of play and the line of play that led to the position being placed in the transposition table. Therefore, one of the lines of play had to have white on the move and the other had to have black on the move. ALL transpositions must start from an identical position at some point in the past. Therefore the depth differences you're reporting can't be right! Why they're not right, I don't know, but in the end, I don't care, because I don't think it's central to this discussion!
That is simply incorrect. The difference in draft between two instances where a position was stored and where it was retrieved has NOTHING to do with side to move, which must always match. But a single-ply reduction before either the store or the retrieval will change the distance by exactly 1 in terms of depth remaining to the end of the search, NOT the distance between the two nodes where the table is accessed.

When I store a position P at ply=10, wtm=1, draft=20 (we are doing a 30 ply search at the point where P is stored), I can then later match P with a remaining depth of 20, 19, 18, 17, ... giving differences of 0, 1, 2, ...
The two MOST important pieces of data are the number of transpositions and their depth. You can't discern either of these pieces of information from your data. So tell me, what good is this data? It is strong evidence that your claim about transpositions is wrong. Other than that, I see no use for the data.
bob wrote:But whatever you do, the part of the search in the last 5 plies or so is the majority of the search, so a small table out there is NOT going to be helpful. Might as well just not store the last 5 plies and save the overhead. I can certainly measure what that will do to search performance...
Well... a couple of thoughts.

5 plies is too many. You keep mentioning this number. I've determined there is little point to splitting it at 5 plies. With a 16GB L2 TT, directing 5 plies at the L1 TT would support search sizes in excess of 75 trillion unique nodes. That seems a little excessive for most uses, although I must admit that I just completed a 30-trillion node search of a position, so it's possible that this could be useful.

I also thought about not storing them at all, but I think that would be very bad for time to depth. You seem to think that whatever is stored in the L1 TT will be completely worthless. I don't!

I think it will catch a large number of transpositions. Most transpositions are short. Three- and four-ply transpositions are the most plentiful. I plan on catching most of these, even if they cross the L1-L2 TT boundary. But these aren't the most important ones to catch, as I have already pointed out.

By far the most important ones occur entirely within the L2 TT boundary. I plan on finding virtually ALL of these. A highly over-subscribed TT will miss huge numbers of these, as witnessed by the fact that Crafty's tree blows up by a factor of 10 or more when it runs into this condition.

One last comment: I suspect that if you put a "local" 4K- or 8K-entry TT in Crafty for use in the q-search only, you would see a significant drop in time to depth in general. You said that 95% of the nodes in a middle game position come from the q-search. So why not make a small TT to catch the transpositions there?

I looked at doing this for Crafty and concluded that it won't solve the over-subscribed TT problem. But Crafty doesn't seem to suffer this fate until later than most engines, due to a lack of probing in the q-search. And as fast as Crafty is in raw NPS, it lags behind other engines in time to depth because it doesn't probe in the q-search. So it would seem that the solution to this problem would be to place a small local TT to catch all the temporally local transpositions. This should improve time to depth by quite a bit, I would think.

Everyone seems to overlook the fact that small TTs work. The reason they work is that most transpositions are temporally local (i.e. a position is more likely to be seen again within a small number of nodes than after a very large number of nodes have passed). If transpositions weren't highly temporally local, small TTs would be worthless.

Regards,

Zen
bob wrote:I do not see any examples of your "dramatic tipping point" however. I only ran one test, but let fine70 run for an hour at 100M nodes per second with a 1gb hash table and it loped along at about the same branching factor forever. However, it does find a forced mate in 30-something which might or might not help.
You didn't find any evidence of a dramatic tipping point because you're trying very hard not to find it.

[Image: graph of Crafty's effective branching factor rising once the TT becomes over-subscribed]

Can you see it now?

Crafty's EBF goes from 1.70 to 1.92 because the TT is over-subscribed. It doesn't get any more obvious than this.

This isn't from Fine#70 because Fine#70 is absolutely worthless for testing this!

First, it's a mate in 32. Until a pawn is captured, which should happen at ply 27, there are only around 5,200 legal positions. After a pawn is captured the number of legal positions goes up, but most positions will be cut from the tree. White will see big gains from pushing the f-pawn and capturing the d-pawn. Black's king will try to stop the pawn pushes, which constrains where it can go on the board. So, very few of the “possible” positions will actually be seen in the search tree.
[pgn]
[Event "Fine #70"]
[Site "?"]
[Date "1901.06.20"]
[Round "?"]
[White "Lasker, Emanuel"]
[Black "Reichhelm, Gustavus Charles"]
[Result "1-0"]
[WhiteElo "2600"]
[BlackElo "2400"]
[FEN "8/k7/3p4/p2P1p2/P2P1P2/8/8/K7 w - - 0 1"]

{[Stockfish 210616 64 BMI2] 104:M32}
1. Kb1 Ka8 2. Kb2 Ka7 3. Kb3 Ka6 4. Kc2 Kb7 5. Kc3 Kb6 6. Kd2 Kc7 7. Kd3
Kb6 8. Ke2 Kc7 9. Kf3 Kd7 10. Kg3 Ke7 11. Kh4 Kf6 12. Kh5 Kf7 13. Kg5 Kg7
14. Kxf5 Kf7 15. Kg5 Kg7 16. f5 Kf7 17. f6 Kf8 18. Kf4 Ke8 19. Kg4 Kf8 20.
Kg5 Kf7 21. Kf5 Kf8 22. Ke6 Ke8 23. Kxd6 Kf7 24. Ke5 Kf8 25. Ke6 Kg8 26.
d6 Kh7 27. d7 Kg6 28. f7 Kh5 29. d8=Q Kg4 30. f8=Q Kh3 31. Qg5 Kh2 32.
Qfh6# 1-0 [/pgn]

The critical positions are:

[d]8/6k1/3p4/p2P1pK1/P2P1P2/8/8/8 w - - 26 14

The first pawn is lost. Mate in 19.

[d]4k3/8/3pKP2/p2P4/P2P4/8/8/8 w - - 11 23

The second pawn is lost. Mate in 10.

[d]8/3P1P2/4K3/p6k/P2P4/8/8/8 w - - 1 29

Both pawns are going to queen. Mate in 4.

The sub-goals don't allow either side much wiggle room. So, with a well designed search you are going to see only a very small fraction of the “possible” legal positions. You pointed out that Crafty could solve this problem with a 4K-entry TT, so it shouldn't be a big surprise that a 1B-entry TT doesn't show a tree explosion regardless of how long the search is allowed to run.

I did my tests on this position.

[d]6k1/2p2rp1/p6p/P2b3N/5P2/7P/2p4K/2B1R3 b - - 0 41

This position is probably a forced mate with white winning, just like fine#70. The difference is that the mate is far enough away that it will only have an indirect effect on the search. There are enough pieces on the board that a reasonably sized TT can be oversubscribed. Of course, the larger the TT, the more time it takes to reach the oversubscribed condition and the longer the tests will take. I used TTs with 4,096, 2 million, and 2 billion entries. I did several tests for each TT size and then averaged them. A different graph of the same data:

[Image: graph of the per-iteration node counts tabulated below]

The data:

Code:

         K nodes           K nodes           K nodes        K nodes
Ply     Crafty4K          Crafty2M          Crafty2B           SF3B
 4             2                 2                 2                                                                                       
 5             4                 4                 4                                                             
 6            12                12                12                                                                 
 7            22                20                22
 8            58                59                60                                                                
 9           185               200               168                                                               
10           250               248               284                                                               
11           398               546               553                                                                 
12           767               777               846                                                                   
13         2,131             1,278             1,269             563
14         4,456             3,343             2,088             977                                               
15        10,590             4,706             5,409           1,421                                                    
16        19,891            12,072            11,575           2,037                                                   
17        41,448            19,884            16,395           4,125                                                 
18        94,572            32,958            25,393           4,251                                                          
19       210,793            52,902            43,671           4,960                                                        
20       241,436            78,152            68,572           7,690                                                    
21     1,017,325           114,352            98,058          10,215                                                       
22     2,016,137           226,976           217,419          15,029                                                      
23     2,602,419           444,451           391,438          17,609                                                           
24     4,153,717           808,651           509,763          33,241                                                       
25     8,938,096         1,336,098           692,189          36,652                                                      
26    19,471,877         1,899,863         1,043,753          71,646                                                          
27    41,222,135         3,892,530         1,621,566         133,395                                                          
28    78,738,034         7,918,450         2,669,449         177,470                                                             
Regards,

Zen
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Ok, Maybe I didn't make it clear in previous posts...

Post by bob »

I hardly call 1.72 to 1.9 a "major tipping point."

But what you are NOT showing is that your idea improves on this. It makes no sense to say "once the table is seriously oversubscribed you have a problem" but then not show how your idea supposedly solves that.

Until some real data is shown.

As a worst case, I suppose I could do a second ttable as you suggest. You simply tell me (a) size of main table; (b) size of small table; (c) where to switch from accessing main to accessing the small table. But I don't want to spend days working on this... If you are convinced, give me precise specifics and I can easily confirm or disprove this...
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: Ok, Maybe I didn't make it clear in previous posts...

Post by Zenmastur »

bob wrote:I hardly call 1.72 to 1.9 a "major tipping point."

But what you are NOT showing is that your idea improves on this. It makes no sense to say "once the table is seriously oversubscribed you have a problem" but then not show how your idea supposedly solves that.

Until some real data is shown.

As a worst case, I suppose I could do a second ttable as you suggest. You simply tell me (a) size of main table; (b) size of small table; (c) where to switch from accessing main to accessing the small table. But I don't want to spend days working on this... If you are convinced, give me precise specifics and I can easily confirm or disprove this...
I'm still doing some testing with large TT's. These lead to long search times so it will be a couple of days. I don't want to commit to a spec until I'm done with these tests.

Regards,

Forrest
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: Ok, Maybe I didn't make it clear in previous posts...

Post by Zenmastur »

bob wrote:I hardly call 1.72 to 1.9 a "major tipping point."

But what you are NOT showing is that your idea improves on this. It makes no sense to say "once the table is seriously oversubscribed you have a problem" but then not show how your idea supposedly solves that.

Until some real data is shown.

As a worst case, I suppose I could do a second ttable as you suggest. You simply tell me (a) size of main table; (b) size of small table; (c) where to switch from accessing main to accessing the small table. But I don't want to spend days working on this... If you are convinced, give me precise specifics and I can easily confirm or disprove this...
When I originally designed this I had certain specific scenarios in mind. One was Stockfish running at 15Mnps for a minimum search of 8 hours and a maximum search of about a week. An 8-hour search at 15Mnps is about 432 billion nodes, all of which will be written to the TT. Stockfish with a 32GB TT will have 3 billion entries. To highly oversubscribe a 3 billion entry TT requires 72 billion unique nodes. So in a 432 billion node search, more than 83% of the nodes will be searched with the TT in a highly over-subscribed condition. So I think 83% of the nodes from any test should go into a highly oversubscribed TT as well (i.e. 5 out of 6 nodes).
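A back-of-the-envelope check of those numbers, using the formula from earlier (the threshold comes out near 67-72 billion depending on rounding, so roughly 83% of the search runs against a highly over-subscribed table):

Code:

#include <math.h>
#include <stdio.h>

int main(void) {
    double nps = 15e6, seconds = 8 * 3600.0;
    double nodes = nps * seconds;            /* 4.32e11 TT writes */
    double n = 3e9;                          /* TT entries */
    double threshold = n * log(n) + n / 2.0; /* ~6.7e10 unique writes */
    printf("oversubscribed fraction: %.0f%%\n",
           100.0 * (nodes - threshold) / nodes); /* roughly 84 percent */
    return 0;
}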

Crafty does things a little differently than Stockfish, and I wonder what effect this will have on the proposed solution. When I tested short searches using Stockfish I never saw large numbers of nodes in the first few plies. Not so with Crafty; I've seen 1-ply searches exceed 20K nodes. I assume the difference is probing in the q-search. I'm not sure how much of a problem this could be. One thing is for sure: Crafty won't reach the same depth of search as Stockfish running on the same hardware in equal-time searches.

I was counting on a search depth of 45 plies or more. Directing 4 plies into an L1 TT on a 45-ply search is about 9% of total search depth. Crafty won't reach these depths without running extremely long searches. So to keep approximately the same percentage of the search depth directed to the L1 TT, ONLY 3 plies should be directed to the L1 TT in a 33- or 34-ply search. This should also compensate for the fact that Crafty doesn't use its TT as heavily, because it doesn't probe in the q-search. I'm afraid that if 4 plies are directed to the L1 TT, then with abbreviated searches the L2 TT may be less than 1% utilized at the end of the search. This isn't a very efficient use of the L2 TT. Optimally it should be in the 10% to 50% range.

I'm not sure about aging in the L1 TT. The entries do need some sort of aging, and it can't be based on iteration number as this would be way too coarse. It should probably be based on the number of times the local thread has entered the last 4 plies of a search.

At least one node should be written back to the L2 TT from the L1 TT every time the search bridges the gap between the L1 and L2 TTs. This should be a node with N-1 plies remaining, where N is the number of plies directed to the L1 TT (2 in this particular case).
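As code, here is one way to read those two rules (entirely hypothetical names; N is the number of plies directed to the L1 TT): age by a per-thread counter bumped each time the search enters the L1 zone, and write the draft N-1 entry back into the shared L2 table when crossing the boundary.

Code:

#include <stdint.h>

#define L1_PLIES 3 /* plies directed to the L1 TT */
#define L1_SIZE  8192

typedef struct {
    uint64_t key;
    int8_t   draft;
    int16_t  score;
    uint32_t epoch; /* aging: count of L1-zone entries by this thread */
} L1Entry;

static __thread L1Entry  l1[L1_SIZE]; /* one table per thread */
static __thread uint32_t l1_epoch;

extern void l2_store(uint64_t key, int draft, int score); /* the normal TT */

/* call when depth remaining first drops to L1_PLIES: one aging tick */
void l1_enter_zone(void) {
    l1_epoch++;
}

void l1_store(uint64_t key, int draft, int score) {
    L1Entry *e = &l1[key & (L1_SIZE - 1)];
    /* entries from an older zone visit are always replaceable;
       within the current visit, prefer the deeper draft */
    if (e->epoch != l1_epoch || draft >= e->draft) {
        e->key   = key;
        e->draft = (int8_t)draft;
        e->score = (int16_t)score;
        e->epoch = l1_epoch;
    }
    if (draft == L1_PLIES - 1)
        l2_store(key, draft, score); /* write back across the gap */
}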

I don't know how detailed you need the specifications, so if you need more specifics than what follows, please ask.

L1 TTs: 8,192 entries each, one per thread. Assuming 20 threads, the total size of all L1 TTs is 163,840 entries.

The L2 TT, the “normal” TT, is 16,777,216 entries. So the total L1 size is approximately 1% of the L2 TT size. This size is small enough that testing times shouldn't be too long. If you don't mind longer testing times, the L2 TT can be made as large as you wish.

I assume you don't want the added programming changes to get 8 entries per bucket, so I'm going to assume 16-byte entries, 4 per bucket. This isn't optimal but it should still work, and it saves additional coding changes.
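For illustration only, one possible 16-byte entry layout with 4 entries per 64-byte bucket, so a bucket fills exactly one cache line:

Code:

#include <stdint.h>

typedef struct {
    uint64_t key;       /* full signature */
    uint32_t best_move; /* packed from/to/promotion */
    int16_t  score;
    uint8_t  draft;
    uint8_t  bound_age; /* 2 bits bound type, 6 bits age */
} TTEntry;              /* 8+4+2+1+1 = 16 bytes, no padding */

typedef struct {
    TTEntry entry[4];   /* 4 x 16 = 64 bytes = one cache line */
} TTBucket;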

A 16 million entry TT is highly over-subscribed at about 287 million unique nodes. We want approximately 5 out of 6 nodes entering a highly oversubscribed TT, so 6 times 287 million means we need a search length that will produce approximately 1,722 million unique nodes. The “normal” node count that Crafty puts out can't be used to determine when this number has been reached because it counts nodes that aren't probed. And of course, if you decide on a large L2 TT then this number will need to be recalculated.

It's best to modify Crafty so it doesn't count q-search nodes, so that the minimum number of plies needed to reach the given node count can be determined in advance of any test. This pre-test should be done on each position to be tested, with the largest TT size possible, so as to avoid recording “false” unique nodes due to overwriting and then counting the same node again in the next iteration, etc.

The only significant thing I found from large TTs is that they are better at finding transpositions near the leaf nodes than I had originally thought. But this wasn't totally unexpected, and they do so at the expense of finding transpositions closer to the root.

Regards,

Forrest