Storing Upper & Lower Bound in TT?

hgm · Post by **hgm** » Tue Feb 21, 2023 11:05 am

syzygy wrote: ↑Mon Feb 20, 2023 4:03 amBound-flipping could also be prevented by not storing lower bounds (or upper bounds) at all in such endgame positions.

True, but what is the price? It would mean you would never get a hash cut in an all-node, and would have to always search them. Of course you would then get the hash cutoffs in the daughter nodes, if that part of the tree was in the TT. But, being an all node, you would have to do some 30-40 TT probes to get the hash cutoff in each daughter, where a single TT probe would have sufficed to cause the cutoff in the all node itself if the lower-bound had been available. This sounds costly.

While the alternative, storing dual bounds, is virtually free. It just requires an extra score and depth in the TT entry, while the bound-type flags could be eliminated. So your entry gets a bit larger. Given that search speed typically scales like the 12th root of the TT size, the resuting slowdown would be almost negligible.

syzygy · Post by **syzygy** » Tue Feb 21, 2023 9:38 pm

hgm wrote: ↑Tue Feb 21, 2023 11:05 am
syzygy wrote: ↑Mon Feb 20, 2023 4:03 amBound-flipping could also be prevented by not storing lower bounds (or upper bounds) at all in such endgame positions.
True, but what is the price? It would mean you would never get a hash cut in an all-node, and would have to always search them. Of course you would then get the hash cutoffs in the daughter nodes, if that part of the tree was in the TT. But, being an all node, you would have to do some 30-40 TT probes to get the hash cutoff in each daughter, where a single TT probe would have sufficed to cause the cutoff in the all node itself if the lower-bound had been available. This sounds costly.

I agree it is almost certainly too costly in general, but it could be limited to positions where it makes sense.

While the alternative, storing dual bounds, is virtually free. It just requires an extra score and depth in the TT entry, while the bound-type flags could be eliminated. So your entry gets a bit larger.

Bigger TT entries and thus fewer entries in total is quite a big cost.

Given that search speed typically scales like the 12th root of the TT size, the resuting slowdown would be almost negligible.

It would have to be tested. If it only helps in special positions, which I believe is the case, it will be a net loss.

Pio · Post by **Pio** » Tue Feb 21, 2023 11:46 pm

hgm wrote: ↑Tue Feb 21, 2023 11:05 am
syzygy wrote: ↑Mon Feb 20, 2023 4:03 amBound-flipping could also be prevented by not storing lower bounds (or upper bounds) at all in such endgame positions.
True, but what is the price? It would mean you would never get a hash cut in an all-node, and would have to always search them. Of course you would then get the hash cutoffs in the daughter nodes, if that part of the tree was in the TT. But, being an all node, you would have to do some 30-40 TT probes to get the hash cutoff in each daughter, where a single TT probe would have sufficed to cause the cutoff in the all node itself if the lower-bound had been available. This sounds costly.

While the alternative, storing dual bounds, is virtually free. It just requires an extra score and depth in the TT entry, while the bound-type flags could be eliminated. So your entry gets a bit larger. Given that search speed typically scales like the 12th root of the TT size, the resuting slowdown would be almost negligible.

Another idea is to include if upper or lower bound in the hash calculation and store the different types as different entries. In that way you don’t waste so much space. It will use both bounds when needed and will not waste space if only one type of bound has been determined.

hgm · Post by **hgm** » Wed Feb 22, 2023 8:53 am

syzygy wrote: ↑Tue Feb 21, 2023 9:38 pmBigger TT entries and thus fewer entries in total is quite a big cost.

...

It would have to be tested. If it only helps in special positions, which I believe is the case, it will be a net loss.

Not really, as I already mentioned. If the entry size goes from 10 to 12 bytes you would have 20% fewer entries, which would translate to 20/12 = 1.7% slowdown, which again would translate to 1.7 Elo if you assume every speed doubling is worth 70 Elo. And 1.7 Elo would translate to a result change of 0.24%. So even if this was an absolutely useless feature, it would only significantly alter the result of a match the engine engages in if that match had more than 28k games.

I would not consider that 'a big cost'.

I also strongly doubt that you could ever test for such a small Elo difference. It seems to me that the error bars due to selection of the opponents you pick in your test will always be far larger.

Rein Halbersma · Post by **Rein Halbersma** » Wed Feb 22, 2023 1:10 pm

syzygy wrote: ↑Sun Feb 19, 2023 6:46 pm
Steve Maughan wrote: ↑Sun Feb 19, 2023 5:51 pmIt's possible that, in the vast majority of cases, only one bound is ever relevant so the other bound is simply taking up space.
I think this is correct.

Intuitively, I would say that, for almost all nodes, the reason a node is in the small fraction of nodes visited by alpha-beta is either to be a cut-node (to refute its parent) or to be an all-node (I guess to help the parent refute its parent). I suspect only very few nodes ever play both roles in the same search (and that should be nodes "close" to the PV). However, this intuition may be misguided and in any event cannot replace real testing.

I seem to remember reading that MTD(f) engines benefit from both bounds being stored. I doubt that storing both bounds is worth it even for MTD(f) engines, but it means there might be a few more engines than Fruit that do it.

MTD(f) inventor Aske Plaat's explanation and pseudo-code uses two bounds: https://askeplaat.wordpress.com/534-2/mtdf-algorithm/

Rein Halbersma · Post by **Rein Halbersma** » Wed Feb 22, 2023 1:12 pm

syzygy wrote: ↑Sun Feb 19, 2023 7:14 pm I found this old thread: Hash tables: one bound or two bounds?

Two bounds probably help in endgame positions with very few pieces (or rather very few total positions and therefore all kinds of deep transpositions).

The threads also mentions that two bounds are required for MTD(f) to work because MTD(f) continuously switches between proving an upper bound for the root position and proving a lower bound for the root position.
This argument does not convince me. The minimal tree that proves the upper bound for a position has almost no overlap with the minimal tree that proves the lower bound for the same position.

Maybe two bounds are useful for nodes at distance 1 from the PV (I would have to think about this and draw some trees on paper).
But even if this is the case, the necessary information for these nodes will normally be found at distance 2 from the PV.

(MTD(f) does not set out to calculate a PV, but when the upper bound and lower bound for the root node meet, they meet on a sequence of moves that forms the PV.)

Fabien Letouzey once posted how to extract a PV using MTD(f) on CCC (> 20 years ago by now): https://www.stmintz.com/ccc/index.php?id=251543

syzygy · Post by **syzygy** » Fri Feb 24, 2023 9:15 pm

Pio wrote: ↑Tue Feb 21, 2023 11:46 pm Another idea is to include if upper or lower bound in the hash calculation and store the different types as different entries. In that way you don’t waste so much space. It will use both bounds when needed and will not waste space if only one type of bound has been determined.

This is not a bad idea at all.
To avoid expensive extra cache misses, I would place the two entries in the same bucket/cache line.

Pio · Post by **Pio** » Fri Feb 24, 2023 9:21 pm

syzygy wrote: ↑Fri Feb 24, 2023 9:15 pm
Pio wrote: ↑Tue Feb 21, 2023 11:46 pm Another idea is to include if upper or lower bound in the hash calculation and store the different types as different entries. In that way you don’t waste so much space. It will use both bounds when needed and will not waste space if only one type of bound has been determined.
This is not a bad idea at all.
To avoid expensive extra cache misses, I would place the two entries in the same bucket/cache line.

Yes, that was what I had in mind as well. You just need to make sure it wraps around at the end

syzygy · Post by **syzygy** » Fri Feb 24, 2023 9:36 pm

hgm wrote: ↑Wed Feb 22, 2023 8:53 am I also strongly doubt that you could ever test for such a small Elo difference. It seems to me that the error bars due to selection of the opponents you pick in your test will always be far larger.

I don't have the numbers now, but I am quite sure a 1.7 Elo improvement would pass pretty quickly in fishtest.

Years ago SF went from 4 positions per 64-byte bucket to 3 positions per 32-byte bucket. This came at the cost of about 1 in 60000 undetected false hits (only 16 bits used to verify the position) but it easily passed all the tests.

Btw, it seems to me adding a 2nd bound would increase entries by 3 bytes (and thus probably by 4 bytes).
If it allowed SF to go back to 4x16 byte entries per bucket with decent collision detection and no loss in Elo, I would be very much for it. I don't expect it to work, but it would be interesting to see it tested.

syzygy · Post by **syzygy** » Fri Feb 24, 2023 10:08 pm

Rein Halbersma wrote: ↑Wed Feb 22, 2023 1:12 pm
syzygy wrote: ↑Sun Feb 19, 2023 7:14 pm I found this old thread: Hash tables: one bound or two bounds?

Two bounds probably help in endgame positions with very few pieces (or rather very few total positions and therefore all kinds of deep transpositions).

The threads also mentions that two bounds are required for MTD(f) to work because MTD(f) continuously switches between proving an upper bound for the root position and proving a lower bound for the root position.
This argument does not convince me. The minimal tree that proves the upper bound for a position has almost no overlap with the minimal tree that proves the lower bound for the same position.

Maybe two bounds are useful for nodes at distance 1 from the PV (I would have to think about this and draw some trees on paper).
But even if this is the case, the necessary information for these nodes will normally be found at distance 2 from the PV.

(MTD(f) does not set out to calculate a PV, but when the upper bound and lower bound for the root node meet, they meet on a sequence of moves that forms the PV.)
Fabien Letouzey once posted how to extract a PV using MTD(f) on CCC (> 20 years ago by now): https://www.stmintz.com/ccc/index.php?id=251543

He is working with a fail-low and fail-high "semi-PVs". The part where they agree can be taken as the PV.

I wonder if any of the top engines is still using MTD(f). Don Dailey seems to have been a fan of it, so maybe Komodo? Probably not.
- At least in SF PV nodes are treated very differently from non-PV nodes. So it is crucial to know whether a node is a PV node or not.
- One premise of MTD(f) is that the search is stable. SF taught us to stop worrying and love search instability. (Not search instability for the sake of it, but extensions, reductions and other search techniques are implemented without regard for consistency, e.g. there are all kinds of history, path and PV/non-PV dependencies, and that is just fine.)

Storing Upper & Lower Bound in TT?

Re: Storing Upper & Lower Bound in TT?

Re: Storing Upper & Lower Bound in TT?

Re: Storing Upper & Lower Bound in TT?

Re: Storing Upper & Lower Bound in TT?

Re: Storing Upper & Lower Bound in TT?

Re: Storing Upper & Lower Bound in TT?

Re: Storing Upper & Lower Bound in TT?

Re: Storing Upper & Lower Bound in TT?

Re: Storing Upper & Lower Bound in TT?

Re: Storing Upper & Lower Bound in TT?