How much are bitbases worth?

Andres Valverde · Post by **Andres Valverde** » Fri May 23, 2008 10:44 am

Tord Romstad wrote:I've started thinking about adding 4-piece and 5-piece bitbases to my program, but I hesitate a little because it seems to be an awful lot of work, especially for something that doesn't scale down to handheld devices.

Therefore, before I start: Has there been any experiments investigating how much you gain by using the complete 4-piece and 5-piece bitbases on modern hardware?

Tord

In my EveAnn & Dirty tests i got about +20 ELO with 5 men. I guess the difference would be smaller with strongest engines.

wgarvin · Post by **wgarvin** » Fri May 23, 2008 4:37 pm

Don wrote:Actually, you could combine the two approaches, both yours and mine!

Suppose you have a heuristic function that is pretty good at returning the correct answer most of the time. All you need to know is when it's wrong. A bloom filter is a good choice for this. Bloom filters occasionally return a false positive but an exception list can be constructed to deal with that. A trick is to put the exceptions themselves in a much smaller bloom filter which can be an order of magnitude smaller.

The appeal of the bloom filter approach is that you can probably use your zobrist hash keys directly.

My hidden doubts concern Run Length Encoding which is typically used to compress end game databases. RLE can be very effective because positions tend to cluster into large chains of wins, losses or draws and thus may be more effective than bloom filters.

Bloom filters have the advantage of utter simplicity and are compact since you don't have to store the key. But it's not as compact as directly indexing an endgame database that returns only 2 states, for instance win or draw.

I'm sorely tempted to try it. As a feasibility test, I could try it on KpK just to see if I can get 90% of the positions correct with just 2 or 3 lines of code. I am estimating that this would reduce a 24K database to about 5k.

What is appealing about this idea is that it is a very simple methodology for making the time vs space trade off. You write the best evaluator you can, and the bloom filter does the rest of the work for you!

That sounds pretty interesting! I'm not sure if KPK would be a very good test, though. For one thing, isn't KPK simple enough to describe completely with heuristic functions? It might be better to try it on a 5-man database.

Also, what is the cost of evaluating the bloom filter? The larger the value of k, the more probes you would need (and the more hash values you would need... only small values of k could be satisfied by using directly the bits of the zobrist key). Using zobrist keys directly means giving up the benefits of symmetry, but it might still be worth it. [Edit: you might still be able to exploit one symmetry by keeping two zobrist keys for the position--one for the actual position and one for the "horizontally mirrored" position. Just double the width of the zobrist lookup table entries and keep both keys in it, and use SSE or altivec to xor keypairs. Then whenever you decide to mirror the board horizontally, just swap the two keys.]

Here's a table of false-positive rates. It doesn't look that bad, especially if the elements are sparse... a small value of k might be fine.

Tord Romstad · Post by **Tord Romstad** » Fri May 23, 2008 7:07 pm

Don wrote:My sense of this is that there is not that much benefit except to one of them in particular, the RP vs R database. I would say 90% of the benefit of 5 piece databases comes from this single database.

Are you talking about benefits for programs without any specific code for different classes of engines here? My guess is that KRP vs KR isn't tremendously important. The rules you find for this endgame in elementary textbooks cover most of the positions that occur in practice, and these rules are very easy to translate to actual code.

It seems to me that KQP vs KQ, KQ vs KRP and KR vs KPP should all be more important than KRP vs KR.

There are small databases that are important but they are easily contained in a small amount of memory. KPk is probably the most important of these. I built my own kpk in the late 80's and still use it and I assume most programs have this one.

That's the only one I currently use. I generate KPK during program initialization.

The problem with 5 piece databases are that despite herculean efforts, they do impact the speed of your program.

Even if you keep them in RAM, and only probe when the remaining depth is big? How big is the approximate slowdown?

It may very well be the case that with modern programs these databases become much more important (and thus my guesses could be off.) I believe that at really high ELO levels you need to have some knowledge that is more commensurate with your ability - for instance it's not very important whether a raw beginner understands how to win with RP vs P when he hangs pieces all the time, but it's very important for a master to understand this. So that concept may be valid here.

I suspect that the opposite may be true: Strong programs have so good evaluation functions that they evaluate basic endgames correctly without bitbases. Adding bitbases improves the evaluation from almost perfect to perfect. For weak programs, adding bitbases improves the evaluation from very bad to perfect, which should be more significant.

Tord

Tord Romstad · Post by **Tord Romstad** » Fri May 23, 2008 7:10 pm

Andres Valverde wrote:In my EveAnn & Dirty tests i got about +20 ELO with 5 men. I guess the difference would be smaller with strongest engines.

Thanks, that's a very useful data point!

If I can get anything close to 20 Elo points, it's probably worth doing. If the weather turns worse, maybe I'll give it a try this weekend...

Tord

How much are bitbases worth?

Re: How much are bitbases worth?

Re: How much are bitbases worth?

Re: How much are bitbases worth?

Re: How much are bitbases worth?