I've read about people using test positions to find strategic weaknesses in their engine, and I think it's about time for me to try this out. I've seen some people here have some success with "IQ Test". I have some questions, though, on how you actually address these strategic weaknesses and how much you can rely on test scores to improve your engine.
My main question is: can you use a set of test positions to tune your search parameters the same way you can for evaluation? I assume that if you take a bunch of positions that only have one clear best move, and then give your engine limited time to search them all, you get an error that you can minimize by changing things like the depths at which certain heuristics are applied.
The main problem I see is that the idea is so obvious that either there must be something glaringly stupid about it, or it is something everybody already does and I am simply unaware of it. Can somebody enlighten me on this?
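To make the idea concrete, here is a rough sketch of the kind of tuning loop I have in mind (not something I've actually run): score a parameter vector by how many one-best-move positions the engine solves within a fixed time limit, and hill-climb on that score. It assumes python-chess and an engine that exposes its search parameters as UCI options; the engine path and the option names ("LMRDepth", "NullMoveReduction") are made up for illustration.

```python
import chess
import chess.engine

ENGINE_PATH = "./my_engine"   # placeholder path to a UCI engine
TIME_PER_POSITION = 0.05      # seconds per position

# (FEN, best move in UCI notation); WAC.001 as a tiny placeholder suite
TEST_SUITE = [
    ("2rr3k/pp3pp1/1nnqbN1p/3pN3/2pP4/2P3Q1/PPB4P/R4RK1 w - - 0 1", "g3g6"),
]

def solved_count(params: dict) -> int:
    """Count how many suite positions the engine solves at the time limit
    when configured with the given (hypothetical) UCI options."""
    solved = 0
    with chess.engine.SimpleEngine.popen_uci(ENGINE_PATH) as engine:
        engine.configure(params)
        for fen, best in TEST_SUITE:
            result = engine.play(chess.Board(fen),
                                 chess.engine.Limit(time=TIME_PER_POSITION))
            if result.move == chess.Move.from_uci(best):
                solved += 1
    return solved

def hill_climb(start: dict, steps: dict, iterations: int = 20) -> dict:
    """Crude coordinate-wise hill climb: nudge each parameter up and down
    and keep any change that solves more positions."""
    best_params, best_score = dict(start), solved_count(start)
    for _ in range(iterations):
        for name, step in steps.items():
            for delta in (+step, -step):
                trial = dict(best_params)
                trial[name] += delta
                score = solved_count(trial)
                if score > best_score:
                    best_params, best_score = trial, score
    return best_params

if __name__ == "__main__":
    print(hill_climb({"LMRDepth": 3, "NullMoveReduction": 2},
                     {"LMRDepth": 1, "NullMoveReduction": 1}))
```

Is maximizing the solved count (or minimizing the miss count) over such a suite actually a sensible objective for search parameters, or does it just overfit to the suite?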
Tuning search parameters
Moderator: Ras
-
- Posts: 58
- Joined: Wed Mar 18, 2020 10:00 pm
- Full name: Jonathan McDermid
Tuning search parameters
Clovis GitHub
-
- Posts: 60
- Joined: Sat Dec 11, 2021 5:03 am
- Full name: expositor
Re: Tuning search parameters
I've never tried this, but I remember the Mantissa dev doing something similar. Here's what I recall, though I may be wrong:
He found several hundred thousand¹ positions that Mantissa blundered in self-play games – positions which had an only move² that Mantissa missed – and then tuned to maximize the number that she could play correctly. It didn't have any positive effect on general playing strength; his takeaway was either that the technique simply didn't work³ or that it would require much more data.
¹ Maybe it was only tens of thousands? But I think it was more than that.
² Determined by using several other engines, which had to unanimously agree.
³ Although unusual, perhaps that wouldn't be terribly surprising; puzzle strength, for example, seems to correlate more weakly with general playing strength than one might expect.
Actually, I suppose we should just ask him. @jtwright did I get all that (somewhat) right?
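To make footnote ² a little more concrete, the filtering step might look something like the sketch below (this is my guess at the shape of it, not Mantissa's actual tooling): keep a blundered position only if several reference engines unanimously agree on the same best move. It uses python-chess; the referee paths and the search depth are placeholders.

```python
import chess
import chess.engine

REFEREE_PATHS = ["./engine_a", "./engine_b", "./engine_c"]  # placeholders
SEARCH_LIMIT = chess.engine.Limit(depth=18)                 # arbitrary depth

def unanimous_best_move(fen: str):
    """Return the agreed-upon best move if every referee picks the same
    move in the position, otherwise None."""
    choices = []
    for path in REFEREE_PATHS:
        with chess.engine.SimpleEngine.popen_uci(path) as engine:
            choices.append(engine.play(chess.Board(fen), SEARCH_LIMIT).move)
    return choices[0] if len(set(choices)) == 1 else None

def build_suite(blunder_fens):
    """From positions the engine blundered in self-play, keep only those
    with a unanimously agreed 'only move', as (fen, move) pairs."""
    suite = []
    for fen in blunder_fens:
        move = unanimous_best_move(fen)
        if move is not None:
            suite.append((fen, move.uci()))
    return suite
```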
-
- Posts: 16
- Joined: Fri Dec 27, 2019 8:47 pm
- Full name: Jacek Dermont
Re: Tuning search parameters
There is a (quite oldish by today's standards) paper on how they used genetic algorithms to evolve the evaluation function and tune search parameters: https://arxiv.org/abs/1711.08337
They used several thousand tactical positions (a low number by today's standards), and the fitness was the number of nodes needed to reach the desired move. At the time, they had some moderate success.
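If I remember the setup right, that fitness measure was roughly along the lines of the sketch below (my reconstruction, not code from the paper): find the smallest node budget at which the engine settles on the desired move, and sum that over the suite, with lower being better. The engine path and the node cap are placeholders.

```python
import chess
import chess.engine

ENGINE_PATH = "./my_engine"   # placeholder path to a UCI engine
NODE_CAP = 10_000_000         # give up beyond this many nodes

def nodes_to_solve(fen: str, best_uci: str) -> int:
    """Smallest node budget (from a doubling sequence) at which the engine's
    chosen move equals the desired move; NODE_CAP if it never gets there."""
    board = chess.Board(fen)
    target = chess.Move.from_uci(best_uci)
    with chess.engine.SimpleEngine.popen_uci(ENGINE_PATH) as engine:
        budget = 1_000
        while budget <= NODE_CAP:
            if engine.play(board, chess.engine.Limit(nodes=budget)).move == target:
                return budget
            budget *= 2
    return NODE_CAP

def fitness(suite) -> int:
    """Lower is better: total nodes needed over the whole test suite."""
    return sum(nodes_to_solve(fen, best) for fen, best in suite)
```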

-
- Posts: 5695
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Tuning search parameters
jmcd wrote: ↑Wed Dec 07, 2022 5:42 am
I've read about people using test positions to find strategic weaknesses in their engine, and I think it's about time for me to try this out. I've seen some people here have some success with "IQ Test". I have some questions, though, on how you actually address these strategic weaknesses and how much you can rely on test scores to improve your engine.
My main question is: can you use a set of test positions to tune your search parameters the same way you can for evaluation? I assume that if you take a bunch of positions that only have one clear best move, and then give your engine limited time to search them all, you get an error that you can minimize by changing things like the depths at which certain heuristics are applied.
The main problem I see is that the idea is so obvious that either there must be something glaringly stupid about it, or it is something everybody already does and I am simply unaware of it. Can somebody enlighten me on this?
I don't see a fundamental difference between tuning evaluation parameters and tuning search parameters by running an engine on a set of test positions.
However, tuning an engine on test positions will only optimise the engine for that specific set of positions, not for playing real games. The best way to tune an engine is to let it play many (fast) games, typically against its unmodified self.
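As a minimal sketch of what that self-play approach looks like (real setups normally use a match manager such as cutechess-cli plus a statistical stopping rule like SPRT, not a hand-rolled loop): play fast games between the modified and unmodified engine, alternating colours, and compare scores. It assumes python-chess; the engine path and the option name are made up for illustration.

```python
import chess
import chess.engine

ENGINE_PATH = "./my_engine"           # placeholder path to a UCI engine
CANDIDATE_OPTIONS = {"LMRDepth": 4}   # made-up option for the modified version
MOVE_TIME = 0.05                      # seconds per move

def play_game(white, black) -> float:
    """Play one fast game; return the score from white's point of view."""
    board = chess.Board()
    while not board.is_game_over(claim_draw=True):
        engine = white if board.turn == chess.WHITE else black
        board.push(engine.play(board, chess.engine.Limit(time=MOVE_TIME)).move)
    return {"1-0": 1.0, "0-1": 0.0}.get(board.result(claim_draw=True), 0.5)

def run_match(games: int = 100) -> float:
    """Score of the candidate against the unmodified baseline, alternating colours."""
    with chess.engine.SimpleEngine.popen_uci(ENGINE_PATH) as baseline, \
         chess.engine.SimpleEngine.popen_uci(ENGINE_PATH) as candidate:
        candidate.configure(CANDIDATE_OPTIONS)
        score = 0.0
        for g in range(games):
            if g % 2 == 0:
                score += play_game(candidate, baseline)
            else:
                score += 1.0 - play_game(baseline, candidate)
    return score / games
```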