Some musings about search

Rebel
Posts: 4268
Joined: Thu Aug 18, 2011 10:04 am

Some musings about search

Post by Rebel » Fri Aug 14, 2015 3:40 pm

After a search change I am (was) used to getting a first impression by manually going through 40-50 positions before starting a self-play match.

And sometimes that first glimpse looked so good that it was reasonable to assume the change was an improvement. And then after playing 10,000 x 40/15s games it turned out not to be: say, a 48% result as an example.

Then, stubbornly, I couldn't believe it (not a bad attitude in CC) and started to replay the match at 40/30s and even 40/1m.

And I have never seen such a change become an improvement after doubling or quadrupling the time control.

Is this a global experience?
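
To put the 48% example in numbers, here is a minimal Python sketch that converts a match score into an Elo difference with the usual logistic formula and expresses the distance from an even result in standard deviations. The 30% draw ratio is an assumption for illustration only; the thread does not give one, and the function names are hypothetical.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def z_score(score, games, draw_ratio):
    """Standard deviations between the observed score and an even 50% result."""
    # Per-game variance of a win/draw/loss outcome with mean `score`
    # and the given fraction of draws.
    variance = score * (1.0 - score) - draw_ratio / 4.0
    return (score - 0.5) / math.sqrt(variance / games)

# The 48% over 10,000 games example; the 30% draw ratio is an assumption.
print(round(elo_from_score(0.48), 1))
print(round(z_score(0.48, 10_000, draw_ratio=0.30), 1))
```

Under these assumptions a 48% score over 10,000 games is roughly -14 Elo and nearly five standard deviations below an even result, so it is very unlikely to be noise at that time control.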

cdani
Posts: 2095
Joined: Sat Jan 18, 2014 9:24 am
Location: Andorra

Re: Some musings about search

Post by cdani » Fri Aug 14, 2015 5:20 pm

In king safety, yes, and many times. But my first test was at about 7 seconds and the second one at 25 seconds, and always against a gauntlet.

I don't test manually; the impression is just too shallow. If I do go through positions manually, it is to look for weaknesses.

My testing time controls are always like this: 4-7 seconds for the first test, and 25 seconds for the second.

Bloodbane
Posts: 154
Joined: Thu Oct 03, 2013 2:17 pm

Re: Some musings about search

Post by Bloodbane » Fri Aug 14, 2015 6:09 pm

I've seen some patches which were bad at short TC but good at long TC, but I only accept patches which are good at all time controls I use.
Functional programming combines the flexibility and power of abstract mathematics with the intuitive clarity of abstract mathematics.
https://github.com/mAarnos

Ferdy
Posts: 3645
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Re: Some musings about search

Post by Ferdy » Fri Aug 14, 2015 7:02 pm

Rebel wrote: After a search change I am (was) used to getting a first impression by manually going through 40-50 positions before starting a self-play match.

And sometimes that first glimpse looked so good that it was reasonable to assume the change was an improvement. And then after playing 10,000 x 40/15s games it turned out not to be: say, a 48% result as an example.

Then, stubbornly, I couldn't believe it (not a bad attitude in CC) and started to replay the match at 40/30s and even 40/1m.

And I have never seen such a change become an improvement after doubling or quadrupling the time control.

Is this a global experience?
If a change looks interesting (a feeling that it would kick in at higher depths; after so many failed tests you also learn something about the program), I increase the testing time: instead of testing first at a short TC, I go directly to a longer TC, say 120s + 100ms increment. Sometimes that succeeds.
Probably the field is generally contested at the typical CCRL 40/4 and 40/40; I try to look good at 40/4.

What do you mean by the following?
manually going through 40-50 positions before starting a self-play match

Rebel
Posts: 4268
Joined: Thu Aug 18, 2011 10:04 am

Re: Some musings about search

Post by Rebel » Fri Aug 14, 2015 10:03 pm

Ferdy wrote: Probably the field is generally contested at typical CCRL 40/4 and 40/40, I try to look good at 40/4.
One thing I learned from my last active period (2012-2013) is that playing too few games can be disastrous. In principle playing 1000-2000 games in most cases is good enough, (say) 9 of 10 times. But it is the 10th time that is going to hurt, you get good results while if you had played more you would have known the change isn't an improvement at all. Yet you count the change as an improvement and the damage is done, from version to version.

This is (I think) the reason why less progress was made in the 80's, 90's and early 00's than nowadays: lack of sufficient hardware.

So unless you have access to 200-300 processors (or so) it's impossible to play on CCRL level. I currently have set the limit to 12,000 games.

Ferdy wrote: What do you mean by the following?
manually going through 40-50 positions before starting a self-play match
Going through a (fixed) test set first to get an impression of the change you made. It isn't conclusive, but at least it catches obvious bugs.
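
The point about 1000-2000 games versus 12,000 can be made concrete with a rough error-bar calculation. This is a minimal sketch, assuming two engines of about equal strength, a 30% draw ratio and a 95% interval (z = 1.96); none of those values come from the thread, and the function names are hypothetical.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin(games, draw_ratio=0.30, z=1.96):
    """Approximate +/- Elo margin around an even score after `games` games."""
    # Per-game variance at a 50% score with the assumed draw ratio.
    variance = 0.25 - draw_ratio / 4.0
    half_width = z * math.sqrt(variance / games)
    return elo_from_score(0.5 + half_width)

for n in (1000, 2000, 12000):
    print(n, round(elo_margin(n), 1))
```

Under these assumptions the margin is roughly ±13 Elo at 2000 games and roughly ±5 Elo at 12,000 games. Since the margin shrinks only with the square root of the number of games, a change worth a few Elo cannot be separated from noise in a 2000-game match.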

cdani
Posts: 2095
Joined: Sat Jan 18, 2014 9:24 am
Location: Andorra

Re: Some musings about search

Post by cdani » Fri Aug 14, 2015 10:16 pm

Bloodbane wrote:I've seen some patches which were bad at short TC but good at long TC, but I only accept patches which are good at all time controls I use.
Really curious. I always accept those patches, as they are even better at longer time controls, not by much, of course.

cdani
Posts: 2095
Joined: Sat Jan 18, 2014 9:24 am
Location: Andorra

Re: Some musings about search

Post by cdani » Fri Aug 14, 2015 10:21 pm

Rebel wrote: One thing I learned from my last active period (2012-2013) is that playing too few games can be disastrous. In principle playing 1000-2000 games in most cases is good enough, (say) 9 of 10 times. But it is the 10th time that is going to hurt, you get good results while if you had played more you would have known the change isn't an improvement at all. Yet you count the change as an improvement and the damage is done, from version to version.
Some days ago it happened to me that three computers had each played about 2000 games, so 6000 games in total, and the patch was at +10 Elo on every one of the three computers. I was tempted to stop the test and accept the patch as good, but I let it continue. By the time the total reached 20,000 games, the patch clearly showed as a regression!
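
The temptation to stop while a test is ahead can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of the actual test above: the patch is assumed to be worth exactly 0 Elo, the draw ratio is assumed to be 30%, batch totals are drawn from a normal approximation instead of simulating individual games, and the function name is hypothetical. It compares looking at the result once at the end with checking after every batch and stopping as soon as the patch is 2 standard deviations ahead.

```python
import math
import random

def false_positive_rate(trials, looks, batch_size, z_stop, draw_ratio=0.30, seed=1):
    """Fraction of zero-Elo tests that get accepted at some look."""
    rng = random.Random(seed)
    # Per-game variance at a 50% score with the assumed draw ratio.
    variance = 0.25 - draw_ratio / 4.0
    batch_sigma = math.sqrt(variance * batch_size)
    accepted = 0
    for _ in range(trials):
        surplus = 0.0  # points scored above the 50% expectation so far
        for look in range(1, looks + 1):
            surplus += rng.gauss(0.0, batch_sigma)
            games = look * batch_size
            if surplus / math.sqrt(variance * games) >= z_stop:
                accepted += 1
                break
    return accepted / trials

# One look after 20,000 games versus a look after every 500 games.
single = false_positive_rate(trials=2000, looks=1, batch_size=20000, z_stop=2.0)
peeking = false_positive_rate(trials=2000, looks=40, batch_size=500, z_stop=2.0)
print(single, peeking)
```

With repeated looks, the chance of accepting a worthless patch at some point is several times higher than with a single look at the end, which is why a fixed stopping rule decided in advance matters.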

Rebel
Posts: 4268
Joined: Thu Aug 18, 2011 10:04 am

Re: Some musings about search

Post by Rebel » Fri Aug 14, 2015 10:52 pm

cdani wrote:
Rebel wrote: One thing I learned from my last active period (2012-2013) is that playing too few games can be disastrous. In principle playing 1000-2000 games in most cases is good enough, (say) 9 of 10 times. But it is the 10th time that is going to hurt, you get good results while if you had played more you would have known the change isn't an improvement at all. Yet you count the change as an improvement and the damage is done, from version to version.
Some days ago it happened to me that three computers had each played about 2000 games, so 6000 games in total, and the patch was at +10 Elo on every one of the three computers. I was tempted to stop the test and accept the patch as good, but I let it continue. By the time the total reached 20,000 games, the patch clearly showed as a regression!
Yep.

And these kinds of things happened more often than I liked. In 2012/13 I started to replay matches (usually 2000 x 40/1m) that had given a positive result, and often the gain melted away like snow in the sun.

Laskos
Posts: 8688
Joined: Wed Jul 26, 2006 8:21 pm

Re: Some musings about search

Post by Laskos » Sat Aug 15, 2015 1:09 am

Rebel wrote:
Ferdy wrote: Probably the field is generally contested at typical CCRL 40/4 and 40/40, I try to look good at 40/4.
One thing I learned from my last active period (2012-2013) is that playing too few games can be disastrous. In principle playing 1000-2000 games in most cases is good enough, (say) 9 of 10 times. But it is the 10th time that is going to hurt, you get good results while if you had played more you would have known the change isn't an improvement at all. Yet you count the change as an improvement and the damage is done, from version to version.
That's why an SPRT framework is important. Or, if that is too cumbersome, require 3 standard deviations at the stopping point of your choosing. Not 2, at least 3.
Rebel wrote: It's (I think) the reason why in the 80's, 90's and early 00's less progress was made than nowadays, lack of sufficient hardware.
People back then tested either with a ridiculously small number of fairly long games, or on test suites, which are misleading. The necessary hardware tools were available back then too, and the clock had the same 15ms granularity, but the testing was done amateurishly, and little attention was paid to ultra-fast games.
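
For readers unfamiliar with the SPRT idea mentioned above, here is a minimal Python sketch of a sequential test on win/draw/loss counts. It uses one common normal approximation of the log-likelihood ratio and is not taken from any particular testing framework. The hypotheses (0 Elo versus +5 Elo) and the error rates (alpha = beta = 0.05) are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def expected_score(elo):
    """Expected score for a given Elo advantage (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def sprt(wins, draws, losses, elo0=0.0, elo1=5.0, alpha=0.05, beta=0.05):
    """Return (llr, decision) for H0: elo = elo0 versus H1: elo = elo1."""
    games = wins + draws + losses
    if games == 0:
        return 0.0, "continue"
    score = (wins + 0.5 * draws) / games
    # Empirical per-game variance of the win/draw/loss outcome.
    variance = (wins + 0.25 * draws) / games - score ** 2
    if variance <= 0.0:
        return 0.0, "continue"
    s0, s1 = expected_score(elo0), expected_score(elo1)
    # Normal approximation of the log-likelihood ratio of H1 against H0.
    llr = (s1 - s0) * (2.0 * score - s0 - s1) * games / (2.0 * variance)
    lower = math.log(beta / (1.0 - alpha))   # at or below: accept H0
    upper = math.log((1.0 - beta) / alpha)   # at or above: accept H1
    if llr >= upper:
        return llr, "accept H1"
    if llr <= lower:
        return llr, "accept H0"
    return llr, "continue"

print(sprt(4000, 3000, 3000))
print(sprt(10, 10, 10))
```

The test is evaluated after every game or batch and stops as soon as one of the two bounds is crossed, so the stopping rule is built into the method instead of being left to the tester's temptation.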

jdart
Posts: 3663
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: Some musings about search

Post by jdart » Sat Aug 15, 2015 3:35 am

I used test suites for tuning for years. It is probably better than guesswork. But it is not a reliable method in general.

That said, I have actually been looking at some test results very recently to at least select some interesting modifications for further testing. But I don't regard the test results as conclusive, just a possible indicator.


--Jon
