michiguel wrote: I limit my search by nodes and I have been happy thereafter. I enthusiastically recommend it.
Rebel wrote: Excellent to test search issues, eval IMO by fixed depth. Knowledge vs Knowledge not influenced by search randomness.
The problem is that at unrealistically low depth you might need different knowledge than you otherwise would. E.g. pawn-push bonuses tend to get highly exaggerated when promotions cannot fall within the horizon.
Tuning again
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Tuning again
-
- Posts: 1154
- Joined: Fri Jun 23, 2006 5:18 am
Re: Tuning again
michiguel wrote: I limit my search by nodes and I have been happy thereafter. I enthusiastically recommend it.
Rebel wrote: Excellent to test search issues, eval IMO by fixed depth. Knowledge vs Knowledge not influenced by search randomness.
If you are looking for Elo, fixed-depth search is MUCH worse than node-based in my experience. Eval has a huge effect on the search tree. HUGE. This is missed with depth-based cutoffs but not with timed or node-based ones. When I switched from depth-based to node-based optimization, my evaluation optimization results markedly improved.
-Sam
-
- Posts: 6993
- Joined: Thu Aug 18, 2011 12:04 pm
Re: Tuning again
Rebel wrote:
1. introduce a special parameter for internal testing;
2. when the flag is on, increase the depth by 1 when queens are exchanged;
3. depth+2 entering the endgame
4. depth+5 entering the simple endgame
hgm wrote: That was how Usurpator II did it in the eighties! I had not learned the blessings of iterative deepening yet, in those days.
The best generation ever
-
- Posts: 3697
- Joined: Tue Jul 31, 2007 4:26 pm
Re: Tuning again
Rebel wrote:
1. introduce a special parameter for internal testing;
2. when the flag is on, increase the depth by 1 when queens are exchanged;
3. depth+2 entering the endgame
4. depth+5 entering the simple endgame
hgm wrote: That was how Usurpator II did it in the eighties! I had not learned the blessings of iterative deepening yet, in those days.
Rebel wrote: The best generation ever
Exactly..
It was the era of the dedicated computers
The Mephisto Academy Sends Its Regards
Steve
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: Tuning again
Rebel wrote: Eval tuning I strictly do at fixed depth. I don't want external factors like time control or permanent brain to interfere. Enough volume will flatten all the horizon effects eventually, both sides.
mcostalba wrote: IMHO the main drawbacks are: impossible to test depth-sensitive stuff like king safety, and an artificial same depth for midgame and endgame. But I agree that for some evaluation parameters it could be good; actually I will give it a try.
Rebel wrote: For self-play-ply-depth testing I am planning the following:
1. introduce a special parameter for internal testing;
2. when the flag is on, increase the depth by 1 when queens are exchanged;
3. depth+2 entering the endgame
4. depth+5 entering the simple endgame
Or something like that.
michiguel wrote: I limit my search by nodes and I have been happy thereafter. I enthusiastically recommend it.
Miguel
I suppose that's good for parameter tuning but not for bigger eval changes (since it does not account for eval speed).
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Tuning again
Rebel wrote: Eval tuning I strictly do at fixed depth. I don't want external factors like time control or permanent brain to interfere. Enough volume will flatten all the horizon effects eventually, both sides.
mcostalba wrote: IMHO the main drawbacks are: impossible to test depth-sensitive stuff like king safety, and an artificial same depth for midgame and endgame. But I agree that for some evaluation parameters it could be good; actually I will give it a try.
Rebel wrote: For self-play-ply-depth testing I am planning the following:
1. introduce a special parameter for internal testing;
2. when the flag is on, increase the depth by 1 when queens are exchanged;
3. depth+2 entering the endgame
4. depth+5 entering the simple endgame
Or something like that.
michiguel wrote: I limit my search by nodes and I have been happy thereafter. I enthusiastically recommend it.
Miguel
rbarreira wrote: I suppose that's good for parameter tuning but not for bigger eval changes (due to it not accounting for eval speed).
The other issue is one I have pointed out repeatedly. If your program speeds up (or slows down) in nps in a certain phase of the game, you would normally search deeper (if it speeds up, for example). But if you slow down, say in a complicated attacking position, a fixed node count makes it appear you do not. And as a result, you can tune your program to try to reach those kinds of positions, where you do better because you are not getting penalized by slowing down when doing a fixed node test. That NPS variation adds a new variable that is not obvious, and tuning against that is not always a good idea. I've tried both fixed depth and fixed nodes. Each has places where it works reasonably. But NOTHING replaces using time, overall, because that is how you actually have to play the game, and tuning like you play is much safer overall...
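The nps argument above can be put in numbers. The following is a minimal sketch with invented speeds (the 2,000,000 and 1,000,000 nps figures and the function name `nodes_per_move` are assumptions for illustration, not measurements from any engine): under a time limit the slower phase searches fewer nodes, while under a node limit the slowdown costs nothing.

```python
def nodes_per_move(limit_kind, limit, nps):
    """Nodes searched for one move.

    limit_kind "time":  limit is in milliseconds, so the node count scales with nps.
    limit_kind "nodes": limit is a node budget, so nps does not matter at all.
    """
    if limit_kind == "time":
        return limit * nps // 1000
    if limit_kind == "nodes":
        return limit
    raise ValueError(f"unknown limit kind: {limit_kind}")


# Hypothetical speeds: the engine runs at half speed in attacking positions.
QUIET_NPS = 2_000_000
ATTACK_NPS = 1_000_000

# 100 ms per move: the attacking position gets half the nodes.
timed = (nodes_per_move("time", 100, QUIET_NPS),
         nodes_per_move("time", 100, ATTACK_NPS))

# 200,000 nodes per move: both positions get the same search effort,
# so the slowdown is invisible to the tuner.
fixed = (nodes_per_move("nodes", 200_000, QUIET_NPS),
         nodes_per_move("nodes", 200_000, ATTACK_NPS))
```

Here `timed` comes out as `(200000, 100000)` and `fixed` as `(200000, 200000)`, which is the hidden variable being warned about: a fixed-node test never sees the cost of steering into slow positions.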
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: Tuning again
Rebel wrote: Eval tuning I strictly do at fixed depth. I don't want external factors like time control or permanent brain to interfere. Enough volume will flatten all the horizon effects eventually, both sides.
mcostalba wrote: IMHO the main drawbacks are: impossible to test depth-sensitive stuff like king safety, and an artificial same depth for midgame and endgame. But I agree that for some evaluation parameters it could be good; actually I will give it a try.
Rebel wrote: For self-play-ply-depth testing I am planning the following:
1. introduce a special parameter for internal testing;
2. when the flag is on, increase the depth by 1 when queens are exchanged;
3. depth+2 entering the endgame
4. depth+5 entering the simple endgame
Or something like that.
michiguel wrote: I limit my search by nodes and I have been happy thereafter. I enthusiastically recommend it.
Miguel
rbarreira wrote: I suppose that's good for parameter tuning but not for bigger eval changes (due to it not accounting for eval speed).
bob wrote: The other issue is one I have pointed out repeatedly. If your program speeds up (or slows down) in nps in a certain phase of the game, you would normally search deeper (if it speeds up, for example). But if you slow down, say in a complicated attacking position, a fixed node count makes it appear you do not. And as a result, you can tune your program to try to reach those kinds of positions, where you do better because you are not getting penalized by slowing down when doing a fixed node test. That NPS variation adds a new variable that is not obvious, and tuning against that is not always a good idea. I've tried both fixed depth, and fixed nodes. Each has places where they work reasonably. But NOTHING replaces using time, overall, because that is how you actually have to play the game, and tuning like you play is much safer overall...
You are definitely right that in an ideal situation, testing with time is the best approach. But I can see why people would test with fixed nodes, especially in situations where the computer running the test is doing other things which might affect program speed, favoring one or another program under test (along with other advantages others mentioned, like being able to merge together results from different hardware).
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Tuning again
You can of course simply count nodes in such a way that your program does not speed up. How you count nodes is pretty much a matter of taste anyway. Some people count MakeMoves, others count MoveGens, still others count Evals. Do you count nodes that are hash-pruned or not, etc.
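To see how much the choice of convention matters, here is a toy sketch that walks a uniform tree with no pruning and keeps the three counters side by side. The `Counters` class and `search` function are invented for this illustration; a real engine with alpha-beta and hash cuts would show different ratios.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    movegens: int = 0   # one per interior node where moves are generated
    makemoves: int = 0  # one per move actually made
    evals: int = 0      # one per leaf that is statically evaluated


def search(depth, branching, counters):
    """Walk a uniform tree without pruning, updating all three counters."""
    if depth == 0:
        counters.evals += 1
        return
    counters.movegens += 1
    for _ in range(branching):
        counters.makemoves += 1
        search(depth - 1, branching, counters)


counts = Counters()
search(3, 2, counts)
```

For depth 3 and branching factor 2 this gives 7 MoveGens, 14 MakeMoves and 8 Evals, so the same search is "7 nodes", "14 nodes" or "8 nodes" depending on taste, and a node limit means something different under each convention.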
-
- Posts: 227
- Joined: Mon Sep 12, 2011 11:27 pm
- Location: Moscow, Russia
Re: Tuning again
Hi!
Let me add my two cents to this topic :)
I think we should use the following principles in tuning:
1. Use a node limit per move instead of a time limit (more stability)
2. Use a fixed set of starting positions (huge, I suppose, but strictly predefined) (more stability)
Let's assume that we have several "species", each with its own current Elo value and number of games played.
At every step we should find the pair of "species" with the highest upper bounds of the Elo confidence interval, and play a game between them.
There should also be a "breeding mechanism" which should
1. Randomly select two species; the probability of being chosen should depend on Elo (the higher the Elo, the higher the probability)
2. Produce a "child" using some random recombination technique and a random "mutation".
And of course we should also have a "garbage collector" that removes the species with the lowest upper confidence bound.
This framework should be distributed, of course.
I think it's really a good idea to create such a common framework, which would give every chess programmer the ability to tune. And of course it will be really interesting to watch the results.
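A rough single-machine sketch of the loop described above, under several assumptions that the post leaves open: the confidence bound is a simple `elo + z * spread / sqrt(games + 1)`, ratings are updated with a plain Elo formula, recombination picks each parameter from either parent, and mutation is Gaussian noise. All names (`Species`, `pick_pair`, `update`, `breed`, `cull`) are hypothetical, and actually playing the game is left to the caller.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Species:
    params: list
    elo: float = 0.0
    games: int = 0

    def upper_bound(self, z=2.0, spread=400.0):
        # Assumed bound: wide for rarely tested species, narrowing with games.
        return self.elo + z * spread / math.sqrt(self.games + 1)


def pick_pair(population):
    """Return the two species with the highest upper confidence bounds."""
    ranked = sorted(population, key=lambda s: s.upper_bound(), reverse=True)
    return ranked[0], ranked[1]


def update(a, b, score_a, k=16.0):
    """Plain Elo update after one game; score_a is 1.0, 0.5 or 0.0."""
    expected_a = 1.0 / (1.0 + 10.0 ** ((b.elo - a.elo) / 400.0))
    delta = k * (score_a - expected_a)
    a.elo += delta
    b.elo -= delta
    a.games += 1
    b.games += 1


def breed(population, rng, sigma=0.1):
    """Pick two parents with Elo-weighted probability and produce a child.

    Note: sampling is with replacement, so both parents may be the same species.
    """
    lowest = min(s.elo for s in population)
    weights = [s.elo - lowest + 1.0 for s in population]
    mum, dad = rng.choices(population, weights=weights, k=2)
    child_params = [rng.choice(pair) + rng.gauss(0.0, sigma)
                    for pair in zip(mum.params, dad.params)]
    return Species(child_params)


def cull(population, keep):
    """Garbage-collect: keep only the `keep` species with the best upper bounds."""
    population.sort(key=lambda s: s.upper_bound(), reverse=True)
    del population[keep:]
```

One iteration would then be: `a, b = pick_pair(pop)`, play a node-limited game from the fixed opening set, call `update(a, b, score)`, and every so often append `breed(pop, rng)` and call `cull(pop, keep)`. Using the upper bound for both selection and culling means a new child, which starts with a very wide bound, gets tested quickly rather than discarded untested.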
The Force Be With You!
-
- Posts: 1822
- Joined: Thu Mar 09, 2006 11:54 pm
- Location: The Netherlands
Re: Tuning again
mcostalba wrote: Running games at fixed depth (especially so low like 8 plies) has some drawback,
Rebel wrote: Eval tuning I strictly do at fixed depth. I don't want external factors like time control or permanent brain to interfere. Enough volume will flatten all the horizon effects eventually, both sides.
mcostalba wrote: Downloaded...running in a GUI like Arena has even more drawbacks, I'd suggest a command line tournament manager like cutechess-cli and run on time.
Rebel wrote: I like Arena because it supports nodes-matches. IMO a better way to test search related changes than on time.
mcostalba wrote: BTW your C is very assemblish, lovely stuff, really, no joking: it has a kind of vintage fashion.
Rebel wrote: All my engines were in assembler. I just can't get used to these brackets. Things like that drive me crazy
Code:
{ { { { } } } }
Hi Ed,
Running fixed-depth matches is not a bad idea in itself. However, it tunes a lot better if you get through the tactical barrier, and that barrier is far above 8 ply.
Just tune at something like 1 minute for the entire game plus a 0.1 second increment.
The idea is that once an engine gets 'better' and more 'well tuned', you also start to search deeper because of the improvements, which scales up the experiment.
Soon you'll move to 5 minutes a game and so on.
If your engine is capable of running as a winboard engine you'll need to use other tools. Most GUIs simply aren't stable enough to play that many games, nor fast enough, as they need to update the graphics and do all sorts of centrally locked I/O.
A core or 30 is no luxury to do stuff like this.
By the way, don't believe that just playing games is the holy grail; they do more than just play games for parameter tuning.
Vincent