RFC: There Can Only Be One (Eval Optimization)

Discussion of chess software programming and technical issues.

Moderator: Ras

JoAnnP38
Posts: 253
Joined: Mon Aug 26, 2019 4:34 pm
Location: Clearwater, Florida USA
Full name: JoAnn Peeler

RFC: There Can Only Be One (Eval Optimization)

Post by JoAnnP38 »

I am currently designing my data structures to support the makeshift genetic algorithm I am going to use to optimize the weights used for my evaluation function. Currently, I have identified 860 weights that need to be optimized. If I were smarter there would probably be more, but if I were even smarter there may even be less. 8-)

Instead of a fitness function, which most GAs use to score each chromosome, I am going to use a tournament structure to select chromosomes both for survival and for reproduction. I'll need to automate this process because I really want to set a PC in the corner and let it run until it comes up with an answer. To facilitate this, I'll need to develop a specialized administration tool that will manage the tournaments and the rest of the GA process. I can use UCI to communicate with two versions of the same engine, passing each the ID of the unique set of weights it will use for its evaluation in that game, and then manage the game until completion. Each engine will load the weights associated with that ID from a shared data store. Hopefully, it can run multiple games simultaneously without bogging down the CPU too much. Loosely, here is how I think the process will go:
  1. Create initial population or gene pool (64 total chromosomes)
    1. Create 1 exemplar (set to 860 reasonable guesses for weights)
    2. Create 63 random chromosomes (random weights)
  2. Create a tournament for all 64 current chromosomes (if scores are available then seed pairings)
    1. Each individual match between two chromosomes will consist of two games allowing both to play each color.
    2. Score each game as follows: 2 points for a win, 1 point for a draw, and 0 points for a loss
    3. If a match results in a tie, choose a winner randomly
    4. All winners advance to the next round (repeat steps 1-4)
    5. Starting with round of 16, each pairing will also crossover to produce two children (mutations possible)
    6. Tournament will continue until final pairing results in winner
    7. If the winner is not the exemplar and differs from the prior generation's winner by less than 1%, then exit evolution and declare the ultimate winner
    8. Stop evolution if 100 or more generations have been processed
    9. Top 32 chromosomes automatically advance to the next generation
  3. Create sudden death matches between 32 initial losers/32 new children (from 2 above)
    1. Only one round is played and 32 losers are kicked out of the gene pool (whether they be new children or previous chromosomes)
    2. Any chromosome that has lived longer than 10 generations (except for final 16 from 2) will be replaced by a new, randomly generated chromosome
    3. Return to step 2 and repeat.
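The generation loop above might look something like the following Python sketch. Everything here is hypothetical scaffolding: the weight bounds are placeholders, and `play_match` is a stub standing in for the real two-game, color-swapped UCI match management.

```python
import random

GENES = 860        # weights per chromosome (from the post)
POOL_SIZE = 64     # chromosomes per generation

def random_chromosome(rng=random):
    # hypothetical weight bounds; real ranges depend on each eval term
    return [rng.randint(-100, 100) for _ in range(GENES)]

def play_match(a, b, rng=random):
    """Placeholder for a two-game, color-swapped UCI match;
    returns the winner (ties broken randomly, per step 2.3)."""
    return rng.choice((a, b))

def run_tournament(pool):
    """Single elimination over the pool: winners of each pairing
    advance until one champion remains (steps 2.1-2.6)."""
    rounds = []
    while len(pool) > 1:
        winners = [play_match(pool[i], pool[i + 1])
                   for i in range(0, len(pool), 2)]
        rounds.append(winners)
        pool = winners
    return pool[0], rounds

pool = [random_chromosome() for _ in range(POOL_SIZE)]
champion, rounds = run_tournament(pool)
```

With 64 chromosomes this plays six rounds (32, 16, 8, 4, 2, 1 winners), i.e. 63 matches of 2 games each, which squares with the rough game counts quoted later in the post.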
Almost all of the above I'm pulling out of thin air, with only a very modest amount of academic knowledge about GAs. I don't have a gut feeling about how fast the values will converge or whether this tournament environment is the best way to produce convergence. However, it will be interesting to watch. I will be storing the PGN for each game and the heritage of each chromosome so I can check in to see how things are progressing.

If you have any experience (or even no experience) with genetic algorithms I would welcome your feedback. If everything runs as hoped, each generation will consist of 256 games in the tournament followed by 64 games in sudden death. At 5-10 seconds allocated per game, that works out to 25-55 minutes per generation. If it runs 100 generations without finding convergence, that could take 100 hours of processing.
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: RFC: There Can Only Be One (Eval Optimization)

Post by Mike Sherwin »

In Darwinian evolution theory there are two types. There is evolution because the environment changes. There is evolution because a creature changes. Evolution because of environmental changes is much more common and successful because the less suited for the new environment die off. A creature changing and evolving is much more rare because some random mutation is more likely to fail than to succeed. If it ain't broke don't fix it.

How do we apply this to chess? The environment is not constant. There are open positions and closed positions. An optimal set of weights for open positions may not be optimal for closed positions. If a set of values is created as a compromise between open and closed positions, then that set will not be optimal for either. Randomness in nature is very slow, taking tens of thousands of years at best and millions at worst. And randomness may never hit on the optimum. So now we need a way to accomplish millions of years of evolution in a few months at best. That would mean systematically changing one weight at a time and keeping the best value. After all 860 weights have been processed, rinse and repeat numerous times (because changing weights changes the overall synergy) until no further improvement can be found.
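What is described here is essentially coordinate-wise hill climbing. A minimal Python sketch of the idea, with a stand-in `fitness` function in place of actual match results, and an arbitrary step size and pass count (both assumptions):

```python
def coordinate_ascent(weights, fitness, step=5, passes=3):
    """Tune one weight at a time, keeping any change that improves
    fitness; repeat whole passes, since weights interact."""
    best = fitness(weights)
    for _ in range(passes):
        improved = False
        for i in range(len(weights)):
            for delta in (step, -step):
                trial = weights.copy()
                trial[i] += delta
                score = fitness(trial)
                if score > best:
                    weights, best = trial, score
                    improved = True
                    break
        if not improved:   # no single weight helped this pass: converged
            break
    return weights, best

# toy fitness with its peak at all-zeros, so tuning drives weights to 0
w, s = coordinate_ascent([10, -10, 5],
                         lambda ws: -sum(x * x for x in ws), step=5)
```

In a real tuner, `fitness` would be the expensive part: a batch of games against a reference version for each trial value, which is why this approach needs so many engine games to cover 860 weights.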
User avatar
emadsen
Posts: 440
Joined: Thu Apr 26, 2012 1:51 am
Location: Oak Park, IL, USA
Full name: Erik Madsen

Re: RFC: There Can Only Be One (Eval Optimization)

Post by emadsen »

JoAnnP38 wrote: Tue Dec 13, 2022 4:47 am If you have any experience (or even no experience) with genetic algorithms I would welcome your feedback.
I don't have any experience using genetic algorithms to tune my chess engine's evaluation. However, you may want to read Thomas Petzke's blog posts regarding Population Based Incremental Learning, a genetic algorithm, either before or after your experiment. Thomas is the author of iCE, a strong chess engine (previously known as mACE).

I remember enjoying reading his blog posts about genetic algorithms. Almost ten years ago.
Erik Madsen | My C# chess engine: https://www.madchess.net
JoAnnP38
Posts: 253
Joined: Mon Aug 26, 2019 4:34 pm
Location: Clearwater, Florida USA
Full name: JoAnn Peeler

Re: RFC: There Can Only Be One (Eval Optimization)

Post by JoAnnP38 »

Mike Sherwin wrote: Tue Dec 13, 2022 5:39 pm [...] That would mean systematically changing one weight at a time and keep the best value. After all 860 weights have been processed (because changing weights changes the overall synergy) then rinse and repeat numerous times until no further improvement can be found.
I strongly suspect that most of the "qualities" I've chosen to measure and weight are not completely independent of other values. For instance, mobility certainly factors into the piece square tables as well as qualities such as development or pawn advancement. Even the weight of individual pieces is probably a function of mobility (or at least potential mobility). As such, I think trying to optimize one variable at a time may be an endless job of adjusting one parameter only to have it throw previously "optimized" parameters out of kilter. For now, I look at evaluation as more heuristic than formula, and I simply want to arrive at a set of weights that produces a winning or strong engine (as we all do). Even a good local maximum will do for now. Evolution is one thing, sure, but there is also the routine nature of reproduction that yields better or worse results just by luck of the draw. And while your son or daughter may grow up to be a star quarterback or midfielder and go on to have a stellar career in sports, that doesn't mean they are destined to change the fate of the entire human race. All I'm looking for is that star quarterback. :D
Last edited by JoAnnP38 on Tue Dec 13, 2022 6:29 pm, edited 1 time in total.
JoAnnP38
Posts: 253
Joined: Mon Aug 26, 2019 4:34 pm
Location: Clearwater, Florida USA
Full name: JoAnn Peeler

Re: RFC: There Can Only Be One (Eval Optimization)

Post by JoAnnP38 »

emadsen wrote: Tue Dec 13, 2022 6:10 pm I don't have any experience using genetic algorithms to tune my chess engine's evaluation. However, you may want to read Thomas Petzke's blog posts regarding Population Based Incremental Learning, a genetic algorithm. Either before or after your experiment. Thomas is the author of iCE, a strong chess engine (previously known as mACE).

I remember enjoying reading his blog posts about genetic algorithms. Almost ten years ago.
THANK YOU for that pointer! I know I'm walking territory that has been walked years before, but it has always interested me. And now that I have tons of time on my hands, I really want to experience the discovery first-hand. I will be sure to check out that blog. In fact, I'll probably take a look first thing this afternoon when I get back from some errands. I especially want to do it before I start finalizing a design. Thanks again!
Chessnut1071
Posts: 313
Joined: Tue Aug 03, 2021 2:41 pm
Full name: Bill Beame

Re: RFC: There Can Only Be One (Eval Optimization)

Post by Chessnut1071 »

JoAnnP38 wrote: Tue Dec 13, 2022 4:47 am [...] If you have any experience (or even no experience) with genetic algorithms I would welcome your feedback. [...]
Your optimization structure sounds very similar to Frontline Systems' evolutionary optimization system, which randomly assigns weights to the decision variables. I discovered how effective it was when I gave my class a "traveling salesman problem" with 36 cities and over 900 constraints. I programmed a brute-force approach to be certain the solution found was optimal. It took over 4 days for my brute-force optimization to find the optimum solution on a 2.6 GHz Intel i7. The student, using Frontline's software, got the optimum solution in 33 seconds on the same chip; however, you can never be sure it is the optimum using an evolutionary approach. I think that software is available to the public, and they have a version running in Excel.
JoAnnP38
Posts: 253
Joined: Mon Aug 26, 2019 4:34 pm
Location: Clearwater, Florida USA
Full name: JoAnn Peeler

Re: RFC: There Can Only Be One (Eval Optimization)

Post by JoAnnP38 »

Chessnut1071 wrote: Wed Dec 14, 2022 5:33 am [...] I discovered how effective it was when I gave my class a "traveling salesman problem" with 36 cities and over 900 constraints. [...] however, you can never be sure it is the optimum using an evolutionary approach.
Funny that you mention the traveling salesman problem, because that's how I was introduced to genetic algorithms back in the 80s. I remember programming the GA in LISP on a Sun workstation. Good memories. And as you pointed out, it always arrived at a solution much faster than the brute-force method. We also used the GA on the Knight's Tour problem, and it was equally effective. The only things I'm doing differently from back then are that I'm not trying to encode the solution into a single bit string or use a simple, quick-to-calculate fitness function. Instead, my chromosome is an array of 860 genes (integral values), and I'm using tournament play to assign relative value to chromosomes. Also, instead of having just one crossover point as we did with bit strings, my crossover function gives child #1 a 50% probability of receiving each gene from parent #1 or parent #2. And child #2 always gets the other gene:

Code: Select all

	// Uniform crossover: each gene flips a fair coin to decide which
	// parent it comes from; child #2 always receives the other gene.
	for (int n = 0; n < 860; n++)
	{
	    if (rand.NextDouble() < 0.5d)
	    {
	        child1[n] = parent1[n];
	        child2[n] = parent2[n];
	    }
	    else
	    {
	        child1[n] = parent2[n];
	        child2[n] = parent1[n];
	    }
	}
Not shown above is that when a particular gene is assigned to child #1, there is a 0.01% chance that a mutation will occur and the gene will be replaced with a new random gene instead. I believe my implementation is still in the realm of genetic algorithms even though parts of it differ from the way I learned it decades ago. Where I fear my implementation may prove futile is that I do not know how fast convergence on a good solution will occur. It could be that tournament play as I have imagined it takes way too long, and instead of needing 10s or 100s of generations to arrive at convergence I would need 10,000s or 1,000,000s of generations. But it will be fun to find out all the same.
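For illustration, the per-gene mutation described above could be folded into the same uniform-crossover loop like this (a Python sketch: the 0.01% rate comes from the post, while the weight bounds are an assumed placeholder):

```python
import random

MUTATION_RATE = 0.0001   # 0.01% per gene, as described in the post

def crossover(parent1, parent2, rng=random):
    """Uniform crossover: each gene flips a fair coin; child #2 always
    receives the gene child #1 did not. A mutation replaces child #1's
    gene with a fresh random value (bounds here are hypothetical)."""
    child1, child2 = [], []
    for g1, g2 in zip(parent1, parent2):
        if rng.random() < 0.5:
            a, b = g1, g2
        else:
            a, b = g2, g1
        if rng.random() < MUTATION_RATE:
            a = rng.randint(-100, 100)   # assumed weight bounds
        child1.append(a)
        child2.append(b)
    return child1, child2

c1, c2 = crossover([1] * 860, [2] * 860)
```

With 860 genes and a 0.01% per-gene rate, a child carries a mutation only about once every 12 crossovers on average, which is on the conservative side for a GA.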
Chessnut1071
Posts: 313
Joined: Tue Aug 03, 2021 2:41 pm
Full name: Bill Beame

Re: RFC: There Can Only Be One (Eval Optimization)

Post by Chessnut1071 »

JoAnnP38 wrote: Wed Dec 14, 2022 7:23 pm
[...] It could be that tournament play as I have imagined it takes way too long, and instead of needing 10s or 100s of generations to arrive at convergence I would need 10,000s or 1,000,000s of generations. But it will be fun to find out all the same.
Tournament games are the most common way of tuning your engine; however, did you ever try minimizing the search-order distance to the grandmaster move for specific openings and defenses? I had very good luck with it in four defenses against the English Opening. Problem is, my endgame algorithm played like a two-year-old child. Also, I wouldn't worry about speed if you're looking to the future. In less than 10 years, quantum computing will take speeds to levels not even imaginable today. The cost, unfortunately, will be an issue.
brianr
Posts: 540
Joined: Thu Mar 09, 2006 3:01 pm
Full name: Brian Richardson

Re: RFC: There Can Only Be One (Eval Optimization)

Post by brianr »

JoAnnP38 wrote: Tue Dec 13, 2022 4:47 am If you have any experience (or even no experience) with genetic algorithms I would welcome your feedback. If everything runs as hoped each generation will consist of 256 games in the tournament followed by 64 games in sudden death. At 5-10 seconds allocated per games that will result in 25 - 55 minutes per generation. If it runs 100 generations without finding convergence that could take 100 hours of processing.
Among many other tuning/optimization approaches, I tried GA about 10 years ago with my private engine Tinker.
First, I tried doing only piece material values using the approach from here (only reference I can easily find in the ancient code):

Code: Select all

/* ga.c                       */
/* Genetic Algorithm Program  */
/* By Will Lennon             */
/* Started 1-28-95            */
/* Updated 10-4-95            */
 
Then I tried several hundred parameters after adopting Crafty's personality approach using:

Code: Select all

/*********************************************************************/
/*                                                                   */
/*  FILENAME : ~/CExamples/GAs_1/multistart.c                        */
/*                                                                   */
/*  AUTHOR : K Williams                                              */
/*                                                                   */
/*  LAST UPDATE : 1 / Mar / 1997                                     */
/*                                                                   */
/*  COMMENTS :                                                       */
/*                                                                   */
/*  This is a multi-start (steepest ascent) iterative hill-climbing  */
/*  algorithm as outlined in Michalewicz (pgs 26-28).                */ 
/*                                                                   */
/*  COMPILER : gcc -O -o multistart multistart.c -lm                 */
/*                                                                   */
/*********************************************************************/
Neither one provided anything significant, given the hardware available.
I quickly moved on to Texel-type tuning, before abandoning that for NNs with the advent of Giraffe.
Prior to NNs, all of the many tuning techniques (CLOP, TD, GA, Texel) were somewhat useful for a first pass with semi-reasonable values, but could never improve much on what is now termed HCE (hand crafted evaluation). NNs, of course, are astonishingly good. Still, it was fun to fiddle with the different tuning ideas, so good luck.
User avatar
lithander
Posts: 915
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: RFC: There Can Only Be One (Eval Optimization)

Post by lithander »

Interesting thread!
JoAnnP38 wrote: Wed Dec 14, 2022 7:23 pm It could be that tournament play as I have imagined it takes way too long and instead of needing 10s or 100s of generations to arrive a convergence I would need 10,000s or 1,000,000s of generations. But it will be fun to find out all the same.
I'm really curious about that, too. My gut feeling is that with 860 individual weights it could take much more than 100 generations and rather large populations, because each individual weight has only a small impact on the fitness. But that wouldn't be a problem if the generations could be computed very quickly. Sadly, playing matches at a reasonable search depth always takes a lot of cycles. And so I've been wondering if a different fitness function could be used?

For example, what I'm currently doing in Leorik is using gradient descent to minimize the mean-squared error of my evaluation function over a set of annotated positions. That converges to a usable set of weights within minutes. And now I'm wondering: what if I used a GA instead of gradient descent? As a fitness function, the MSE of the eval on a set of positions doesn't sound too bad.
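To make that idea concrete, here is a hedged Python sketch of such a fitness function: the negated mean-squared error between game results and a sigmoid of the eval. The scaling constant and the dot-product eval stub are assumptions for illustration, not Leorik's actual code.

```python
import math

def expected_score(cp_eval, k=400):
    """Map a centipawn eval to an expected score in [0, 1];
    k is a tunable scaling constant (the value here is an assumption)."""
    return 1.0 / (1.0 + 10.0 ** (-cp_eval / k))

def mse_fitness(weights, positions, evaluate):
    """positions: list of (features, result) with result in {0, 0.5, 1}.
    evaluate(weights, features) -> centipawn score. Lower MSE means a
    better fit, so the value is negated to serve as a GA fitness."""
    err = 0.0
    for features, result in positions:
        err += (result - expected_score(evaluate(weights, features))) ** 2
    return -err / len(positions)

# toy example: the eval is a dot product of weights and feature counts
dot = lambda w, f: sum(wi * fi for wi, fi in zip(w, f))
positions = [([1, 0], 1.0), ([-1, 0], 0.0), ([0, 0], 0.5)]
fit = mse_fitness([200, 50], positions, dot)
```

Unlike tournament play, this gives every chromosome in a generation an absolute score from the same position set, so a whole generation can be evaluated in seconds rather than hours of games.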
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess