Training data

Discussion of chess software programming and technical issues.

Moderators: hgm, Dann Corbit, Harvey Williamson

Desperado
Posts: 782
Joined: Mon Dec 15, 2008 10:45 am

Training data

Post by Desperado » Tue Jan 12, 2021 1:54 pm

Recent experience has reminded me once again how crucial useful data is for tuning.
I have read some posts about this in the past; nevertheless, I would like to discuss a handful of questions on the topic here.

1. source of the data

1a. A quick way to get data is to use public game collections.
The quality of the existing data has to be considered; today, engine databases like CCRL are a superb resource.
Grandmaster games, or games involving humans in general, can have their own charm, especially
when it comes to the stylistic side of the parameters. In short, the huge advantage is that the data is immediately available.

1b. Of course, with a pool of engines you can generate suitable games yourself, at whatever level you like for your tuning.
This costs more time compared to 1a, but you can determine the quality level yourself.

1c. As in the previous step, you generate data from games, but now games in which your own engine is involved.
This again costs more time than the first option, but in my opinion it offers the most interesting possibilities.
The level and the quality of the generated data are under your control, and perhaps the biggest advantage
is that the data correlates ideally with the "skills" of your own engine.


2. selection of positions

It goes without saying that the parameters to be tuned must actually occur in the training data.
If we aim at a general tuning of the evaluation function, the difficulty is to select the data
so that every parameter is represented with a suitable weight. "Fitting" the data to the parameters is the challenge here.

In addition, quiet positions play a crucial role. Although every experienced engine developer
knows roughly what is meant by the term, it is not uniformly defined. Mostly, a position is considered
quiet (and is evaluated) after running the quiescence search.

The most advanced approach known to me is to search a position, determine the principal variation, and store
the leaf of that PV as the training position. I would call this a strong choice of position: a quiescence search
generated the evaluation, and the value has been propagated through the tree to the root node. What is the real background or idea behind this?

What other techniques are used to select quiet positions, maybe static ones like capture moves, checks, SEE or others?
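One of the static criteria mentioned above, SEE, can be sketched without any board representation at all. Below is a minimal sketch of the classic swap-list algorithm; the centipawn piece values and the calling convention (pre-sorted attacker/defender lists) are my own assumptions, not taken from any engine in this thread:

```python
def see(captured_value, attacker_values, defender_values):
    """Static exchange evaluation on a single square (sketch).
    captured_value: value of the piece initially captured.
    attacker_values / defender_values: values of each side's pieces that
    can land on the square, least valuable first; attacker_values[0]
    makes the initial (forced) capture, all recaptures are optional."""
    gains = [captured_value]
    occupier = attacker_values[0]                # piece now on the square
    queues = [list(defender_values), list(attacker_values[1:])]
    side = 0                                     # 0 = defenders recapture next
    while queues[side]:
        piece = queues[side].pop(0)
        gains.append(occupier - gains[-1])       # speculative gain of recapturing
        occupier = piece
        side ^= 1
    for d in range(len(gains) - 1, 0, -1):       # minimax the swap list backwards
        gains[d - 1] = -max(-gains[d - 1], gains[d])
    return gains[0]
```

For example, a pawn (100) capturing a queen (900) that is defended by a pawn scores 800, while a queen capturing a pawn defended by a pawn scores -800.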

3. quiet and balanced positions

This distinction, which is rarely if ever considered, has always plagued me.

If the static score is, say, 10 and the search returns a value of 20, that says nothing about the tactical situation.
The tactics may simply end in balance.

A quiescence search returns a quiet value because it stops in a position that has no capture move (or check).
Within a full search, however, this is blurred, because positions with a score > beta are not examined further at all.

What kind of positions do we need to get good training data: balanced or quiet positions?
I think quiet positions as they occur in a real search are rare in a game database.
For me this would be one of the main reasons why the PV leaf could be more suitable as a training position.
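For reference, the "fitting" discussed here is usually done by minimising the Texel-tuning error: the mean squared difference between the game result and a sigmoid of the (quiet) score. A minimal sketch, where the scaling constant K = 400 is an assumption that is normally fitted to the data beforehand:

```python
def texel_mse(samples, K=400.0):
    """Texel tuning error over (score_cp, result) pairs, where result is
    1.0 / 0.5 / 0.0 for win / draw / loss from White's point of view.
    K is an assumed scaling constant, normally fitted to the data first."""
    def win_prob(score_cp):
        # logistic mapping from centipawns to expected score
        return 1.0 / (1.0 + 10.0 ** (-score_cp / K))
    return sum((r - win_prob(s)) ** 2 for s, r in samples) / len(samples)
```

A score of 0 maps to an expected result of 0.5, so a drawn level position contributes zero error; the tuner nudges the evaluation parameters to minimise this sum over the whole training set.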


As always, I look forward to any feedback.

maksimKorzh
Posts: 630
Joined: Sat Sep 08, 2018 3:37 pm
Location: Ukraine
Full name: Maksim Korzh
Contact:

Re: Training data

Post by maksimKorzh » Tue Jan 12, 2021 2:27 pm

Desperado wrote:
Tue Jan 12, 2021 1:54 pm
[…]
Did you have a look at the training data I've extracted from gm2600.pgn?
http://talkchess.com/forum3/viewtopic.php?f=7&t=76251
Wukong Xiangqi (Chinese chess engine + apps to embed into 3rd party websites):
https://github.com/maksimKorzh/wukong-xiangqi

Chess programming YouTube channel:
https://www.youtube.com/channel/UCB9-pr ... KKqDgXhsMQ

Pio
Posts: 264
Joined: Sat Feb 25, 2012 9:42 pm
Location: Stockholm
Contact:

Re: Training data

Post by Pio » Tue Jan 12, 2021 2:45 pm

Desperado wrote:
Tue Jan 12, 2021 1:54 pm
[…]
I think the main thing you should ask yourself is: what type of positions is my engine actually evaluating? For example, if your engine doesn't statically evaluate when in check, you shouldn't include such positions in your training data. The same goes for non-static evaluations, since their results won't propagate back to your root.

That is why it is very smart to use the leaf position whose value propagates back to the root: these are the most significant positions. However, when you tune your evaluation function, the leaf that propagates back to the root will change. So I guess the best method is to generate your own training data and use the PV leaf positions, and to regenerate the data after you have trained your eval so that the leaf positions match your new evaluation.

The same goes for the root positions. Over time your engine will evolve and play other openings and another type of chess, so you should update your root positions as well.
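The "PV leaf propagates back to the root" idea can be illustrated with a toy negamax over a hand-made game tree (the tree, node names, and scores below are hypothetical, purely for illustration): the position stored for tuning is the last node of the returned principal variation.

```python
def negamax(tree, node, color=1):
    """Toy negamax over a dict-shaped game tree: returns (score, pv).
    Leaf scores are from White's point of view; color is +1 or -1."""
    children = tree.get(node)
    if children is None:                       # leaf: "static evaluation"
        return color * tree["eval"][node], [node]
    best_score, best_pv = -10**9, None
    for child in children:
        score, pv = negamax(tree, child, -color)
        score = -score                         # flip to the parent's perspective
        if score > best_score:
            best_score, best_pv = score, [node] + pv
    return best_score, best_pv

# Hypothetical 2-ply game, White to move at the root:
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"],
        "eval": {"a1": 30, "a2": -20, "b1": 10}}
score, pv = negamax(tree, "root")
training_position = pv[-1]   # the PV leaf is what goes into the training set
```

Here the root score 10 is exactly the static score of the PV leaf "b1", which is why tuning on that leaf, rather than on the root position itself, ties the trained evaluation directly to the value the search actually used.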

Ferdy
Posts: 4527
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Re: Training data

Post by Ferdy » Tue Jan 12, 2021 4:16 pm

Training data needs to be of low quality at first. The positions should be bad enough that your current evaluation weights can see how bad they are; that way the tuner can adjust the weights. Then use somewhat improved training data, still bad enough that the engine has the opportunity to change its weights. Going forward, you are better off generating training data from the PV leaf, as you mentioned, because the leaf positions are almost always of lower quality than the root. We can then be sure that the engine has something to work on in its weights.

Generating/saving positions from engines that are weaker than your engine is also good. Your engine can probably see a better move/score, so it can still adjust its weights.

Another way is to generate training data from your self-play games (this is free, since after training you will usually test how strong the engine has become anyway), but select those games where your engine loses, or even better, find only the bad positions and include them in the training data for the next run.

Desperado
Posts: 782
Joined: Mon Dec 15, 2008 10:45 am

Re: Training data

Post by Desperado » Tue Jan 12, 2021 4:35 pm

maksimKorzh wrote:
Tue Jan 12, 2021 2:27 pm
Desperado wrote:
Tue Jan 12, 2021 1:54 pm
[…]
Did you have a look at traning data I've extracted from gm2600.pgn?
http://talkchess.com/forum3/viewtopic.php?f=7&t=76251
Hello Maksim, I have not had a closer look at it yet.

Getting positions from highly rated players or engines is not a problem; I have more than 100M positions available.
My interest is in which ideas are involved. This is the line of your code that relates to my post.

Code: Select all


if (self.evaluate() == self.quiescence(-50000, 50000)) and move_num > 5:

Why did you choose positions where the static evaluation equals the result of the quiescence search?
(This is my interpretation of your code snippet.)

Why did you not use something like abs(staticEval - quiesEval) < 40, or any other formula? What is your idea?
Or why don't you pick positions where staticEval < search_ply1_score, the idea being that staticEval is at least a lower-bound score?
Does your code express that you want balanced positions or quiet positions? These are not the same thing to me.
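The two selection rules contrasted in these questions can be written down side by side; a sketch, where the 40 cp margin is just the number floated above:

```python
def exact_quiet(static_eval, qs_eval):
    # maksimKorzh's filter: the static eval already equals the quiescence result
    return static_eval == qs_eval

def near_quiet(static_eval, qs_eval, margin=40):
    # relaxed variant: the tactics may move the score a little, but not much
    return abs(static_eval - qs_eval) < margin
```

Every position accepted by exact_quiet is also accepted by near_quiet; the relaxed filter trades purity of the sample for a larger one.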

You see, it is more important for me to understand the idea, the reasoning behind it. Actually implementing an idea is easy, most of the time.

Thanks for your feedback.

Pio
Posts: 264
Joined: Sat Feb 25, 2012 9:42 pm
Location: Stockholm
Contact:

Re: Training data

Post by Pio » Tue Jan 12, 2021 4:39 pm

Ferdy wrote:
Tue Jan 12, 2021 4:16 pm
[…]
In the beginning the training data will be of low quality if you generate it yourself 😀, but I don't think that should be a goal in itself. I think it is always much better to train on the data your own engine encounters, either in self-play or against other opponents, since that better represents the positions your engine will actually face.

I believe the training will be much better if you weight the positions close to the end of the game a lot more, since there the correlation between win/loss/draw and a position's evaluation is much higher. The more mature the engine gets, the more you can lower the weights toward the end so that they are more evenly distributed, since at that point the engine will have a much higher correlation between position score and game outcome.
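The weighting scheme suggested here, counting positions near the game's end more heavily, could look like the sketch below; the exponential decay and the half-life of 20 plies are my own assumptions, not Pio's:

```python
def weighted_mse(samples, halflife=20.0):
    """Weighted tuning error over (predicted_prob, result, plies_to_end)
    triples: positions at the game's end get weight 1, earlier positions
    exponentially less (halved every `halflife` plies)."""
    num = den = 0.0
    for prob, result, plies_to_end in samples:
        w = 0.5 ** (plies_to_end / halflife)
        num += w * (result - prob) ** 2
        den += w
    return num / den
```

Raising `halflife` as the engine matures flattens the weights, matching the suggestion that a mature eval can trust earlier positions more.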

maksimKorzh
Posts: 630
Joined: Sat Sep 08, 2018 3:37 pm
Location: Ukraine
Full name: Maksim Korzh
Contact:

Re: Training data

Post by maksimKorzh » Tue Jan 12, 2021 5:09 pm

Desperado wrote:
Tue Jan 12, 2021 4:35 pm
maksimKorzh wrote:
Tue Jan 12, 2021 2:27 pm
Desperado wrote:
Tue Jan 12, 2021 1:54 pm
[…]
EDIT: I have around 0.13 MSE with 30,000 positions.
What MSE do you have, and for what number of positions?
How would the MSE change if calculated on 1,000,000 positions instead of 30,000?
[…]
re: Why did you choose positions where the static evaluation equals the result of a quiet position ?
(this is my interpretation of your code snippet).

Absolutely correct interpretation.
I did it because in this case I can safely use eval() instead of quiescence() and get exactly the same mean squared error;
bearing in mind that eval is faster than quiescence, this makes sense in terms of saving time when calculating the MSE.

re: Why did you not use something like abs(staticEval - quiesEval) < 40 or any other formular, what is your idea?
Or why don't you pick positions where staticeval < search_ply1_score, the idea would be that staticeval is a lowerbound score at least?

Well most likely due to my natural dumbness))). It's an interesting idea to try)

re: Does your code express that you want to use balanced or quiet positions ?, that's not the same for me.

Definitely. Even though your ideas are very interesting and would most likely lead to better results, I still prefer my way
for the following reasons:
1. Calculating the MSE for 30K positions already takes several seconds (I know my implementation sucks)
2. I use only 1000 positions for the actual Texel tuning with my +-1 steps on each eval parameter

By selecting positions with the condition eval() == quiescence(), I ensure (or at least I like to think so) that the similarity of the
positions is much higher than with other formulas. I try to capture only the "general positional considerations",
dropping tactical opportunities completely: positions whose static score is not exact absorb some tactical influence (at least in my head),
so the eventual PST values would slightly reflect that.

So generally I'm trying to make the scores as plain as possible.

Fun fact: the best results I got so far when tuning the "simplified evaluation" values from CPW were obtained the following way:
1. Duplicate the endgame scores so they match the opening scores, for both PST and material weights
2. Change the pawn PST values

When I was tuning by hand, my tapered eval was always much worse than a non-tapered eval in which only the king PST distinguished game phases.
BUT once I started using the mean squared error as the measure for the "positional ideas" reflected in my PST scores,
it helped me at least reach a tapered eval with no larger MSE than the pure "simplified eval".
Then I started to add asymmetric values, e.g. on the pawns' 7th rank: all 50, but b7 = 170 and c7 = 90, so the rank reads 0, 170, 90, 50, 50, 50, 50, 0.
That not only minimized the MSE but also improved play.
Then I "smoothened" the values by running exactly 2 passes over all parameters with +-1 steps, using only 1000(!) positions to calculate the MSE.
More than 2 passes always resulted in visibly worse play, but after exactly 2 passes it played noticeably stronger.
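The two-pass +-1 procedure described here is the classic Texel local search. A sketch on a toy objective (the quadratic toy function and the starting point are made up; only the +-1 mechanics match the description):

```python
def tune_pm1(params, error_fn, passes=2):
    """Texel-style +-1 local search (sketch): nudge each parameter by
    +1 / -1 and keep the change whenever the error drops; the report
    above suggests stopping after two passes."""
    best = error_fn(params)
    for _ in range(passes):
        for i in range(len(params)):
            for step in (1, -1):
                trial = params[:i] + [params[i] + step] + params[i + 1:]
                err = error_fn(trial)
                if err < best:           # keep only improving nudges
                    params, best = trial, err
                    break
    return params, best

# Toy objective with a known optimum at (3, -2), just for illustration:
toy = lambda p: (p[0] - 3) ** 2 + (p[1] + 2) ** 2
params, err = tune_pm1([0, 0], toy, passes=4)
```

On a real eval, `error_fn` would be the MSE over the training positions; more passes overfit the small 1000-position sample, which would explain the "visibly worse play" observed beyond two passes.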

I managed to obtain a tapered version of the "simplified eval" that is 70 Elo stronger than the bare "simplified eval" values from CPW.
I also tried the same method with self-played games (500 games), but the mean squared error was much bigger and the same tuning resulted in much
worse play.

Assumption:
My way of extracting positions IMO results in very, very similar positions, so 1,000,000 of them won't really give a lower MSE than 30,000.
I didn't test it, though, since 1,000,000 would take ages with my weird implementation.
I'm trying to show that with very quiet positions it is enough to hand-tune asymmetric PST values, use the MSE as a measure of the correctness
of particular values/ideas, and then just slightly "smoothen"/randomize the PSTs, and that's it.

Not sure how far that will lead; I've discovered horrible bugs in my TT, so before fixing them I can't run tests on the eval)

Desperado
Posts: 782
Joined: Mon Dec 15, 2008 10:45 am

Re: Training data

Post by Desperado » Tue Jan 12, 2021 5:22 pm

Pio wrote:
Tue Jan 12, 2021 2:45 pm
Desperado wrote:
Tue Jan 12, 2021 1:54 pm
[…]
Hi Pio,

Well, of course I can start with root nodes taken from any database, perform a search on them, and write the corresponding leaf positions of the PV to a file. Updating the data is indeed an important point as the engine becomes stronger.

This process can be improved if the engine generates the data while playing. A current game provides the "new" root positions, while the search provides the leaf node as the result of the PV. It can always be done with the latest version. I would only take a predefined number of positions from any game.

Basically your answer matches what feels right to me too: the engine needs to be involved already when the training data is created.
Besides the effect on the tuner, it also gives you some independence from external resources.
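The scheme described above can be sketched roughly as follows. The helper names (`search`, `collect_pv_leaves`) and the toy search are assumptions for illustration; in a real engine, `search` would be your own search returning a score and a principal variation, and the last PV element is the leaf whose static evaluation propagated back to the root.

```python
import random

def collect_pv_leaves(game_positions, search, max_per_game=10):
    """Collect up to max_per_game PV-leaf positions from one game.

    `search` is assumed to return (score, pv) for a root position; the
    last element of the PV is the leaf whose evaluation actually
    propagated back to the root.
    """
    leaves = []
    for root in game_positions:
        score, pv = search(root)
        if pv:                      # skip roots with an empty PV
            leaves.append((pv[-1], score))
    # keep only a predefined number of positions per game
    random.seed(0)                  # deterministic sampling for the demo
    return random.sample(leaves, min(max_per_game, len(leaves)))

# Toy stand-in search: pretends every PV is three plies deep and
# scores the root by a trivial formula.
def toy_search(root):
    return (root * 10, [root, root + 1, root + 2])

sample = collect_pv_leaves(range(20), toy_search, max_per_game=5)
print(len(sample))   # 5
```

Regenerating the file after each tuning pass, as discussed above, simply means rerunning this collection step with the latest version of the engine.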

Re: Training data

Post by Desperado » Tue Jan 12, 2021 5:45 pm

Hi Maksim,
re: Why did you choose positions where the static evaluation equals the result of a quiescence search?
(this is my interpretation of your code snippet).

Absolutely correct interpretation.
I did it because in this case I can safely use eval() instead of quiescence() and get exactly the same mean squared error.
Bearing in mind that eval is faster than quiescence, this makes sense in terms of saving time when calculating the MSE.
Interesting!

But I think this is pretty random; you could just as well roll dice to pick a position. I believe that because as soon
as you update your eval, you lose this property on your dataset. The result of the quiescence search may change for your next generation,
unless you generate a new dataset that has this property again. Of course, that is just my first thought. It is important that you follow an idea.

But there is some intuition to exploit in this formula, because you should be able to measure the distance between the scores before and after.
I guess your quiescence search is deterministic: when you repeat it, you will get the same score again (obviously not when you have changed your evaluation scores).
I have to think about whether that can be used as some KPI. (I am poking in the fog.)
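The time saving being discussed makes sense in a Texel-style tuner, where the error is the mean squared difference between the game result and a sigmoid of the search score. A minimal sketch (the scaling constant `k` and the stand-in score functions are assumptions for illustration): if the dataset only contains positions where eval() and quiescence() agree, substituting the cheaper static eval leaves the MSE unchanged by construction.

```python
import math

def sigmoid(score, k=1.13):
    """Map a centipawn score to an expected game result in [0, 1]."""
    return 1.0 / (1.0 + 10.0 ** (-k * score / 400.0))

def mse(positions, evaluate):
    """Texel-style mean squared error over (position, result) pairs,
    where result is 1.0 / 0.5 / 0.0 from the game outcome."""
    total = sum((result - sigmoid(evaluate(pos))) ** 2
                for pos, result in positions)
    return total / len(positions)

# Stand-ins: each 'position' is just its stored centipawn score, and
# the dataset was filtered so that eval and qsearch agree on it.
data = [(35, 1.0), (-120, 0.0), (5, 0.5)]
static_eval = lambda p: p
qsearch     = lambda p: p          # equal by construction of the dataset
print(mse(data, static_eval) == mse(data, qsearch))  # True
```

This also shows why the property is fragile: once tuning changes the evaluation, eval and qsearch need no longer agree on those positions, and the shortcut silently stops being exact.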

Re: Training data

Post by Desperado » Tue Jan 12, 2021 6:17 pm

Ferdy wrote:
Tue Jan 12, 2021 4:16 pm
Training data needs to be of low quality at first. The positions should be bad enough that your current evaluation weights can see how bad they are; that way the tuner can adjust the weights. Then use much improved training data, but still bad, so that the engine still has the opportunity to change the weights. Going forward, you are better off generating training data from the PV leaf, as you mentioned, because leaf positions are almost always of lower quality than the root. We can then be sure the engine has something to work on with its weights.

Generating/saving positions from engines that are weaker than your engine is also good. Your engine can probably see a better move/score, so it can still adjust its weights.

Another way is to generate training data from your self-play games (this is free, as after training you usually test how strong the engine has become anyway), but select those games where your engine loses; even better, find only the bad positions and include them in the training data for the next training run.
Hi Ferdy,

it looks like you prefer the quality of a position as the selection criterion. As you describe it, there are two kinds of quality:
the search effort that was spent on a node, and the level of the chess entity that produced the node.

Maybe this will change as I gain experience on the topic, but for now I think it is more important that a position includes
many evaluation features, and correlates strongly with them, than that it poses an easy or difficult task.

Anyway, an interesting idea, especially because the Elo given in the PGN is already an indicator of that property.
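Ferdy's suggestion of mining lost self-play games can be sketched as a simple filter over game records. The record format (a dict with `white`, `black`, `result`, and `positions`) is a hypothetical assumption for illustration; in practice this would come from parsing your PGN output.

```python
def select_training_games(games, my_name="MyEngine"):
    """Keep the games the engine lost; their positions are the ones
    most likely to expose evaluation weaknesses worth training on.

    Each game is assumed to be a dict with 'white', 'black', 'result'
    ('1-0', '0-1' or '1/2-1/2'), and a list of 'positions'."""
    lost = []
    for g in games:
        lost_as_white = g["white"] == my_name and g["result"] == "0-1"
        lost_as_black = g["black"] == my_name and g["result"] == "1-0"
        if lost_as_white or lost_as_black:
            lost.append(g)
    return lost

games = [
    {"white": "MyEngine", "black": "Sparring", "result": "0-1", "positions": []},
    {"white": "Sparring", "black": "MyEngine", "result": "1-0", "positions": []},
    {"white": "MyEngine", "black": "Sparring", "result": "1-0", "positions": []},
]
print(len(select_training_games(games)))  # 2
```

A refinement in the same spirit would be to keep, from each lost game, only the positions where the score later collapsed, rather than the whole game.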
