Why computing K that minimizes the sigmoid func. value?...

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

petero2
Posts: 685
Joined: Mon Apr 19, 2010 7:07 pm
Location: Sweden
Full name: Peter Osterlund

Re: Why computing K that minimizes the sigmoid func. value?.

Post by petero2 »

cdani wrote:Another thing I just thought of: if I have already gained 25 Elo in self-play, I suppose it is better to run a new batch of 60,000 games with the latest best version, as the new parameter values being tested will correlate better with the results of the games, so they will be optimized better.
That might be better. The first time I did this I got an additional +13 elo in hyper-bullet self test. Trying it again later has not helped much though.
cdani wrote:But if this is true, why not use games from a stronger engine like Stockfish? That would be even better.
That could work, but I think the games from the stronger engine should also be hyper-bullet games, so that they contain some mistakes leading to unbalanced positions.

Personally I did not want to improve my engine by using data generated by an engine that I had not created myself, even though I don't think it would have been morally or legally wrong to do so.
mvk
Posts: 589
Joined: Tue Jun 04, 2013 10:15 pm

Re: Why computing K that minimizes the sigmoid func. value?.

Post by mvk »

petero2 wrote:2. Tactical aspects of a position that can not reasonably be modeled by the evaluation function.
Correct. But there are certain values you can expect from a raw eval or a qsearch that don't change much from program to program. Once search gets involved, the error drops, and does so differently for each program. Here you can see the effect on sqrt(E) for different searches.

[Image: sqrt(E) for different searches]
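For reference: E here is presumably the usual Texel-tuning mean squared error between game results and the sigmoid of the engine's score. A minimal sketch of computing it; the function name and the scores[]/results[] arrays are assumptions for illustration, not mvk's actual code:

Code:

#include <math.h>

/* Mean squared error between game results (1, 0.5, 0) and the sigmoid
   of the engine's score; K is the scaling constant from the thread
   title. Illustrative sketch only. */
double tuning_error(const double *scores, const double *results,
                    int n, double K)
{
	double e = 0.0;
	for (int i = 0; i < n; i++) {
		double p = 1.0 / (1.0 + pow(10.0, -K * scores[i] / 400.0));
		e += (results[i] - p) * (results[i] - p);
	}
	return e / n; /* the plot above shows the sqrt of this */
}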

I think it can be useful to go from qsearch to 1 or 2 ply searches if you have many parameters with little coverage per parameter (such as piece/square tables with a parameter for each individual square).

The problem I still have with this method is that correlation isn't causation, and there will be selection bias in the input games. For example, a white bishop on h7 might be very bad on average in normal play. But if players know that, they will avoid the bad cases, and then the input games only show the cases where it works. The data from such games is then biased. Adding a search might solve that, because you bring a bit of that tradeoff into the optimisation. But in the end, playing games with the new vector is still the best thing, if it weren't so expensive.
User avatar
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Why computing K that minimizes the sigmoid func. value?.

Post by cdani »

So I published the new version of Andscacs with this self-tuning done.

The first 4-6 iterations of the tuning were the best in terms of improvement; I was testing after each iteration. The iteration after the last good one was really bad, so I decided to generate a new batch of games.

This time I obtained improvements, small ones, until the 9th iteration.

Here I tuned the piece values alone, which were not included in previous tunings, until the tuning finished. It was a small win. I have yet to try partial tuning.

Then I added parameters related to passed pawns and tuned everything related to passed pawns alone, obtaining a small win, but only in the first 4 iterations.

Then I tuned some new knight-related parameters, which converged quickly and gave a small win.

A new general tuning attempt was bad from the first iteration, so I decided to stop there.

For some of the parameters I use, instead of a static value, a proportional value that is increased/reduced in powers of two of the base value (and sums of them). To tune these I used this function:

Code:

int incrementa_redueix_proporcionalment(int v, int increment) {
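	/* 'increment' in [-12, 12] selects a fraction of v, in 64ths (see
	   the //n comments below), to add to or subtract from v; the mg and
	   eg halves are shifted separately and repacked with FerPun. */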
	int r, j;
	if (increment == 0)
		return v;
	//64
	j = abs(increment);
	if (j == 1)
		r = FerPun(PunI(v) >> 5, PunF(v) >> 5); //2
	else if (j == 2)
		r = FerPun(PunI(v) >> 4, PunF(v) >> 4); //4
	else if (j == 3)
		r = FerPun(PunI(v) >> 3, PunF(v) >> 3); //8
	else if (j == 4)
		r = FerPun(PunI(v) >> 3, PunF(v) >> 3) + FerPun(PunI(v) >> 4, PunF(v) >> 4); //12
	else if (j == 5)
		r = FerPun(PunI(v) >> 2, PunF(v) >> 2); //16
	else if (j == 6)
		r = FerPun(PunI(v) >> 2, PunF(v) >> 2) + FerPun(PunI(v) >> 4, PunF(v) >> 4); //20
	else if (j == 7)
		r = FerPun(PunI(v) >> 2, PunF(v) >> 2) + FerPun(PunI(v) >> 3, PunF(v) >> 3); //24
	else if (j == 8)
		r = FerPun(PunI(v) >> 1, PunF(v) >> 1); //32
	else if (j == 9)
		r = FerPun(PunI(v) >> 1, PunF(v) >> 1) + FerPun(PunI(v) >> 3, PunF(v) >> 3); //40
	else if (j == 10)
		r = FerPun(PunI(v) >> 1, PunF(v) >> 1) + FerPun(PunI(v) >> 2, PunF(v) >> 2); //48
	else if (j == 11)
		r = FerPun(PunI(v) >> 1, PunF(v) >> 1) + FerPun(PunI(v) >> 2, PunF(v) >> 2) + FerPun(PunI(v) >> 3, PunF(v) >> 3); //56
	else if (j == 12)
		r = v; //64
	return increment > 0 ? v + r : v - r;
}
PunI means the mg value and PunF the eg value; FerPun builds the combined value from an mg/eg pair.

I don't know if anyone is using something similar; I have not seen anything like this in other engines.
User avatar
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Why computing K that minimizes the sigmoid func. value?.

Post by Laskos »

cdani wrote:So I published the new version of Andscacs with this self-tuning done.
A good tuning method for fooling the Similarity detector. 0.84 versus 0.83 has a 50% similarity hit, much below the 65%-75% typical of successive versions of an engine, and below the 60% at which an engine starts to test positive as a derivative.
User avatar
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Why computing K that minimizes the sigmoid func. value?.

Post by cdani »

Laskos wrote:
cdani wrote:So I published the new version of Andscacs with this self-tuning done.
A good tuning method for fooling the Similarity detector. 0.84 versus 0.83 has a 50% similarity hit, much below the 65%-75% typical of successive versions of an engine, and below the 60% at which an engine starts to test positive as a derivative.
Curious. I have never used this tool. But I think your result is very logical, as hand tuning is, well, very manual :-)
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Why computing K that minimizes the sigmoid func. value?.

Post by Evert »

cdani wrote:So I published the new version of Andscacs with this self-tuning done.

The first 4-6 iterations of the tuning were the best in terms of improvement; I was testing after each iteration. The iteration after the last good one was really bad, so I decided to generate a new batch of games.
Beware that finding a drop in strength after an iteration doesn't necessarily mean that it'll keep getting worse: the next iteration might be better again. The landscape is unlikely to be a nice smooth surface with a well-defined minimum that you can find easily; sometimes you have to "climb a hill" to reach the (locally optimal) minimum.
Did you check whether the residual of the evaluation was still getting better? If you see that a new set of evaluation parameters doesn't improve the residual, it may be worth reducing the amount by which the parameters are adjusted.
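In sketch form, the kind of iteration being discussed looks something like this (illustrative only, not anyone's actual code; residual is assumed to recompute E over the training positions for the current parameters):

Code:

/* One local-search pass per parameter; halve the step once a full
   pass no longer improves the residual. Illustrative sketch only. */
void tune(int *params, int n, double (*residual)(void))
{
	double best = residual();
	int step = 8; /* start coarser than +/-1 */
	while (step > 0) {
		int improved = 0;
		for (int i = 0; i < n; i++) {
			for (int sign = -1; sign <= 1; sign += 2) {
				params[i] += sign * step;
				double e = residual();
				if (e < best) {
					best = e;
					improved = 1;
				} else {
					params[i] -= sign * step; /* undo */
				}
			}
		}
		if (!improved)
			step /= 2; /* reduce the adjustment amount */
	}
}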
cdani wrote:For some of the parameters I use, instead of a static value, a proportional value that is increased/reduced in powers of two of the base value (and sums of them). To tune these I used this function:

[code snipped; see the full function in the post above]
PunI means the mg value and PunF the eg value; FerPun builds the combined value from an mg/eg pair.

I don't know if anyone is using something similar; I have not seen anything like this in other engines.
I can't really tell what that code does (the function names are meaningless to me); perhaps you can explain more clearly what it's supposed to do?
User avatar
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Why computing K that minimizes the sigmoid func. value?.

Post by cdani »

Evert wrote: Beware that finding a drop in strength after an iteration doesn't necessarily mean that it'll keep getting worse: the next iteration might be better again.
I thought a little about it, but decided to go for the easier path. I will probably try again with a new set of games. The problem is that a single iteration over the 1000+ parameters takes around 7-8 hours, even on a powerful 6-core CPU.
Evert wrote: Did you check whether the residual of the evaluation was still getting better?
If you mean the "e" value, yes, it was getting better.
Evert wrote: If you see that a new set of evaluation parameters doesn't improve the residual, it may be worth reducing the amount by which the parameters are adjusted.
I was already using the minimum increase/reduction for every parameter, +1 or -1.
Evert wrote: I can't really tell what that code does (the function names are meaningless to me); perhaps you can explain more clearly what it's supposed to do?
Sorry, I went too fast.

Some parameters of the evaluation function are relative to others. For example, I diminish the knight's base piece-square table value depending on whether it is outposted or semi-outposted. For passed pawns I have 5 different ways to increase/reduce their base value depending on various conditions. I have other similar parameters elsewhere too.

Those parameters are not fixed values; they modify the base value by increasing/reducing it in proportion to its current value. So with this function I tuned which proportion was the best one for increasing/reducing the value.

An example. Suppose the PST value of a knight is (20 mg, 15 eg), but a pawn can menace it sooner or later. The tuned parameter for this case has the value -7, so following the function, the final value will be:
(20 mg, 15 eg) - ((20 >> 2, 15 >> 2) + (20 >> 3, 15 >> 3)) = (20, 15) - ((5, 3) + (2, 1)) = (13 mg, 11 eg)
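Under a hypothetical mg/eg packing (for illustration only; the real FerPun/PunI/PunF in Andscacs may differ), that example corresponds to a call like this:

Code:

/* Hypothetical 16-bit packing, illustration only - not the Andscacs macros. */
#define FerPun(mg, eg) (((mg) << 16) + (eg))
#define PunI(v)        ((v) >> 16)    /* mg half */
#define PunF(v)        ((v) & 0xFFFF) /* eg half */

int v = FerPun(20, 15); /* knight PST value: (20 mg, 15 eg) */
int w = incrementa_redueix_proporcionalment(v, -7);
/* j == 7: r = FerPun(20 >> 2, 15 >> 2) + FerPun(20 >> 3, 15 >> 3)
          = FerPun(5, 3) + FerPun(2, 1) = FerPun(7, 4)
   so w = v - r = FerPun(13, 11), i.e. (13 mg, 11 eg) as above. */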

If the PST value of the knight was initially negative, this reduction is not applied.

For each parameter of this type, the tuning process can iterate "increment" (the name of the variable) from -12 to 12. The comments like //12 and //40 are examples of the value that would be added/subtracted if the entered value (v) were 64.

I have yet to try increasing/reducing the mg and eg parts proportionally but separately.
User avatar
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Why computing K that minimizes the sigmoid func. value?.

Post by cdani »

I'm tuning some of the parameters again by hand, as the Texel tuning method, even though I tried to control most changes, has tuned some of them for fast time controls.

This became evident when I started to see badly played games from the new version at long time controls. The result is better in general, but Andscacs has clearly lost a lot of balance.

The good part is that, because I have worked by hand most of the time, I have reference values for the parameters that tend to be very sensitive to time control, so I can go directly to changes that will probably work; for the moment I can regain some strength and balance without much work.
User avatar
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Why computing K that minimizes the sigmoid func. value?.

Post by cdani »

Also, when you look at CCRL 40/40 and CCRL 40/4 it is clear something has gone wrong, as 0.84 has a higher rating (on one CPU, for the moment) at 40/4, when the objective is the other way around, and not by a small margin but by at least 30 Elo.
tpetzke
Posts: 686
Joined: Thu Mar 03, 2011 4:57 pm
Location: Germany

Re: Why computing K that minimizes the sigmoid func. value?.

Post by tpetzke »

cdani wrote:Also, when you look at CCRL 40/40 and CCRL 40/4 it is clear something has gone wrong, as 0.84 has a higher rating (on one CPU, for the moment) at 40/4, when the objective is the other way around, and not by a small margin but by at least 30 Elo.
But another interpretation could be that your eval is superior to that of your opponents, so in short TC matches (where the eval is under more pressure) your engine is very strong. In long TC matches your opponents can compensate a bit for their worse eval by searching longer (the eval is under somewhat less pressure).

As I don't have the resources to test at long TC anyway, I don't care. I tune at short TC, and at long TC it is what it is.
Thomas...

=======
http://macechess.blogspot.com - iCE Chess Engine