SPSA problems

Ralf Müller · Post by **Ralf Müller** » Mon Apr 03, 2017 1:33 am

Hi guys,

I've got two problems with my SPSA tryings for my own engine.
Problem 1: The engine is constantly crashing after ca. 300 iterations because of too much RAM use (I have 4 GB RAM)
Problem 2: The tuning seems to be broken. The values always increases, no matter if I take ridiculous high or small values

Any hints for either of the problems?
Thank you!

jdart · Post by **jdart** » Mon Apr 03, 2017 3:47 am

Re 2., you should verify that it converges if you run the tuning code on a smooth convex function with a known minimum.

SPSA has several hyperparameters and you may need to experiment to find workable values.

--Jon

Joerg Oster · Post by **Joerg Oster** » Mon Apr 03, 2017 4:01 am

Ralf Müller wrote:Hi guys,

I've got two problems with my SPSA tryings for my own engine.
Problem 1: The engine is constantly crashing after ca. 300 iterations because of too much RAM use (I have 4 GB RAM)
Problem 2: The tuning seems to be broken. The values always increases, no matter if I take ridiculous high or small values

Any hints for either of the problems?
Thank you!

Are you using Joona's perl script?
I never experienced any of the problems, and I did run MANY tuning sessions.

Ralf Müller · Post by **Ralf Müller** » Mon Apr 03, 2017 12:11 pm

Thanks for the reply!
I used this script: https://github.com/zamar/spsa

Ralf Müller · Post by **Ralf Müller** » Mon Apr 03, 2017 12:16 pm

Thanks!
I used this script and tried to change the values for ck and Rk.
How can I verify that it converges? Why it's so different with every engine and parameter?

Joerg Oster · Post by **Joerg Oster** » Mon Apr 03, 2017 12:38 pm

Ralf Müller wrote:Thanks for the reply!
I used this script: https://github.com/zamar/spsa

I never experienced any crashes with Stockfish using this script.

An idea to try is to always restart the engine before playing the minimatch.
Move the engine_init() right before the step where the 2 games are played,
and afterwards do engine_quit().
This will increase the duration time for the whole tuning process, but should be rock-solid.
I always do this in my local tuning runs to avoid possible issues with pawn and material hash because they don't get cleared after a 'ucinewgame' in Stockfish.
You can do a test run with 500 or 1,000 iterations to see, if it helps.

Regarding your second problem it's difficult to help without knowing what you want to tune,
and how you configured the tuning setup.

Feel free to PM me for further help.

BeyondCritics · Post by **BeyondCritics** » Tue Apr 04, 2017 11:34 pm

Ralf Müller wrote:Hi guys,
...
Problem 2: The tuning seems to be broken. The values always increases, no matter if I take ridiculous high or small values

This could be a symptom of overparametrization. Would that be possible?

Ralf Müller · Post by **Ralf Müller** » Wed Apr 05, 2017 10:47 am

Thanks for the reply!
The engine send wrong information about the mate score to the GUI so the tuning didn't work well. Now I fixed it and it seems to work.
But dependent on the ck the outcome is very different (f.e. with ck at 32). If I understand correctly the outcome should always be the same, only the speed on converging should be varify. Isn't that correct?

BeyondCritics · Post by **BeyondCritics** » Thu Apr 06, 2017 7:22 am

Ralf Müller wrote:Thanks for the reply!
...
But dependent on the ck the outcome is very different (f.e. with ck at 32). If I understand correctly the outcome should always be the same, only the speed on converging should be varify. Isn't that correct?

Unfortunately not. If you choose wrong parameters, you can mess it up completely. Furthermore, the spsa algorithm does not find the global optimum. If you set up a problem that has multiple local extrema, you must expect to get different results between runs.

Let's take a closer look. There are several parameters to play with.
In the example configuration file https://github.com/zamar/spsa/blob/master/aggr.conf we find:

Code: Select all

Iterations = 50000
A = 5000
Gamma = 0.101
Alpha = 0.602

Alpha and Gamma are set with recommendations from Spall, so maybe we should leave them at that. The "asymptotic optimal values" (i.e for very long experiments) are Alpha= 1.0 and Gamma=1/6.
You must have Alpha > Gamma, otherwise the values will simply blow up in magnitude.
Decide on the number of iterations you want to have. One iteration consists of two games with reversed colors.
Spall recommends to set A = 0.1*iterations, therefore this is presumably preset.
Next look at the parameters for each variable https://github.com/zamar/spsa/blob/master/aggr.var

Code: Select all

Aggressiveness,30,0,200,10,0.0020,0

they are documented in section 4 of the README https://github.com/zamar/spsa

Code: Select all

Variables file &#40;name of the file is defined in configuration file&#41; is a comma separated &#40;CSV&#41; file.
Columns are defined as follows&#58;

Column 0&#58; Variable name &#40;alphanumeric string&#41;
Column 1&#58; Variable initial value &#40;float&#41;
Column 2&#58; Variable minimum value &#40;float&#41;
Column 3&#58; Variable maximum value &#40;float&#41;
Column 4&#58; Perturbation "ck" for the last iteration &#40;float&#41;
Column 5&#58; Relative apply factor "Rk" for the last iteration &#40;float&#41;
Column 6&#58; For simulation mode, this defines the ELO decrease for point x = (+/-) 100 compared to point x = 0 &#40;optimum&#41; &#40;float&#41;.

Notes&#58; 
- When ck is defined for the last iteration and the number of iterations is known, it's easy to derive a value for c.
- When ck and Rk are defined for the last iterations, it's easy to derive ak for the last iteration. Based on that we can derive value for a.

We learn, that the floats ck and Rk refers to the c_k and r_k of the last iteration. c_k is the stepsize tried with the parameter and R_k is the relative percentage of this step actually taken on success.
Therefore in this example you try a (clipped) step of 10 in each direction and later you add 0.002*10 = 0.02 into the direction of the winner.
You need therefore a difference of 50 won games to earn one point of "Aggressivness".

Ralf Müller · Post by **Ralf Müller** » Thu Apr 06, 2017 9:38 am

Thanks for your answer!

BeyondCritics wrote:Unfortunately not. If you choose wrong parameters, you can mess it up completely. Furthermore, the spsa algorithm does not find the global optimum. If you set up a problem that has multiple local extrema, you must expect to get different results between runs.

What kind of chess evaluation function could have multiple local extrema?
Isn't it possible to find a global optimum if I choose a high ck?

SPSA problems

SPSA problems

Re: SPSA problems

Re: SPSA problems

Re: SPSA problems

Re: SPSA problems

Re: SPSA problems

Re: SPSA problems

Re: SPSA problems

Re: SPSA problems

Re: SPSA problems