zamar wrote:
The method is a practical approach and not mathematically very sound. Because algorithm is very simple, it's very likely already invented a long time ago.

It is not so unsound. It is like the SPSA algorithm, except SPSA does not use self-play. You can read about SPSA there, if you are interested:

http://www.jhuapl.edu/SPSA/

To guarantee convergence of SPSA, it is necessary to decay both the deltas (the perturbation sizes) and the learning rate over time.
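
To make the decay concrete, here is a minimal SPSA sketch in Python. It is only an illustration under stated assumptions, not the method discussed in this thread: the function name `spsa_minimize`, the constants `a`, `c` and `A`, and the toy loss are all hypothetical. The exponents 0.602 and 0.101 are the values commonly suggested in the SPSA literature, and it has no self-play: it simply minimizes a noisy function.

```python
import random


def spsa_minimize(loss, theta, iterations=2000, a=0.2, c=0.1, A=100.0,
                  alpha=0.602, gamma=0.101, rng=None):
    """Minimize a noisy loss with SPSA (illustrative sketch, assumed defaults)."""
    rng = rng or random.Random(0)
    theta = list(theta)
    for k in range(iterations):
        # Both gain sequences decay with the iteration count k.
        a_k = a / (k + 1 + A) ** alpha   # learning rate
        c_k = c / (k + 1) ** gamma       # delta (perturbation size)

        # Perturb all parameters at once with random +/-1 signs.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]

        # Two noisy evaluations give a gradient estimate for every parameter.
        diff = loss(plus) - loss(minus)
        theta = [t - a_k * diff / (2.0 * c_k * d)
                 for t, d in zip(theta, delta)]
    return theta


if __name__ == "__main__":
    noise = random.Random(1)
    target = [3.0, -2.0]  # hypothetical optimum of the toy problem

    def noisy_loss(x):
        # Quadratic bowl plus a little Gaussian evaluation noise.
        return sum((xi - ti) ** 2 for xi, ti in zip(x, target)) + noise.gauss(0.0, 0.01)

    print(spsa_minimize(noisy_loss, [0.0, 0.0]))
```

Even in this toy version there are five meta-parameters (`a`, `c`, `A`, `alpha`, `gamma`), and how fast it converges depends strongly on all of them.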

As I mentioned in my paper, SPSA has the potential to be close in performance to CLOP, but its main weakness (as Joona says) is that it is very difficult to choose good values for all its meta-parameters. In my experiments, SPSA with optimal meta-parameters performs like CLOP. But in practice, it is not possible to find the optimal meta-parameters of SPSA, so I'd prefer using CLOP.

I did not understand the part about ampli-bias knobs.

Rémi