mcostalba wrote: Actually IMHO you have not validated anything. Instead you have _tried_ to prove that your algorithm is better than others. But these are two different and _separate_ goals, and the first should be attempted before the second.

You are welcome to your opinion. However, I genuinely believe this work is an improvement on TD-based methods for learning from self-play, and I would therefore like to encourage others to try the idea. If the current set of experiments isn't convincing enough, maybe I can add some more material to my webpage to address this...
Suppose I did what you suggested earlier and repeated the experiment 10 times (I don't have the CPU time to consider any more) with different sets of random starting weights, and reported the results. Assuming they were similar to what was reported, would that be validation enough for you? If not, what would it take?
From my point of view, I am actually not that interested in just chess. When it comes down to it, there is no practical need to learn chess from scratch using self-play. A better validation of the idea (in my opinion) would be seeing the method work reliably across a number of different games, and across a number of different implementations.
mcostalba wrote: For your information, the person who said that mathematics is difficult is actually a mathematician. He, very kindly, used the word 'difficult' instead of the ruder 'useless', because he is a gentleman. I am not a mathematician, but I can read the paper too and I can understand the notation, and because I am _not_ a gentleman I will add that if an idea is sound it doesn't need to be 'defended' by abstract notation; it can be expressed in a simple working code snippet, which at the end of the day is more precise anyway, because you have to unambiguously define all the details.

It has nothing at all to do with 'defending' anything.
The reason to present the technique abstractly is to highlight that the method can, in principle, work with alternative forms of parametrized evaluation function (anything that can be optimized with gradient descent), rather than only the linear evaluation function used in the paper. For example, you could train an ANN with it (though I am not saying it would work well).
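To make the "anything optimizable by gradient descent" point concrete, here is a minimal sketch of one gradient-descent update for a linear evaluation function. The names (`phi`, `target`, `alpha`) and the squared-error objective are illustrative assumptions, not taken from the paper; the training target would come from whatever signal the learning method supplies (e.g. a search-derived value).

```python
import numpy as np

def update_weights(w, phi, target, alpha=0.1):
    """One gradient-descent step on 0.5 * (w.phi - target)^2.

    w      -- weight vector of the linear evaluation function
    phi    -- feature vector of a position (hypothetical features)
    target -- training signal for this position
    alpha  -- learning rate
    """
    prediction = float(np.dot(w, phi))
    error = prediction - target
    # Gradient of 0.5 * error^2 with respect to w is error * phi.
    return w - alpha * error * phi

# Tiny usage example with 3 made-up features.
w = np.zeros(3)
phi = np.array([1.0, 0.5, -1.0])  # features of some position
for _ in range(100):
    w = update_weights(w, phi, target=0.25)
print(np.dot(w, phi))  # the prediction approaches the target
```

Swapping the linear form for an ANN only changes how the gradient of the prediction with respect to the parameters is computed; the update scheme itself stays the same.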
I at least thought this might be useful to someone. Honestly.
mcostalba wrote: Yes, I dare to say so, and I also dare to say it is more useful and effective. If the effort of parametrizing all the tuning parameters of a real engine scares people, I can say that for our auto-tuning we use a completely parametrized version of Stockfish where all the evaluation parameters (and not only those) are reachable and changeable as UCI options: many hundreds of parameters can be set and changed on the fly without restarting the engine. I would also add that in a real engine you have to proceed by parameter families, because it is naive to even think of optimizing all the values in one go.

Is this parametrized version of Stockfish available as open source? It sounds like it would be the perfect test-bed for some of the ideas in the paper...
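The mechanism mcostalba describes rests on the standard UCI `setoption` command: if evaluation weights are exposed as options, a tuner can push new values to a running engine between games with plain text commands. The sketch below only builds those command lines; the parameter names are invented for illustration, since a parametrized build would define its own.

```python
# Hypothetical sketch: generating UCI `setoption` commands so a tuner
# can change evaluation parameters on a running engine without
# restarting it. The parameter names are made up for illustration.

def setoption_commands(params):
    """Turn a {name: value} dict into UCI setoption command lines."""
    return ["setoption name {} value {}".format(name, value)
            for name, value in params.items()]

# One candidate point from a tuning run (values are illustrative).
candidate = {
    "PawnValueMg": 171,
    "KnightValueMg": 764,
    "BishopPairBonus": 45,
}

for cmd in setoption_commands(candidate):
    print(cmd)
# A tuner would write these lines to the engine's stdin, then start
# the next test game with the new parameter values in effect.
```

Tuning by parameter families, as suggested above, would simply mean restricting the dict to one family (say, the material values) per optimization run.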