I changed the Tucano network to a 768x512x1 architecture and was able to train it, so I will be releasing version 11 soon. Version 10 used a bigger architecture, similar to Stockfish's HalfKP. I decided to make the change so that I could write the training code and the network evaluation myself. It has been a good learning experience.
The only parameter I still don't understand is the sigmoid scale, which is used in the code below:
Code:
#include <math.h>

/* Logistic function applied to the scaled input. */
double tnn_sigmoid(double value)
{
    return 1.0 / (1.0 + exp(-value * SIGMOID_SCALE));
}
Code:
/* Derivative of tnn_sigmoid with respect to its input, from its output. */
double sigmoid_prime = output_sigmoid * (1.0 - output_sigmoid) * SIGMOID_SCALE;
I have seen different values used in other training code, and I read somewhere that the right value can depend on the training data itself.
I would appreciate it if someone could shed some light on this. Thanks!
Alcides.