Optimization algorithm for training a neural net
Moderator: Ras
-
- Posts: 219
- Joined: Fri Apr 11, 2014 10:45 am
- Full name: Fabio Gobbato
Optimization algorithm for training a neural net
So far for training the neural network of my engine I have used batch gradient descent. I've read that Adam gives faster convergence and lower error. Have you tried both and which one works better? Are there big differences between the two?
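For concreteness, a minimal sketch of the plain batch gradient descent update being discussed; grad_fn, lr and iterations are placeholder names for illustration, not anything from the engine's actual trainer:

```python
import numpy as np

def batch_gradient_descent(w, grad_fn, lr=0.01, iterations=1000):
    """Plain batch gradient descent: one fixed learning rate for every weight."""
    for _ in range(iterations):
        g = grad_fn(w)      # gradient of the loss over the full training batch
        w = w - lr * g      # step against the gradient
    return w
```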
-
- Posts: 1625
- Joined: Thu Jul 16, 2009 10:47 am
- Location: Almere, The Netherlands
Re: Optimization algorithm for training a neural net
Fabio Gobbato wrote: ↑Thu Aug 29, 2024 7:12 pm So far for training the neural network of my engine I have used batch gradient descent. I've read that Adam gives faster convergence and lower error. Have you tried both and which one works better? Are there big differences between the two?
With adaptive optimization methods like Adam you'll indeed need only a fraction of the number of iterations to reach convergence. That Adam also gives you a lower error is not true in general; the consensus is that plain gradient descent is a more 'stable' algorithm than adaptive methods. You could also add 'momentum' to gradient descent, which makes it converge faster as well.
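To make the comparison concrete, here is a rough sketch of gradient descent with momentum next to the Adam update; grad_fn and the hyperparameter defaults are only illustrative placeholders:

```python
import numpy as np

def sgd_momentum_step(w, v, grad_fn, lr=0.01, beta=0.9):
    """Gradient descent with momentum: a velocity term accumulates past gradients."""
    g = grad_fn(w)
    v = beta * v + g
    w = w - lr * v
    return w, v

def adam_step(w, m, v, t, grad_fn, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: per-weight step sizes from running first/second moment estimates."""
    g = grad_fn(w)
    m = beta1 * m + (1 - beta1) * g          # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * g * g      # second moment (mean of squared gradients)
    m_hat = m / (1 - beta1 ** t)             # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```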
Most of the time I use AdamW to train my neural networks; it's a slightly modified Adam that handles weight decay somewhat differently, and I'm happy with its performance.
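A rough sketch of the decoupled weight decay that distinguishes AdamW from plain Adam, under the same placeholder grad_fn and illustrative defaults as above:

```python
import numpy as np

def adamw_step(w, m, v, t, grad_fn, lr=0.001, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """AdamW: weight decay is applied directly to the weights ("decoupled"),
    not folded into the gradient as an L2 term."""
    g = grad_fn(w)                           # note: no "+ weight_decay * w" here
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * w)
    return w, m, v
```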