Showing the live results of the SGD optimizer training a neural network with momentum set to 0.5, a starting learning rate of 1.0, and a decay of 1e-3.
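For reference, these settings combine a learning rate that decays with the iteration count and a momentum term that retains a fraction of the previous update. The sketch below is a minimal, illustrative version of that update rule using the values from the caption (learning rate 1.0, decay 1e-3, momentum 0.5); the variable and function names are assumptions for this example, not the chapter's own code.

```python
import numpy as np

# Illustrative SGD step with learning-rate decay and momentum.
# Hyperparameters from the caption above; names here are assumptions.
learning_rate = 1.0   # starting learning rate
decay = 1e-3          # learning-rate decay factor
momentum = 0.5        # fraction of the previous update retained


def sgd_step(weights, weight_momentums, dweights, iteration):
    # Decay the learning rate as 1 / (1 + decay * iteration).
    current_lr = learning_rate * (1.0 / (1.0 + decay * iteration))
    # Blend the previous update (scaled by momentum) with the new gradient step.
    updates = momentum * weight_momentums - current_lr * dweights
    return weights + updates, updates


# Example: one step with made-up parameters and gradients.
weights = np.array([0.5, -0.3, 0.8])
weight_momentums = np.zeros_like(weights)
dweights = np.array([0.1, -0.2, 0.05])

weights, weight_momentums = sgd_step(weights, weight_momentums, dweights, iteration=0)
```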
Optimizers with live results:
Stochastic Gradient Descent:
Optimizer: SGD. Learning Rate: 1.0.
Optimizer: SGD. Learning Rate: 0.5.
Optimizer: SGD. Learning Rate: 1.0. Decay: 1e-2.
Optimizer: SGD. Learning Rate: 1.0. Decay: 1e-3.
Optimizer: SGD. Learning Rate: 1.0. Decay: 1e-3. Momentum: 0.5.
Optimizer: SGD. Learning Rate: 1.0. Decay: 1e-3. Momentum: 0.9.
AdaGrad:
Optimizer: AdaGrad. Decay: 1e-4.
RMSProp:
Optimizer: RMSProp. Decay: 1e-4.
Optimizer: RMSProp. Decay: 1e-5. Rho: 0.999.
Adam: