Showing the live results of the SGD optimizer training a neural network with momentum set to 0.5, a starting learning rate of 1.0, and a decay of 1e-3.
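For reference, these settings combine a learning rate that decays with the iteration count and a momentum term that retains a fraction of the previous update. The sketch below is a minimal, illustrative version of that update rule using the values from the caption (learning rate 1.0, decay 1e-3, momentum 0.5); the variable and function names are assumptions for this example, not the chapter's own code.

```python
import numpy as np

# Illustrative SGD step with learning-rate decay and momentum.
# Hyperparameters from the caption above; names here are assumptions.
learning_rate = 1.0   # starting learning rate
decay = 1e-3          # learning-rate decay factor
momentum = 0.5        # fraction of the previous update retained


def sgd_step(weights, weight_momentums, dweights, iteration):
    # Decay the learning rate as 1 / (1 + decay * iteration).
    current_lr = learning_rate * (1.0 / (1.0 + decay * iteration))
    # Blend the previous update (scaled by momentum) with the new gradient step.
    updates = momentum * weight_momentums - current_lr * dweights
    return weights + updates, updates


# Example: one step with made-up parameters and gradients.
weights = np.array([0.5, -0.3, 0.8])
weight_momentums = np.zeros_like(weights)
dweights = np.array([0.1, -0.2, 0.05])

weights, weight_momentums = sgd_step(weights, weight_momentums, dweights, iteration=0)
```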
Optimizers with live results:
Stochastic Gradient Descent:
Optimizer: SGD. Learning Rate: 1.0.
Optimizer: SGD. Learning Rate: 0.5.
Optimizer: SGD. Learning Rate: 1.0. Decay: 1e-2.
Optimizer: SGD. Learning Rate: 1.0. Decay: 1e-3.
Optimizer: SGD. Learning Rate: 1.0. Decay: 1e-3. Momentum: 0.5.
Optimizer: SGD. Learning Rate: 1.0. Decay: 1e-3. Momentum: 0.9.
AdaGrad:
Optimizer: AdaGrad. Decay: 1e-4.
RMSProp:
Optimizer: RMSProp. Decay: 1e-4.
Optimizer: RMSProp. Decay: 1e-5. Rho: 0.999.
Adam: