- Activation functions - ReLU, sigmoid, tanh, and softmax.
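The four activations listed above can be sketched with NumPy as follows. This is a minimal illustration of the standard formulas, not necessarily the repository's own implementation; the softmax uses the usual max-subtraction trick for numerical stability.

```python
import numpy as np

def relu(x):
    # ReLU: max(0, x), applied element-wise
    return np.maximum(0.0, x)

def sigmoid(x):
    # Logistic sigmoid: 1 / (1 + e^(-x)), bounded in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Hyperbolic tangent, bounded in (-1, 1)
    return np.tanh(x)

def softmax(x):
    # Numerically stable softmax: subtract the row max before exponentiating,
    # then normalize so each row sums to 1
    shifted = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=-1, keepdims=True)
```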
- Optimization algorithms - gradient descent, gradient descent with momentum, NAG (Nesterov accelerated gradient), AdaGrad, RMSProp, and Adam.
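To show how these optimizers differ, here is a sketch of three of the update rules (plain gradient descent, momentum, and Adam) as single-step functions. This follows the textbook formulas under assumed default hyperparameters and is not taken from the repository's code.

```python
import numpy as np

def sgd_step(w, grad, lr=0.01):
    # Plain gradient descent: w <- w - lr * grad
    return w - lr * grad

def momentum_step(w, grad, v, lr=0.01, beta=0.9):
    # Momentum: accumulate a velocity v and step along it
    v = beta * v + grad
    return w - lr * v, v

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: exponential moving averages of the gradient (m) and its
    # square (v), with bias correction for the first t steps
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```

NAG, AdaGrad, and RMSProp follow the same pattern: each keeps one or two running statistics per parameter and modifies either where the gradient is evaluated (NAG) or how the step size is scaled (AdaGrad, RMSProp).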
The `test.ipynb` notebook shows results on the Fashion-MNIST dataset using different configurations; please refer to it.