- Authors: Haoran Pu, Yucen Sun.

This project is a comprehensive study of first-order and second-order optimizers, in both theoretical and practical terms.
- First Order Gradient Descent Analysis
- AdaHessian Implementation
- ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
- The Space Complexity of Approximating the Frequency Moments
- `DLsys_final_project_report.pdf`: Our final report.
- `ada_hessian.py`: Our AdaHessian implementation used for the experiments.
- `experiment-torch.ipynb`: Our experiment code; run each block one by one to reproduce the results. (Note: it is extremely slow.)
- `data`: The experiment data from our runs. `cv_new` is used for the presentation and the final analysis, while `cv_old` demonstrates that the PyTorch default implementation has an accuracy issue on CIFAR-10. Inside the directory, `data_analysis.ipynb` contains the code for producing the visualizations.
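For context on what `ada_hessian.py` is built around, below is a minimal sketch of Hutchinson's estimator for the Hessian diagonal, the core ingredient of AdaHessian. This is not the code in this repository; the function name and signature are illustrative assumptions.

```python
import torch

def hutchinson_diag_hessian(loss, params, n_samples=1):
    """Sketch: estimate diag(H) of `loss` w.r.t. `params` with Hutchinson's method."""
    # First backward pass: keep the graph so we can differentiate the gradients again.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    diag_est = [torch.zeros_like(p) for p in params]
    for _ in range(n_samples):
        # Rademacher probe vectors z with entries in {-1, +1}.
        zs = [torch.randint_like(p, 2) * 2 - 1 for p in params]
        # Second backward pass: Hessian-vector products H z.
        hvs = torch.autograd.grad(grads, params, grad_outputs=zs, retain_graph=True)
        # E[z * (H z)] equals the Hessian diagonal, so average z * (H z) over samples.
        for d, z, hv in zip(diag_est, zs, hvs):
            d.add_(z * hv, alpha=1.0 / n_samples)
    return diag_est
```

In AdaHessian, an estimate of this kind (with spatial averaging and an exponential moving average) takes the place of Adam's squared-gradient term in the preconditioner.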