Deduplicate solver regularization, logging, and local rates and decays #2518
I just looked over this -- looks great! This refactoring was much needed. Thanks @cypof and @shelhamer. Only issue I see is I'm not sure about the verb in |
Actually, I guess it is replacing |
No, the net_->Update() call needs to be in, so that it is not executed by solvers that are not of type SGDSolver. This way, only the root solver in a parallel setup will apply the update. That was actually the race I introduced when I split the big PR into small ones. Another name could be ApplyGradients? |
Okay, I see -- thanks for the explanation @cypof. I like |
Designate `Solver::ApplyUpdate()` as the core method to compute and apply parameter updates given the current state of the Net. Make `Solver::ComputeUpdateValue()` a subordinate call overloaded by the `SGDSolver`s to take care of optimization algorithm details.
Thanks for the comments @cypof and @jeffdonahue. I went with |
P.S. Ignore the Travis push check -- the Travis PR check is the one to heed. The push check was triggered by my accidental push to BVLC/caffe, and it then failed because I deleted the branch. |
Cool, LGTM |
This simplifies the solver code by de-duplicating shared logic.
Solver::Iteration() to Solver::MakeUpdate()
to verb.
I plan to merge this shortly to make way for an updated #1977.