RMSprop clean up and rebase #2867

Merged
merged 1 commit into from
Aug 9, 2015

ronghanghu
Member

Rebased the RMSprop implementation from #1890 and adapted it to the new solver interface (#2518 and #1977). The original author is @erogol. This PR is opened against master instead of dev.

The RMSprop solver is based on G. Hinton's lecture (http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf). Each parameter's gradient is divided by the root of a running mean of its squared gradients over recent batches. It can be seen as a mini-batch version of rprop, which uses only the sign of the gradients.

Update rule:

MeanSquare(t) = rms_decay * MeanSquare(t-1) + (1 - rms_decay) * gradient(t)^2
param_update(t) = gradient(t) / (sqrt(MeanSquare(t)) + delta)

Momentum is not supported for RMSprop solver, as in #1890.
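
A minimal sketch of the update rule above on flat arrays, assuming plain C++ with `double` values. `RmsPropUpdate` and its argument layout are hypothetical and are not the code in this PR; any learning-rate scaling or regularization is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Caffe's RMSPropSolver): applies the update rule
// quoted above element by element.
// On entry mean_square holds MeanSquare(t-1); on return it holds MeanSquare(t).
// The returned vector is param_update(t).
std::vector<double> RmsPropUpdate(const std::vector<double>& gradient,
                                  std::vector<double>& mean_square,
                                  double rms_decay, double delta) {
  assert(gradient.size() == mean_square.size());
  std::vector<double> update(gradient.size());
  for (std::size_t i = 0; i < gradient.size(); ++i) {
    const double g = gradient[i];
    // MeanSquare(t) = rms_decay * MeanSquare(t-1) + (1 - rms_decay) * g^2
    mean_square[i] = rms_decay * mean_square[i] + (1.0 - rms_decay) * g * g;
    // param_update(t) = g / (sqrt(MeanSquare(t)) + delta)
    update[i] = g / (std::sqrt(mean_square[i]) + delta);
  }
  return update;
}
```

With `rms_decay = 0` and `delta = 0`, each update reduces to the sign of its nonzero gradient, which matches the "sign of gradients" remark in the description.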

@erogol
Contributor

erogol commented Aug 7, 2015

thanks for handling this :)

@@ -521,7 +531,7 @@ TYPED_TEST(NesterovSolverTest, TestNesterovLeastSquaresUpdateWithMomentum) {
   const Dtype kMomentum = 0.5;
   const int kNumIters = 1;
   for (int i = 0; i <= kNumIters; ++i) {
-    this->TestLeastSquaresUpdate(kLearningRate, kWeightDecay, kMomentum, i);
+    this->TestLeastSquaresUpdate(kLearningRate, kWeightDecay, kMomentum, 0., i);
Contributor

These should be declared as constants (e.g. const Dtype kRMSDecay = 0) like the other args to make the meaning clear.
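
A rough sketch of what this suggestion looks like at a call site. Everything here is a stand-in: the free function `TestLeastSquaresUpdate`, the `UpdateArgs` record, and the values of `kLearningRate` and `kWeightDecay` are placeholders for illustration, not the test fixture in this PR.

```cpp
#include <cassert>
#include <vector>

typedef float Dtype;  // stand-in for the test's template parameter

// Hypothetical record of one call, so the effect of the arguments is visible.
struct UpdateArgs {
  Dtype learning_rate;
  Dtype weight_decay;
  Dtype momentum;
  Dtype rms_decay;
  int iter;
};

std::vector<UpdateArgs> g_calls;

// Hypothetical stand-in for this->TestLeastSquaresUpdate(...).
void TestLeastSquaresUpdate(Dtype learning_rate, Dtype weight_decay,
                            Dtype momentum, Dtype rms_decay, int iter) {
  assert(iter >= 0);
  UpdateArgs args = {learning_rate, weight_decay, momentum, rms_decay, iter};
  g_calls.push_back(args);
}

void RunNesterovCase() {
  const Dtype kLearningRate = 0.01;  // placeholder value
  const Dtype kWeightDecay = 0.5;    // placeholder value
  const Dtype kMomentum = 0.5;
  const Dtype kRMSDecay = 0;         // named instead of a bare 0. literal
  const int kNumIters = 1;
  for (int i = 0; i <= kNumIters; ++i) {
    TestLeastSquaresUpdate(kLearningRate, kWeightDecay, kMomentum,
                           kRMSDecay, i);
  }
}
```

Naming the constant makes it clear at the call site which positional argument is the RMS decay.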

@jeffdonahue
Contributor

Thanks @erogol for the original work and thanks @ronghanghu for the rebase. This looks good except as noted above.

@ronghanghu
Member Author

@jeffdonahue OK, I'll handle them. Thanks for the comments!

ronghanghu force-pushed the rms-prop branch 2 times, most recently from 3e8ab30 to fbd0533 on August 7, 2015 19:46
@ronghanghu
Member Author

Fixed those issues. I expect this PR to be merged after #2856 and #2782.

@jeffdonahue
Contributor

Cool, LGTM. @ronghanghu feel free to merge whenever it's easiest for you, before or after the other two PRs.

ronghanghu force-pushed the rms-prop branch 2 times, most recently from 7f61b86 to abe99e8 on August 9, 2015 06:44
Implement the RMSProp solver, cleaned up to fit the new solver interface that uses
accumulated gradients and refactored regularization.
@ronghanghu
Member Author

Took a further rebase on #2866. Authorship preserved for @erogol in commit

Ready to merge.

ronghanghu added a commit that referenced this pull request Aug 9, 2015
ronghanghu merged commit 698fc76 into BVLC:master on Aug 9, 2015
ronghanghu deleted the rms-prop branch on August 9, 2015 07:37

 protected:
  virtual void InitSolver(const SolverParameter& param) {
    this->solver_.reset(new RMSPropSolver<Dtype>(param));
Member

Could you set the RMS decay here, instead of introducing the decay argument to least squares and snapshotting tests? Since it is unique to this solver I think it is best handled here.

@shelhamer
Member

@ronghanghu Sorry I didn't catch this earlier, but I have a suggestion for the RMS decay parameter in the tests. Instead of introducing another argument and setting it for every test, this param could be set by the RMSProp test class for encapsulation. Could you send a follow-up PR to make this change?

ronghanghu restored the rms-prop branch on August 9, 2015 07:55
@ronghanghu
Member Author

@shelhamer Yes, I can send another PR to do that. The Adam solver is also going to introduce a momentum2 parameter, which can be handled in the same way (set in InitSolver()).
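
The encapsulation discussed here could look roughly like the sketch below. All types are simplified stand-ins written for this illustration (a plain struct instead of the real `SolverParameter`, non-templated solver classes, a hypothetical `SolverTestBase` with a `RunLeastSquares` entry point), and the decay value 0.95 is only an example, so this is not the code from the follow-up PR.

```cpp
#include <cassert>
#include <memory>

// Stand-in for the real solver configuration; only the field needed here.
struct SolverParameter {
  double rms_decay = 0.0;
};

class Solver {
 public:
  explicit Solver(const SolverParameter& param) : param_(param) {}
  virtual ~Solver() {}
  const SolverParameter& param() const { return param_; }

 private:
  SolverParameter param_;
};

class RMSPropSolver : public Solver {
 public:
  explicit RMSPropSolver(const SolverParameter& param) : Solver(param) {
    assert(param.rms_decay >= 0.0 && param.rms_decay < 1.0);
  }
};

// Hypothetical shared test base: the generic tests know nothing about
// rms_decay and only call InitSolver() with the shared parameters.
class SolverTestBase {
 public:
  virtual ~SolverTestBase() {}
  void RunLeastSquares(const SolverParameter& shared) { InitSolver(shared); }
  const Solver& solver() const { return *solver_; }

 protected:
  virtual void InitSolver(const SolverParameter& param) = 0;
  std::unique_ptr<Solver> solver_;
};

class RMSPropSolverTest : public SolverTestBase {
 protected:
  virtual void InitSolver(const SolverParameter& param) {
    SolverParameter rms_param = param;  // copy, caller's params stay untouched
    rms_param.rms_decay = 0.95;         // example value, set in one place
    solver_.reset(new RMSPropSolver(rms_param));
  }
};
```

The shared least-squares and snapshot tests then keep their original signatures, and a solver with its own extra parameter (such as the momentum2 mentioned above) can follow the same pattern in its own test class.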

Addressed in #2888.


4 participants