
Load weights from multiple caffemodels. #1456

Conversation

jyegerlehner
Contributor

At least one use case requiring this is doing layerwise or "stacked" autoencoder training: First I train the newly-added encoder and decoder layers by themselves (using features extracted from the net having only the previously-trained layers). Then when I begin to train the combined network, it needs to pull weights from two different caffemodel files. So this change allows the --weights parameter to be a comma-separated list of caffemodels instead of just a single caffemodel.

The other code change is that the test nets are also initialized from the provided caffemodels, not just the train net. So if the trained net is a subset of the test net, then some of the test nets' layers' weights would be uninitialized, whereas with this change they are initialized from the specified models.
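With this change, the flag would be invoked like this (the model filenames here are illustrative, not from the patch):

```shell
# Initialize both the train net and the test nets from two pretrained
# models; layers are matched by name across the listed caffemodels.
caffe train --solver solver.prototxt \
    --weights pretrained_encoder.caffemodel,new_decoder.caffemodel
```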

@shelhamer
Member

This is a helpful generalization for certain uses. Note that this can currently be done by loading several models in Python and assigning weights between them, as in net surgery, but making the caffe tool understand multi-model weight loading could be a nice convenience.

The level and stage rules for layer inclusion / exclusion are helpful for layer-wise learning and model variations too.

@shelhamer
Member

With the advances of pycaffe one can copy weights from several models by Net.copy_from.

I like preparing the nets through Python for its generality, but copying weights from multiple nets could be a useful special case. However, I'm inclined to keep the caffe tool simple -- thoughts @longjon @jeffdonahue?
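The effect of chaining `Net.copy_from` once per model can be sketched without caffe itself (plain Python; layer parameters are modeled as a dict and the helper name is hypothetical):

```python
def copy_from_many(net_params, *model_params):
    """Mimic calling Net.copy_from once per pretrained model:
    each model overwrites the layers it shares with the net, applied
    in order, so later models take precedence on name collisions."""
    for model in model_params:
        for layer_name, blobs in model.items():
            # Only layers whose names match the target net are copied;
            # layers absent from the net are silently ignored.
            if layer_name in net_params:
                net_params[layer_name] = blobs
    return net_params
```

This is just the name-matching semantics; real `copy_from` copies blob data layer by layer rather than rebinding dict entries.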

@jeffdonahue
Contributor

I think it's useful and non-intrusive. I'm not a huge fan of the interface (commas in a flag argument) but I can't think of anything better (gflags doesn't let you specify the same flag multiple times and give you a vector<string>, does it?). I did this at one point by adding a repeated weights field to SolverParameter, but SolverParameter isn't a very good place for it...
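Since gflags delivers the flag as a single string, the comma-splitting itself is trivial; a plain-Python stand-in for the C++ flag handling (function name illustrative) might look like:

```python
def parse_weights_flag(flag_value):
    """Split a comma-separated --weights value into individual
    caffemodel paths, dropping empty entries left by stray commas."""
    return [path for path in flag_value.split(",") if path]
```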

shelhamer added a commit that referenced this pull request Mar 8, 2015
…-caffemodels

Load weights from multiple models by listing comma separated caffemodels
as the `-weights` arg to the caffe command.
@shelhamer
Member

@jyegerlehner thanks for the convenient multi-model fine-tuning initialization. I merged this to master in a9bf7b9 (and collapsed this to a single commit).
