
Siamese Networks / Distance Learning / Transfer Learning #697

Closed
wants to merge 2 commits

Conversation

@zayd commented Jul 15, 2014

Hi,

I am working on implementing a siamese network in caffe. The general pipeline for training I am thinking of right now is:

  1. train a network on an N-way classification task
  2. (a) remove the top 1 or 2 layers from the network
     (b) create a copy of the network (using shared weights)
     (c) add a binary output distance layer that measures the distance between the two outputs

(2a) is where I believe the first change in Caffe needs to be made; that is, reusing the representation learned by one deep network for another task by replacing the top 1 or 2 layers.

I put together a small hack that allows this in Caffe by loading a state file of a trained network and passing an optional int remove_from_top to Solver::Restore and Net::CopyTrainedLayersFrom. This changes the behavior to load the state of only the first (total - remove_from_top) layers of the network; the remaining layers specified in the new network's .prototxt file initialize normally, because they are initialized before loading from the state.
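As a toy illustration of the intended behavior (a sketch only, not the actual diff in this PR; the struct and names below are made up):

```cpp
#include <iostream>
#include <string>
#include <vector>

// Toy stand-in for the idea: copy saved weights into the first
// (total - remove_from_top) layers only; the rest keep their fresh
// initialization from the new net's .prototxt.
struct Layer {
  std::string name;
  bool loaded_from_snapshot = false;
};

void CopyTrainedLayers(std::vector<Layer>& net, int remove_from_top) {
  const int num_to_copy = static_cast<int>(net.size()) - remove_from_top;
  for (int i = 0; i < num_to_copy; ++i) {
    net[i].loaded_from_snapshot = true;  // stands in for copying the blobs
  }
}

int main() {
  std::vector<Layer> net = {{"conv1"}, {"conv2"}, {"ip1"}, {"loss"}};
  CopyTrainedLayers(net, /*remove_from_top=*/2);
  for (const auto& layer : net) {
    std::cout << layer.name
              << (layer.loaded_from_snapshot ? ": loaded" : ": fresh") << "\n";
  }
  return 0;
}
```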

Do you have any suggestions, or another preferred approach for how to tackle this?

@shelhamer (Member)

Hey Zayd, nice to see you on the repo!

Caffe actually already understands how to do the initialization you're after for siamese networks. The documentation on finetuning is sadly lacking (we're working on a tutorial example), but in short: Caffe loads weights by matching the layer names in the prototxt definition (the model file) against those in the saved binary proto weights (the pretrained weights file).

The steps will look like this:

  1. define and train a network for an N-way classification task
  2. define the siamese network: a duplicate of the net from (1), minus whatever layers you do not want, with a distance loss layer on top of the two branches.
  3. train the siamese network by calling finetune_net.bin on the definition from (2) and the weights from (1). The layers carried over from (1) will have their weights copied, new layers will be initialized as you defined them, and any layers with the same "param" field name will share weights (see the sketch below).
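A minimal sketch of what the definition in (2) could look like, in the prototxt format of the time (layer, blob, and param names here are illustrative, not from a real example):

```
# Two tied branches: identical "param" names mean shared weights.
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data_a"
  top: "feat_a"
  param: "conv1_w"
  param: "conv1_b"
  convolution_param { num_output: 20 kernel_size: 5 }
}
layers {
  name: "conv1_p"
  type: CONVOLUTION
  bottom: "data_b"
  top: "feat_b"
  param: "conv1_w"   # same names as above => weights are shared
  param: "conv1_b"
  convolution_param { num_output: 20 kernel_size: 5 }
}
# ... further tied layers, then a distance-based loss over feat_a / feat_b ...
```

Note that the weights are copied in by layer name, so keeping one branch's layer names identical to the net from (1) (here "conv1") is what gets the pretrained weights carried over; the param sharing then propagates them to the sibling branch.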

If you could document your work on this, at least for an elementary version of a siamese network, it would be an excellent example to include in Caffe! (I know there is interest from previous questions.)

So, long story short, siamese networks do not need a code change to Caffe. Please follow up if I have missed anything, or if your change provides some useful convenience over steps 1-3 as described.

@zayd (Author) commented Jul 15, 2014

Hi Evan, thanks for the response! I will take a look at finetune_net.

On a related note, my understanding is that it is not possible to specify two input sources for a network with the existing framework. So for a siamese network, it wouldn't be possible to have two separate input layers (one that loads a.jpg and another that loads b.jpg). Is this correct? Would you suggest creating a layer (like image_data_layer) that loads up a pair of images?

@sguada (Contributor) commented Jul 15, 2014

Actually, yes, you can have 2 or more image_data_layers; I have used them for other tasks and they work well. Although you may want a specific way of pairing the images.
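For example (a sketch; the list files and names are made up, and line i of one list must correspond to line i of the other, so don't shuffle):

```
layers {
  name: "data_a"
  type: IMAGE_DATA
  top: "data_a"
  top: "label"       # pair label, read from pairs_a.txt
  image_data_param { source: "pairs_a.txt" batch_size: 32 }
}
layers {
  name: "data_b"
  type: IMAGE_DATA
  top: "data_b"
  top: "label_dup"   # duplicate of the label; can simply go unused
  image_data_param { source: "pairs_b.txt" batch_size: 32 }
}
```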

Sergio


@shelhamer (Member)

To follow up in generality: Caffe understands arbitrary DAG models. You can have multiple inputs, different outputs, forking paths, and so on.
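For instance, a fork is nothing more than one blob feeding several layers (sketch, made-up names):

```
# "feat" fans out to two heads, so the graph forks here.
layers {
  name: "head_cls"
  type: INNER_PRODUCT
  bottom: "feat"
  top: "cls_out"
  inner_product_param { num_output: 10 }
}
layers {
  name: "head_aux"
  type: INNER_PRODUCT
  bottom: "feat"
  top: "aux_out"
  inner_product_param { num_output: 2 }
}
```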

@shelhamer changed the title from "Using the representation learned by one task for another task (siamese network)" to "Siamese Networks / Distance Learning / Transfer Learning" on Jul 18, 2014
@shelhamer (Member)

Closing; this was a good question but not a PR.

A siamese network example in Caffe once you're done would be a nice PR!

@shelhamer closed this on Jul 30, 2014
@wendlerc commented Aug 1, 2014

@shelhamer how would you generate the leveldb when you want to, e.g., assign one label to a pair of input images?
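One possible approach (a sketch, not an official recipe): pack both images of a pair into a single two-channel Datum, set the pair label on that Datum, and split the channels back into the two branches inside the net (e.g. with a slice layer). The helper below and its names are hypothetical:

```cpp
// Sketch only: write one pair (A, B, pair_label) as a single two-channel
// Datum into leveldb. `pixels_a` and `pixels_b` are the raw grayscale bytes
// of the two images (rows * cols each).
#include <string>
#include "leveldb/db.h"
#include "caffe/proto/caffe.pb.h"

void WritePair(leveldb::DB* db, const std::string& key,
               const std::string& pixels_a, const std::string& pixels_b,
               int rows, int cols, int pair_label) {
  caffe::Datum datum;
  datum.set_channels(2);                // image A in channel 0, image B in channel 1
  datum.set_height(rows);
  datum.set_width(cols);
  datum.set_data(pixels_a + pixels_b);  // raw bytes, A then B
  datum.set_label(pair_label);          // e.g. 1 = matching pair, 0 = non-matching
  std::string value;
  datum.SerializeToString(&value);
  db->Put(leveldb::WriteOptions(), key, value);
}
```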

@ashafaei (Contributor) commented Aug 9, 2014

@shelhamer, even though we have parameter sharing (#546) and Eltwise operations, from what I understand there are still a couple of building blocks missing. We need at least an Abs() operation, and if we wish to follow [1] we should also define a new LossLayer. DeepFace, though, suggests using a cross-entropy loss after a layer that takes a linear combination of absolute differences (similar to #639).

Could you verify this and let me know whether these are the remaining pieces to be added? If that's the case, I'm willing to roll up my sleeves to finish it and prepare an example.

[1] S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to face verification. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), volume 1, pages 539–546, 2005. http://yann.lecun.com/exdb/publis/pdf/chopra-05.pdf
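For reference, the margin-based contrastive loss from this line of work (Hadsell, Chopra & LeCun, CVPR 2006, the follow-up to [1]), for a pair at distance d with margin m and label y = 1 for a matching pair, is:

```latex
L(y, d) = \tfrac{1}{2}\, y\, d^{2} + \tfrac{1}{2}\,(1 - y)\, \max(0,\, m - d)^{2}
```

Matching pairs are pulled together quadratically, while non-matching pairs are pushed apart only while they sit inside the margin.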

@cheer37 commented Mar 11, 2016

@shelhamer
I am investigating siamese networks and saw the siamese example for MNIST.
I wonder how the two nets share weights there.
Thanks
