Add BiasLayer to add two Blobs with broadcasting #3550

Closed
wants to merge 7 commits into from

jeffdonahue
Contributor

This adds BiasLayer, designed analogously to ScalarLayer (#3021), to add blobs with arbitrary axes broadcast. This could be used together with ScalarLayer to learn the batch norm scale and shift parameters. It could also be used independently anywhere in a network to learn a bias without a corresponding multiplication. And even more generally, it could be used to efficiently add two blobs with any number of corresponding axes, which can currently only be accomplished (in the most general case) rather inefficiently: with a pair of Reshapes and Tiles (to broadcast the leading and trailing axes) followed by an Eltwise SUM operation.
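To make the broadcasting concrete, here is a minimal pure-Python sketch of the forward computation described above. It is not the code from this PR. The function name `bias_forward`, the flat row-major list layout, and the `axis` argument are assumptions made for illustration. The bias covers a contiguous run of the data blob's axes starting at `axis`, and is broadcast over every axis before and after that run.

```python
from math import prod


def bias_forward(data, data_shape, bias, bias_shape, axis):
    """Add `bias` to `data`, broadcasting over leading and trailing axes.

    Hypothetical helper for illustration. Both blobs are flat row-major
    lists, and `bias_shape` must equal
    data_shape[axis : axis + len(bias_shape)].
    """
    end = axis + len(bias_shape)
    if list(data_shape[axis:end]) != list(bias_shape):
        raise ValueError("bias shape must match the data axes starting at `axis`")

    outer = prod(data_shape[:axis])  # leading axes the bias is broadcast over
    bias_dim = prod(bias_shape)      # axes shared by both blobs
    inner = prod(data_shape[end:])   # trailing axes the bias is broadcast over

    out = list(data)
    i = 0
    for _ in range(outer):
        for b in range(bias_dim):
            for _ in range(inner):
                out[i] += bias[b]  # same bias value for the whole inner run
                i += 1
    return out


# A 2x3x2 blob with a per-channel bias of shape (3,) applied at axis 1.
result = bias_forward(list(range(12)), (2, 3, 2), [10, 20, 30], (3,), 1)
```

A single pass over the output is enough here, whereas the Reshape, Tile, and Eltwise SUM route first materializes a tiled copy of the bias at the full size of the data blob.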

This is currently based on ScalarLayer (for caffe.proto ID sequencing), with the last two commits being the relevant ones -- I'm happy to rebase this without ScalarLayer if we want to merge this before or without that.

Both this and ScalarLayer can either take two bottoms, specifying both inputs to the function, or take a single bottom and learn the second as a parameter.
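When the second operand is learned as a parameter, its gradient has to undo the broadcast: each bias element accumulates the gradient of every output position it was added to. The sketch below illustrates that idea only. The name `bias_backward` and the flat row-major layout are assumptions, and it is not taken from this PR's implementation.

```python
from math import prod


def bias_backward(top_diff, data_shape, bias_shape, axis):
    """Return (bottom_diff, bias_diff) for a broadcast add.

    Hypothetical helper for illustration. `top_diff` is a flat row-major
    list with shape `data_shape`; the bias covers
    data_shape[axis : axis + len(bias_shape)].
    """
    end = axis + len(bias_shape)
    outer = prod(data_shape[:axis])  # leading broadcast axes
    bias_dim = prod(bias_shape)      # axes shared with the bias
    inner = prod(data_shape[end:])   # trailing broadcast axes

    bias_diff = [0.0] * bias_dim
    i = 0
    for _ in range(outer):
        for b in range(bias_dim):
            for _ in range(inner):
                # Sum the gradient over every position this bias element reached.
                bias_diff[b] += top_diff[i]
                i += 1

    # Addition passes the gradient through to the data input unchanged.
    return list(top_diff), bias_diff


top_diff = [float(v) for v in range(12)]
bottom_diff, bias_diff = bias_backward(top_diff, (2, 3, 2), (3,), 1)
```

In the two-bottom case the same reduction would give the gradient for the second input blob instead of for a parameter.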

A different approach for learning the BN scale/shift parameters that I haven't looked at yet is in #2996 (by @ducha-aiki), which learns both sets of parameters together. @cdoersch and I and anyone else interested (possibly @longjon and @shelhamer) should take a look at both and evaluate the benefits, with merge priority for any shared functionality given to @ducha-aiki's #2996 as the earlier PR.

Personally I do like the approach of having layers do as little as possible, which is why for my own work I've taken the approach of using two independent layers.

@jeffdonahue jeffdonahue changed the title Add BiasLayer to multiply two Blobs with broadcasting Add BiasLayer to add two Blobs with broadcasting Jan 13, 2016
@cdoersch
Contributor

@longjon @shelhamer @jeffdonahue the discussion of the various options for channelwise affine operations has been happening in #3229. It may be good to have all the discussion in one place.

@jeffdonahue
Contributor Author

Replaced by #3591

2 participants