Add ScalarLayer to multiply two Blobs with broadcasting #3021
This adds `ScalarLayer`, which takes two Blobs and (in effect) multiplies them elementwise, after broadcasting the axes of the second Blob to match the first as necessary.

For example, if `bottom[0]` has shape `(2, 3, 4, 5)`, `bottom[1]` has shape `(3, 4)`, and `axis == 1`, then the computation of this layer is equivalent to reshaping `bottom[1]` to `(1, 3, 4, 1)`, tiling it to `(2, 3, 4, 5)`, and then multiplying the result elementwise with `bottom[0]`.
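A minimal NumPy sketch (not the layer's actual C++ code) of the forward broadcasting described above, using the example shapes from this description:

```python
import numpy as np

x = np.random.randn(2, 3, 4, 5)   # bottom[0]
s = np.random.randn(3, 4)         # bottom[1], broadcast starting at axis == 1

# Reshape bottom[1] to (1, 3, 4, 1), tile it to (2, 3, 4, 5),
# then multiply the result elementwise with bottom[0].
tiled = np.tile(s.reshape(1, 3, 4, 1), (2, 1, 1, 5))
top = x * tiled

# NumPy's own broadcasting computes the same thing without
# materializing the tiled array.
assert np.allclose(top, x * s.reshape(1, 3, 4, 1))
```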
In the most general case, `Backward` to `bottom[1]` is accomplished with elementwise multiplication followed by 2 `gemv`s. For special cases (when `bottom[1]`'s shape corresponds to the beginning or end of `bottom[0]`'s shape, e.g. if it were instead shape `(2, 3)` and `axis == 0`, or shape `(4, 5)` with `axis == 2`), one or both of the `gemv`s is skipped (or replaced with a dot product).
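A minimal NumPy sketch (again, not the actual `Backward` implementation) of the gradient with respect to `bottom[1]` in the general case, reusing the shapes above; the two axis sums are what the two `gemv` calls against a vector of ones compute:

```python
import numpy as np

x = np.random.randn(2, 3, 4, 5)         # bottom[0]
s = np.random.randn(3, 4)               # bottom[1]
top_diff = np.random.randn(2, 3, 4, 5)  # gradient flowing in from the top

# Elementwise multiplication ...
prod = top_diff * x
# ... followed by summing out the broadcast axes: the leading axis (0) and
# the trailing axis (3). These two reductions correspond to the two gemvs.
s_diff = prod.sum(axis=(0, 3))          # shape (3, 4), matches bottom[1]

# If bottom[1] instead had shape (2, 3) with axis == 0, only the trailing
# reduction is needed; with shape (4, 5) and axis == 2, only the leading one.
assert s_diff.shape == s.shape
```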
My use case for this comes from #2033 -- I am replacing the hacky `coeff_blob` I added to `Eltwise` to perform the binary multiplications with this layer. It could also replace the channel-wise scalar in `PReLU` (I think this backward implementation is faster), or be used to learn a channel-wise scalar after batch normalization.

Thanks to @longjon for the name for this layer and the initial implementation of a previous version.