Channel softmax #940
Conversation
 }
 }

 template <typename Dtype>
-__global__ void kernel_exp(const int num, const Dtype* data, Dtype* out) {
-  CUDA_KERNEL_LOOP(index, num) {
+__global__ void kernel_exp(const int count, const Dtype* data, Dtype* out) {
This kernel is the same as caffe_gpu_exp, isn't it? Let's remove it and replace it with caffe_gpu_exp, unless I'm misunderstanding somehow. (I know it wasn't added in this PR, but I just noticed it from seeing the diff.)
I can't find caffe_gpu_exp. I only found caffe_exp, which calls vsExp in MKL.
Whoops, my bad. I think I was thinking of caffe_gpu_powx. caffe_gpu_exp should probably exist, but device abstraction (#610) will probably take care of this, so never mind, sorry!
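If caffe_gpu_exp did exist, it would presumably just launch an element-wise kernel like the kernel_exp quoted above. A minimal sketch, assuming Caffe's existing CUDA launch helpers (CUDA_KERNEL_LOOP, CAFFE_GET_BLOCKS, CAFFE_CUDA_NUM_THREADS); the wrapper itself is hypothetical and not part of this PR:

// Hypothetical caffe_gpu_exp: element-wise exp over count values,
// in the style of the kernel_exp quoted above.
#include "caffe/common.hpp"  // header placement of the launch macros is assumed

namespace caffe {

template <typename Dtype>
__global__ void exp_kernel(const int count, const Dtype* data, Dtype* out) {
  CUDA_KERNEL_LOOP(index, count) {
    out[index] = exp(data[index]);
  }
}

template <typename Dtype>
void caffe_gpu_exp(const int count, const Dtype* x, Dtype* y) {
  // NOLINT_NEXT_LINE(whitespace/operators)
  exp_kernel<Dtype><<<CAFFE_GET_BLOCKS(count), CAFFE_CUDA_NUM_THREADS>>>(
      count, x, y);
}

template void caffe_gpu_exp<float>(const int, const float*, float*);
template void caffe_gpu_exp<double>(const int, const double*, double*);

}  // namespace caffe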
Assigning to @longjon; go ahead and merge when you're happy with everything.
@shelhamer suggested offline adding a switch to provide the original "normalize over everything" mode. So, @shelhamer, if you still want to do that, you can append to or rewrite this PR. @shelhamer and others, which mode do we think should be the default? It seems like the channel normalization is usually what is desired, and I doubt anyone is relying on the current behavior, although it is a little jarring to change what layers do. If we do want the default to be the channel normalization, we could go ahead and merge this and add a switch in a later PR.
    caffe_cpu_gemv<Dtype>(CblasTrans, channels, spatial_dim, 1,
        bottom_diff + i * dim, sum_multiplier_.cpu_data(), 0, scale_data);
    // restore the original top_diff in bottom_diff for subtraction
    caffe_copy(dim, top_diff + i * dim, bottom_diff + i * dim);
Also note that the updated SoftmaxLayer CPU implementation no longer allows in-place computation: in the CPU implementation bottom_diff is first changed and then restored, as in
caffe_mul(top[0]->count(), bottom_diff, top_data, bottom_diff);
while the GPU implementation still allows in-place computation.
@jeffdonahue @shelhamer, should we allow in-place computation in SoftmaxLayer?
Yeah, good catch. I did this to avoid an extra loop, but I've added it back now to allow in-place computation. There should be no performance regression in the 1x1 case, and probably not a noticeable one in the general case, and anyway the GPU implementation is available.
In order to do this, I had to add functions to math_functions for strided dot products (which cblas already supports, of course, but which we didn't previously have an interface for).
@longjon merge as you please, as the switch can follow. I agree channel is the right default.
This provides a more direct interface to the cblas_?dot functions. This is useful, for example, for taking dot products across channels.
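A minimal sketch of such a strided dot wrapper, assuming it lives in math_functions under a name like caffe_cpu_strided_dot (the exact name and placement are inferred from this discussion, so treat them as illustrative):

// Strided dot product wrapper over cblas_?dot. The stride arguments are what
// make it usable across channels: consecutive channel values of one spatial
// position sit spatial_dim elements apart in an N x C x H x W blob.
#include <cblas.h>

template <typename Dtype>
Dtype caffe_cpu_strided_dot(const int n, const Dtype* x, const int incx,
                            const Dtype* y, const int incy);

template <>
float caffe_cpu_strided_dot<float>(const int n, const float* x,
                                   const int incx, const float* y,
                                   const int incy) {
  return cblas_sdot(n, x, incx, y, incy);
}

template <>
double caffe_cpu_strided_dot<double>(const int n, const double* x,
                                     const int incx, const double* y,
                                     const int incy) {
  return cblas_ddot(n, x, incx, y, incy);
}

For example, with both strides set to spatial_dim, a call of the form caffe_cpu_strided_dot(channels, top_diff + i * dim + k, spatial_dim, top_data + i * dim + k, spatial_dim) gives the dot product of top_diff and top_data across channels at spatial position k of sample i, which is the per-position value the backward pass subtracts.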
@ronghanghu I amended your commit with some aesthetic changes (make all the channel kernels have the form …).
Fixed the order of specialization and instantiation for the clang++ build in ac64a7b. You can't call …
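For readers who haven't hit this before: an explicit specialization has to be declared before any use that would otherwise trigger an implicit instantiation; some gcc versions accept the wrong order, but clang++ rejects it. A small stand-alone illustration (hypothetical code, not the PR's):

// Illustration of the ordering rule clang++ enforces for explicit
// specializations.
#include <iostream>

template <typename Dtype>
Dtype halve(Dtype x);

// The specialization is declared (and here defined) before any caller that
// instantiates halve<float>. Moving it below use() makes the program
// ill-formed, and clang++ diagnoses it even where gcc stays silent.
template <>
float halve<float>(float x) { return 0.5f * x; }

float use(float x) { return halve<float>(x); }

int main() {
  std::cout << use(3.0f) << std::endl;  // prints 1.5
  return 0;
}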
Softmax works across channels
In this pull request, the behavior of SoftmaxLayer is changed from softmax over channels*height*width elements (all elements within a num) to softmax over channels elements (all elements at a spatial position within a num). This is for the purpose of running fully-connected layers as convolutions (see Net Surgery: http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/net_surgery.ipynb). It won't damage existing caffe examples, since fully-connected layer top blobs have width==1 and height==1. The CPU version was implemented by @longjon, and I implemented the GPU version, including GPU backward.
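For reference, a minimal sketch (plain C++, not the actual Caffe implementation, which uses BLAS calls and CUDA kernels) of what softmax over channels computes for an N x C x H x W blob:

// Channel softmax: at every sample n and spatial position s = (h, w), the
// C channel values are exponentiated and normalized so they sum to 1.
#include <algorithm>
#include <cmath>
#include <limits>

void channel_softmax(const float* bottom, float* top,
                     int num, int channels, int height, int width) {
  const int spatial_dim = height * width;
  const int dim = channels * spatial_dim;
  for (int n = 0; n < num; ++n) {
    for (int s = 0; s < spatial_dim; ++s) {
      const float* in = bottom + n * dim + s;  // channel c lives at in[c * spatial_dim]
      float* out = top + n * dim + s;
      // Subtract the per-position max for numerical stability.
      float max_val = -std::numeric_limits<float>::infinity();
      for (int c = 0; c < channels; ++c) {
        max_val = std::max(max_val, in[c * spatial_dim]);
      }
      float sum = 0.f;
      for (int c = 0; c < channels; ++c) {
        out[c * spatial_dim] = std::exp(in[c * spatial_dim] - max_val);
        sum += out[c * spatial_dim];
      }
      for (int c = 0; c < channels; ++c) {
        out[c * spatial_dim] /= sum;
      }
    }
  }
}

Before this change, the exp and normalization ran over all channels*height*width values of a sample at once; with this change, only the values in the inner channel loop are normalized together, so each spatial position gets its own probability distribution.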