Nd conv pool #2824
Conversation
This reverts commit c5c8e17a9a604941faad46710e28ab4c8aa5602d.
I checked out this branch, but I couldn't run it because of the error "num_axes() <= 4" in blob.hpp and base_conv_layer.cpp.
The two new layers I added don't use the base_conv or base_pool classes. Can I see your prototxt file?
@WGW101 Thank you for the response. I changed Convolution to NdConvolution, but it shows a "not implemented yet" error. I didn't build with CUDNN; is that what causes the error?
Yes, unfortunately it only works with cuDNN for now. For an implementation of Nd convolution with the Caffe engine, see PR #2049 by Jeff Donahue.
@WGW101 I'm using #2049 and #2442. I think the second one is better. In addition, I'm working on 3D convolution for action classification from video, to extract spatial and temporal features. I'm very confused about how to handle the network blobs, e.g. the weights of conv, pool and ip layers, because matcaffe can't handle ND data. Do you have any idea for this?
@Yeongtae The Python interface is quite easy to understand and very similar to what the Matlab one could look like. Once you load your network, the weights and the biases of each layer are directly accessible. That works for Conv and IP, but not for pooling, as it has no parameters.
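The inline snippets from this comment were lost in the page export. As a sketch of the pycaffe access pattern being described: `net.params` maps a layer name to its parameter blobs, with blob 0 holding the weights and blob 1 the biases. The layer name `conv1` and the shapes below are placeholders; the dict merely stands in for a real `net.params` so the indexing is visible without a Caffe install.

```python
import numpy as np

# With real pycaffe this would be roughly (not runnable without caffe):
#   net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)
#   weights = net.params['conv1'][0].data   # blob 0: the filter weights
#   biases  = net.params['conv1'][1].data   # blob 1: the biases
# A stand-in dict showing the same layer-name -> [weights, biases] layout,
# here for a hypothetical 3D conv layer with 32 filters of shape 1x3x3x3:
params = {
    'conv1': [np.zeros((32, 1, 3, 3, 3), dtype=np.float32),  # weights
              np.zeros((32,), dtype=np.float32)],            # biases
}
print(params['conv1'][0].shape)  # (32, 1, 3, 3, 3)
print(params['conv1'][1].shape)  # (32,)
```

Pooling layers simply never appear in `net.params`, which is why the same trick can't be used for them.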
@WGW101 Following your advice, I have solved my problem. I'm now verifying that the 3D convolution is right, using the convn function in Matlab. After testing, I see a weird result: with ones(n,n,n) as input, the difference between the Caffe result and the Matlab result is the same for every element, 1.0e-06 * -0.4992. So the 3D convolution in this branch and Matlab's differ. Do you have any idea why?
Using imfilter on the region without padding, it shows a very small error, on the order of 1.0e-06 * n.
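A plausible explanation for errors of this size (this is my reading, not something stated in the thread): Caffe accumulates in single precision on the GPU while Matlab's convn works in double precision, and rounding differences between float32 and float64 accumulation naturally land in the 1e-06 range. A minimal sketch of the effect:

```python
import numpy as np

# Accumulate the same values in float32 and float64 and compare.
# The gap is small but nonzero, the same order as the reported error.
rng = np.random.default_rng(0)
k = rng.standard_normal(27)              # e.g. a 3x3x3 kernel, flattened
acc32 = np.float32(0.0)
for v in k.astype(np.float32):           # float32 accumulation, as on GPU
    acc32 = np.float32(acc32 + v)
acc64 = float(np.sum(k))                 # float64 reference
diff = abs(float(acc32) - acc64)
print(diff)                              # small but generally nonzero
```

So a per-element discrepancy of ~1e-06 on its own doesn't indicate that the convolution itself is wrong.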
Hi! I noticed that cuDNN was updated this week (in the v3 RC, cudnnAddTensor was not supported). So I checked with the new release, and this PR works fine after just changing the function cudnnAddTensor to cudnnAddTensor_v3 (in the new API, the second parameter 'mode' was removed). Thanks!
@squall815 Hi! Thanks for your feedback! I'm sorry I wasn't able to test this PR myself with the new version of cuDNN, as my hardware isn't supported by CUDA 7.0 (required for cuDNN v3). I hope I'll be able to resume development of this branch some day: cleaning everything up to pass all tests, adding a CPU / Caffe engine with #2049 and #2442 integrated via the BlobShape message, and keeping the separate layers for the best performance in 2D, etc.
@Yeongtae I trained on volume data, with the input in HDF5 format. When I use matcaffe to parse the caffemodel, I get the error below. Do you know how to solve it?
I just use pycaffe. |
@Yeongtae What about your input data format? Do you use HDF5?
Yes. I used it. |
Do you need some example? |
@Yeongtae Yeah, it couldn't be better! |
Hey @WGW101, I want to know whether your current version supports Nd convolution with cuDNN.
@rockstone533 Hi! Yes, it should if you don't use biases. Sorry I can't test it myself, for hardware incompatibility reasons...
@WGW101 Yeah, I've changed it and my model began to work. However, the speed seems a bit slow. How about your running speed? @squall815
@WGW101, I used this PR but I was not sure about the train-val.prototxt layer settings. Here is what I did so far (I use libcudnn.so.7.0):
pooling_param {
I have a question here. I think I have to increase the number of kernel_size values because my data is 3D (20x20x20), so I have to set 2 more kernel_size entries, but I always get the message: "Error parsing text-format caffe.NetParameter: 47:16: Non-repeated field "kernel_size" is specified multiple times". PR #2049 requires repeating kernel_size in order to train on 3D data, but this PR requires setting kernel_shape instead of repeating kernel_size. So I think I did not get a 3D layer. I could train up to "Iteration 0, Testing net (#0)", but then I got the error below; I think my kernel_shape setting was not correct. F1208 15:44:10.254520 28197 cudnn_ndconv_layer.cu:43] Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM
@ToruHironaka Hi! First, you shouldn't use repeated kernel_size fields: in the current implementation the kernel dimensions are given through the kernel_shape message. Be careful not to confuse the shape of your kernel with the shape of your input.
If any error persists, feel free to ask for help again. Regards
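For reference, a pooling layer for a 20x20x20 volume might look like the sketch below. This is only an illustration pieced together from this thread: the layer type `NdPooling` and the `kernel_shape` field are mentioned above, but the exact message layout (e.g. whether stride uses a matching `stride_shape`) is an assumption, so check the proto definition in this branch before copying it.

```protobuf
layer {
  name: "pool1"
  type: "NdPooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    # 3-D kernel given as a shape message, not as repeated kernel_size
    # (the repeated-kernel_size style belongs to PR #2049, not this PR).
    kernel_shape { dim: 2 dim: 2 dim: 2 }
    # stride_shape is a guessed name; verify it against caffe.proto here.
    stride_shape { dim: 2 dim: 2 dim: 2 }
  }
}
```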
@WGW101, thanks for your reply, I really appreciate it. I tried the layer model below but I got the same problem. <omitted data layer, I use hdf5 datasets > layer { layer { Error: I1209 19:53:55.656716 7984 net.cpp:155] Setting up data I think my pooling layer is causing the above error. My CUDA is 7.0, my cuDNN is v3.0, and I have a Titan X, so my setup should be okay, or I might be missing something such as a path setting. Am I missing something else? I also tried to use "ReLU" with this PR but I could not. Why can't I use the layer type "ReLU" in this PR? I could use it with #2442. Thanks!
@WGW101, I solved it by following @squall815's comment above, but I still have problems with the ReLU layer. Does this PR support Nd LRN? Thanks!
I could train on my HDF5 datasets with this PR of Caffe, but my trainings have never succeeded so far: accuracy = 0.5 or less and loss = 1.7 or above. I think my HDF5 datasets or network settings were wrong. I posted my Python script for creating the HDF5 dataset and my network settings below. Please help me out. My Python script, which converts image files into an HDF5 dataset: def image2HDF5(inputFile, outputDir, fileType, width, height, channel):
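The rest of the script above was lost in the export. For anyone stuck at the same step, here is a minimal sketch of writing 3-D volumes to an HDF5 file for Caffe's HDF5Data layer. The dataset names `data` and `label` are what that layer reads by default; the 5-D (N, C, D, H, W) layout is an assumption based on this branch's ND support, and all the shapes and file names below are made up for illustration.

```python
import h5py
import numpy as np

# Fabricated example data: 10 single-channel 20x20x20 volumes.
n, c, d, h, w = 10, 1, 20, 20, 20
volumes = np.random.rand(n, c, d, h, w).astype(np.float32)
labels = np.arange(n, dtype=np.float32)   # one label per volume

with h5py.File("train.h5", "w") as f:
    f.create_dataset("data", data=volumes)   # blobs named "data"...
    f.create_dataset("label", data=labels)   # ...and "label" in prototxt

# The HDF5Data layer's source is a text file listing .h5 paths:
with open("train.txt", "w") as f:
    f.write("train.h5\n")
```

Checking that the stored `data` dataset really has the shape and dtype the network expects is a quick way to rule out the dataset as the cause of a stuck training.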
Hi!
Following my issue ticket #2671, here is my pull request for nD convolution and pooling using CuDNN primitives.
The nD convolution by itself seems to work, but the bias addition using cudnnAddTensor() returns a NOT_SUPPORTED status.
The nD pooling doesn't work, and returns NOT_SUPPORTED.
Apparently the nD pooling descriptor might only be a placeholder in this version, so this might work with a future version of cuDNN...
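For context, the shape bookkeeping an nD layer has to do is the same along every spatial axis, which is what makes the generalization from 2D natural. A small illustrative sketch (not code from this PR) of the standard output-size formula applied over an arbitrary number of axes:

```python
def conv_output_shape(in_shape, kernel, pad, stride):
    """Standard per-axis output size for convolution or pooling:
    out = (in + 2*pad - kernel) // stride + 1, for any number of axes."""
    return tuple((i + 2 * p - k) // s + 1
                 for i, k, p, s in zip(in_shape, kernel, pad, stride))

# 3-D convolution: a 20x20x20 volume, 5x5x5 kernel, no padding, stride 1.
print(conv_output_shape((20, 20, 20), (5, 5, 5), (0, 0, 0), (1, 1, 1)))
# (16, 16, 16)

# 3-D pooling: same volume, 2x2x2 window, stride 2.
print(conv_output_shape((20, 20, 20), (2, 2, 2), (0, 0, 0), (2, 2, 2)))
# (10, 10, 10)
```

The cuDNN Nd descriptors take exactly this kind of per-axis dimension, padding and stride arrays, so a shape mismatch in any one axis is enough to trigger a BAD_PARAM status.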
I inherit my layers directly from the Layer class, not from BaseConvolutionLayer and BasePoolingLayer, to avoid modifying any existing (and working...) features.
The major drawback of this approach is that it can't fall back on other engines if cuDNN is not supported by the user's configuration. But as I declared a LayerFactory entry for NdConvolution and NdPooling, it should be relatively easy to fix this behaviour.
Don't hesitate to give me feedback on these two new layers,
and to share any new insight about why it doesn't work.
I'm already aware of PR #2049 for nD convolution, but it is still missing nD pooling (actually I only need 3D pooling in my application).
Cheers,