
warning: unused function 'IndexToOffset_999_get' #45

Open
pamio opened this issue Mar 27, 2017 · 5 comments

pamio commented Mar 27, 2017

I'm using clnn to train a ResNet model on an Intel GPU.
When the training starts, I see the warnings below.

 THClReduce.cl build log:
 <program source>:48:28: warning: unused function 'IndexToOffset_999_get'
  static inline unsigned int IndexToOffset_999_get(unsigned int linearId, global const TensorInfoCl *info) {
                             ^

 THClReduce.cl build log:
 <program source>:67:19: warning: unused function 'IndexToOffset_999_get'
  static inline int IndexToOffset_999_get(int linearId, global const TensorInfoCl *info) {
                    ^

 THClReduceAll.cl build log:
 <program source>:51:28: warning: unused function 'IndexToOffset_999_get'
  static inline unsigned int IndexToOffset_999_get(unsigned int linearId, global const TensorInfoCl *info) {
                             ^
 <program source>:66:28: warning: unused function 'getLinearBlockId'
  static inline unsigned int getLinearBlockId() {
                             ^

This is what I'm doing:

  if opt.backend == 'cl' then
      require 'clnn'
      require 'cltorch'
      net = net:cl()
      -- cudnn.convert(net, cudnn) -- converts the net to cudnn (CUDA only)
      -- What is the equivalent of cudnn.convert for clnn?
      criterion = criterion:cl()
  end

Is the above code right? Is there anything else I need to do in order to use my Intel GPU?

Also, I see train Loss: nan, which should be a number. Should I also convert the training loss value to cl?

What else needs to be converted to cl?
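For context, here is the fuller conversion sketch I'm working from. This is only a sketch assuming the standard torch/nn, cltorch/clnn, and cudnn APIs; `opt`, `net`, and `criterion` are defined earlier in my training script:

```lua
-- Sketch: backend selection, mirroring the cudnn branch for OpenCL.
-- Assumes opt.backend, net, and criterion already exist.
if opt.backend == 'cl' then
    require 'cltorch'            -- OpenCL tensor backend
    require 'clnn'               -- OpenCL nn modules
    net = net:cl()               -- move model parameters to the GPU
    criterion = criterion:cl()   -- move the loss module as well
elseif opt.backend == 'cudnn' then
    require 'cunn'
    require 'cudnn'
    net = net:cuda()
    cudnn.convert(net, cudnn)    -- swap nn modules for cudnn equivalents
    criterion = criterion:cuda()
end

-- Inputs and targets must also be moved inside the training loop, e.g.:
-- input, target = input:cl(), target:cl()
```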

Best,
Pramod


hughperkins commented Mar 27, 2017 via email


pamio commented Mar 27, 2017

That's right! My question was: with cudnn you convert your model like this - cudnn.convert(net, cudnn). How do I convert with OpenCL? Right now I'm just doing net = net:cl(). Are there any additional steps, like cudnn has?


pamio commented Mar 27, 2017

I guess I got it. Looks like what I'm doing is fine. I saw an example here: Element-Research/rnn#41. What I can't understand is that the loss is nan, and my test accuracy is way below expectation even after 60 epochs. It's 2.8% :(
I'm using OpenCL because the model runs out of memory soon after I start training on CPU. And on GPU I get these problems: nan loss and very low test accuracy.
Input image sizes are 224x224.
Any suggestions?
PS: I'm trying out different networks: VGG, ResNet, and AlexNet. I could achieve an accuracy of 78% with VGG on CPU when the image sizes were 48x48.


pamio commented Mar 27, 2017

Looks like this is causing the issue: local loss = self.criterion:forward(output, target). When I print the loss, it shows inf (infinity), which is why I get nan (there's a division by the number of epochs). Any ideas on this?
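To illustrate what I mean (plain Lua arithmetic, nothing Torch-specific): a single inf loss term, e.g. from log(0) inside a criterion, poisons any running sum or average built from it.

```lua
-- Plain Lua: how one inf loss term propagates through an average.
local losses = {0.7, 0.5, math.huge}   -- math.huge is inf, e.g. from log(0)
local sum = 0
for _, l in ipairs(losses) do
    sum = sum + l                      -- inf + anything finite stays inf
end
print(sum / #losses)                   -- average becomes inf, not a number
print(math.huge - math.huge)           -- inf - inf is nan (e.g. in running stats)
```

So once a single batch produces an inf loss, every averaged quantity downstream shows up as inf or nan.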


pamio commented Mar 27, 2017

The interesting thing is: when I run the same code on CPU, I get the loss just fine (except that after a few minutes it goes out of memory). Not sure what I'm missing here. This is the code I'm using: https://github.com/chsasank/plantvillage-challenge
