Training too slow and not using full GPU, what's the training time? #2

shubhank008 · 2019-05-28T20:22:23Z

Using default parameters and all 300 categories, I feel it's training quite slowly even though I am using an AWS EC2 p2.xlarge instance with an Nvidia K80 GPU.

It's using only 360 MB of GPU memory and seems stuck at that figure; the usage never goes above or below it (checked via the nvidia-smi command).

I timed each iteration at 5-7 seconds. Multiplying that by the total number of iterations over 20 epochs gives more than 150 days.
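For reference, the estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the thread does not state the number of iterations per epoch, so the 108,000 below is a hypothetical value back-computed from "about 6 seconds per iteration, 20 epochs, 150 days", and `estimate_training_days` is an illustrative helper, not something from the repo.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_training_days(iterations_per_epoch: int,
                           seconds_per_iteration: float,
                           epochs: int) -> float:
    """Total wall-clock training time in days, assuming a constant iteration time."""
    total_seconds = iterations_per_epoch * seconds_per_iteration * epochs
    return total_seconds / SECONDS_PER_DAY


# Hypothetical numbers: 108,000 iterations per epoch is an assumption,
# 6.0 s is the midpoint of the reported 5-7 s, 20 epochs is from the post.
days = estimate_training_days(108_000, 6.0, 20)
print(f"{days:.1f} days")
```

Plugging in the real iteration count printed by the training script would give the actual figure.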

vietnh1009 · 2019-07-23T23:38:10Z

From your screenshot, I would say that you didn't use the GPU. Could you try printing a message after the line "if torch.cuda.is_available():"?
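A minimal sketch of that check, assuming a standard PyTorch install. The `pick_device` helper and the tiny `Linear` model are illustrative stand-ins, not code from the repo; the point is to print which branch is taken and to confirm where the model's parameters actually live.

```python
import torch


def pick_device() -> torch.device:
    """Report whether CUDA is visible to PyTorch and return the device to use."""
    if torch.cuda.is_available():
        # This message only appears when PyTorch can actually see a GPU.
        print("CUDA available:", torch.cuda.get_device_name(0))
        return torch.device("cuda")
    print("CUDA not available, falling back to CPU")
    return torch.device("cpu")


device = pick_device()

# Stand-in model (an assumption for illustration); the real model would be moved the same way.
model = torch.nn.Linear(4, 2).to(device)
print("Model parameters are on:", next(model.parameters()).device)
```

If the second line of output says `cpu` while nvidia-smi shows a python process, the process may hold a CUDA context without the model or batches having been moved to the GPU.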

shubhank008 · 2019-07-24T06:36:59Z

> From your screenshot, I would say that you didn't use the GPU. Could you try printing a message after the line "if torch.cuda.is_available():"?

I was using the GPU. The second image shows the python process running on the GPU; it was just using only 300-400 MB of GPU memory instead of the full 12 GB, hence the issue.

I have since deleted my project; with no reply, I thought this was a dead repo.

vietnh1009 · 2019-07-24T07:03:17Z

Sorry for that. I have been quite busy over the last few months.
About the GPU problem: if you want to use more of the GPU, you need to increase either the batch size or the complexity of your model. Having a 12 GB GPU does not mean that every model run on your machine will use all of it.
Best,
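To illustrate the batch-size point, here is a small sketch using PyTorch's `DataLoader`. The dataset is a hypothetical stand-in (1,000 blank 28x28 tensors with random labels), not the repo's data pipeline, and the batch sizes are arbitrary. A larger batch puts more samples on the GPU per step and reduces the number of steps per epoch; how much faster an epoch gets in practice depends on the hardware and on how quickly data can be loaded.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 1,000 fake grayscale drawings, 300 classes.
images = torch.zeros(1000, 1, 28, 28)
labels = torch.randint(0, 300, (1000,))
dataset = TensorDataset(images, labels)


def batches_per_epoch(batch_size: int) -> int:
    """Number of iterations needed to see the whole dataset once."""
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    return len(loader)


small = batches_per_epoch(32)
large = batches_per_epoch(256)
print(f"batch_size=32 -> {small} iterations, batch_size=256 -> {large} iterations")
```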

vietnh1009 · 2019-07-24T10:36:53Z

Btw, if you want to train a model with 300 categories, increasing the model's width and depth is indispensable.

shubhank008 · 2019-07-24T12:28:41Z

> Btw, if you want to train a model with 300 categories, increasing the model's width and depth is indispensable.

I'm new to all these terms. Can you give me the parameters to use for the whole dataset?
I would love to try this repo again to see whether it works on the whole dataset and how accurate it is.

Also, I wanted to try it on saved image files: load a saved .png or .jpg file of a drawing, feed it to the network, and get the result. All the repos I have come across use a webcam, JavaScript, or recorded drawing coordinates; none works from saved image files (whether hand drawings or the Quick Draw dataset saved as images).

I tried pre-processing the images to feed them to the network, but the results are not the same.
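One possible way to prepare a saved drawing, sketched with Pillow and NumPy. Everything here is an assumption rather than the repo's actual pipeline: the 28x28 input size, the light-strokes-on-dark-background convention (which is how the Quick Draw bitmap files are stored), and the scaling to [0, 1]. A mismatch in any of these between training and inference is a common reason for results that "are not the same". `load_drawing` is a hypothetical helper name.

```python
import numpy as np
from PIL import Image, ImageOps


def load_drawing(path: str, size: int = 28, invert: bool = True) -> np.ndarray:
    """Load an image file and return a float32 array of shape (1, 1, size, size)."""
    img = Image.open(path).convert("L")  # force single-channel grayscale
    if invert:
        # Typical scans are dark ink on white paper; flip to light strokes on black.
        img = ImageOps.invert(img)
    img = img.resize((size, size), Image.BILINEAR)
    # Assumed normalization: scale 0..255 to 0..1. Match whatever training used.
    arr = np.asarray(img, dtype=np.float32) / 255.0
    # Add batch and channel dimensions: (H, W) -> (1, 1, H, W).
    return arr[None, None, :, :]
```

The returned array can be converted with `torch.from_numpy(...)` and moved to the same device as the model before the forward pass.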
