CPU only support + timeline trace support + training improvements from other pull requests #28

ahmedammar · 2016-11-19T22:03:06Z

Relates to #24

ahmedammar · 2016-11-19T22:04:39Z

I haven't had a chance to test it on a machine with a GPU yet, but it should be ok.

ahmedammar · 2016-11-21T18:50:57Z

With both patches from @philokey and @ICapalija I am able to train much more efficiently now. N.B. I had to modify @ICapalija's patch slightly above.

This isn't using VGG16, so your mileage may vary:

iter: 67900 / 70000, total loss: 0.6029, rpn_loss_cls: 0.0366, rpn_loss_box: 0.2336, loss_cls: 0.0485, loss_box: 0.2843, lr: 0.000100
speed: 0.126s / iter
iter: 67910 / 70000, total loss: 0.9322, rpn_loss_cls: 0.0310, rpn_loss_box: 0.1026, loss_cls: 0.2336, loss_box: 0.5650, lr: 0.000100
speed: 0.126s / iter
iter: 67920 / 70000, total loss: 0.6732, rpn_loss_cls: 0.0729, rpn_loss_box: 0.2227, loss_cls: 0.1087, loss_box: 0.2690, lr: 0.000100
speed: 0.126s / iter
iter: 67930 / 70000, total loss: 0.9137, rpn_loss_cls: 0.0306, rpn_loss_box: 0.2416, loss_cls: 0.1938, loss_box: 0.4477, lr: 0.000100
speed: 0.126s / iter
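
One way to check that the per-iteration speed stays flat across a run, rather than reading it off by eye, is to parse the log. The following is a minimal sketch, assuming the two-line `iter: ...` / `speed: ...` record format shown above; the function names `parse_training_log` and `speed_drift` are hypothetical helpers, not part of this repository.

```python
import re

# Patterns for the two-line records in the training log
# (format assumed from the sample output above).
ITER_RE = re.compile(r"iter: (\d+) / (\d+), total loss: ([\d.]+)")
SPEED_RE = re.compile(r"speed: ([\d.]+)s / iter")


def parse_training_log(text):
    """Return a list of (iteration, total_loss, seconds_per_iter) tuples."""
    records = []
    pending = None  # (iteration, total_loss) waiting for its speed line
    for line in text.splitlines():
        match = ITER_RE.search(line)
        if match:
            pending = (int(match.group(1)), float(match.group(3)))
            continue
        match = SPEED_RE.search(line)
        if match and pending is not None:
            records.append((pending[0], pending[1], float(match.group(1))))
            pending = None
    return records


def speed_drift(records):
    """Seconds-per-iteration of the last record minus that of the first."""
    if len(records) < 2:
        return 0.0
    return records[-1][2] - records[0][2]


# Two records copied from the log above.
SAMPLE = """\
iter: 67900 / 70000, total loss: 0.6029, rpn_loss_cls: 0.0366, rpn_loss_box: 0.2336, loss_cls: 0.0485, loss_box: 0.2843, lr: 0.000100
speed: 0.126s / iter
iter: 67910 / 70000, total loss: 0.9322, rpn_loss_cls: 0.0310, rpn_loss_box: 0.1026, loss_cls: 0.2336, loss_box: 0.5650, lr: 0.000100
speed: 0.126s / iter
"""

records = parse_training_log(SAMPLE)
drift = speed_drift(records)
```

A drift close to zero over many iterations would be consistent with the slowdown being gone; a steadily growing value would suggest ops are still being added to the graph.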

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48                 Driver Version: 367.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 0000:00:1E.0     Off |                    0 |
| N/A   82C    P0   120W / 149W |  10941MiB / 11439MiB |     78%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      2815    C   python                                       10937MiB |
+-----------------------------------------------------------------------------+

ahmedammar force-pushed the master branch 2 times, most recently from 0d28d15 to 5c809ca on November 19, 2016 23:11

ahmedammar changed the title ~~Add support for cpu-only mode.~~ Add support for cpu-only mode. Also enable use of TF's work sharders. Nov 19, 2016

ahmedammar mentioned this pull request Nov 19, 2016

Specify which GPU to use #20

Closed

ahmedammar force-pushed the master branch from 5c809ca to fdc3214 on November 20, 2016 03:28

ahmedammar mentioned this pull request Nov 20, 2016

Fail to build in CPU only environment #24

Open

ahmedammar force-pushed the master branch 2 times, most recently from 6d8adf2 to a6db9e0 on November 21, 2016 03:23

philokey and others added 2 commits November 21, 2016 18:25

Update the way of decaying the learning rate.

7fded7e

Avoid the problem of training speed become slower iteration by iteration.

Stop adding new ops to the graph object with every snapshot and lr change.

b520cff

ahmedammar force-pushed the master branch from a6db9e0 to 0a0abf8 on November 21, 2016 16:26

ahmedammar changed the title ~~Add support for cpu-only mode. Also enable use of TF's work sharders.~~ Add support for cpu-only mode. Also enable use of TF's work sharders. And includes both other pull requests. Nov 21, 2016

ahmedammar mentioned this pull request Nov 21, 2016

Stop adding new ops to the graph object with every snapshot and lr change. #16

Open

ahmedammar force-pushed the master branch from 0a0abf8 to 6d1e570 on November 21, 2016 17:09

ahmedammar changed the title ~~Add support for cpu-only mode. Also enable use of TF's work sharders. And includes both other pull requests.~~ CPU only support and training improvements from other pull requests Nov 21, 2016

ahmedammar force-pushed the master branch from 6d1e570 to 7109302 on November 21, 2016 19:12

ahmedammar mentioned this pull request Nov 21, 2016

Can this code run in multiple GPUs? #29

Closed

ahmedammar changed the title ~~CPU only support and training improvements from other pull requests~~ CPU only support + timeline trace support + training improvements from other pull requests Nov 21, 2016

Ahmed Ammar added 2 commits November 22, 2016 12:18

Add support for cpu-only mode. Also enable use of TF's work sharders.

4fe957a

Add support for timeline generation.

3e81c38

ahmedammar force-pushed the master branch from 7109302 to 3e81c38 on November 22, 2016 10:18

smallcorgi merged commit 25091bb into smallcorgi:master Nov 24, 2016
