Tensorflow work for DIGITS by Ethan (#7)
* Fix visualization when palette is None (NVIDIA#1177)

The palette may be `None` when working with grayscale labels.
Fix NVIDIA#1147

* Bugfix for customizing previous models (NVIDIA#1202)

* [Packaging] Disable tests (NVIDIA#1227)

* [Tests] Skip if extension not installed (NVIDIA#1263)

* [Docs] Fix spelling errors in comments

* [Docs] Add note about torch pkg and cusparse (NVIDIA#1303)

* [Caffe] Fix batch accumulation bug (NVIDIA#1307)

* Use official NVIDIA model store by default (NVIDIA#1308)

* Mark v5.0.0

* [Packaging] Pull latest docker image before build

* bAbI data plug-in

Add utils

Add inference form to bAbI dataset

Allow inference without answer

Allow unknown words in bAbI data plug-in

Fix bAbI plugin Lint errors

* Tensorflow integration updates

Use TFRecords for TF inference

TF: Don't rescale inputs

Fix some TF classification tests

Remove unnecessary print

Fix TF imports when uninstalled

Fix mean image scale

Fix generic model tests

Fix Torch single image inference

Fix inference

TMP TF Lint

Revert changes in digits-lint script

Lint: ignore tensorflow standard examples

More Lint fixes

* Add .pgm to list of supported image file formats

* Restrict usage of cmap to labels DB in generic dataset exploration

fix NVIDIA#1322

* Update Object Detection example doc (NVIDIA#1323)

* [TravisCI] Cache local OpenBLAS build

This fixes a Torch bug we've been having on Travis for a while now.

We had only been building OpenBLAS from source when there was no cached
torch build present on the build machine. That meant you could get a
cached build of Torch which was built against one version of OpenBLAS,
but the system actually installed an older version. This led to memory
corruption and segmentation faults.

* [Tests] Skip if extension not installed (part 2) (NVIDIA#1337)

* [TravisCI] Install all plugins by default

Also test no plugins

* [Tests] Skip if extension not installed (NVIDIA#1337)

* Add gradient hook

* Add memn2n model

* [Docs] Update model store documentation (NVIDIA#1346)

TODO: add a screenshot of the official model store once approved

* Add steps to specify the Python layer file (NVIDIA#1347)

* [Docs] Install minimal boost libs for caffe

* Update memn2n with gradient hooks

* Remove the selenium walkthrough

* GAN example

* Make batch size variable

* Training/inference paths

* Small update to TF 0.12

* Snapshot names, float inference, restore all vars

* Update copyright year for 2017

* Add a few missing copyright notices

* Fix Siamese example

Broadcast -1 into all elements that equal 0 in original label.

* Fix Siamese example (NVIDIA#1405)

Broadcast -1 into all elements that equal 0 in original label.
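
The relabeling described in the two Siamese entries above can be sketched in NumPy. This is only an illustration of the stated operation (every 0 in the original label becomes -1, everything else is kept); the function name `remap_siamese_labels` is hypothetical, and the example code touched by the commit may be written differently.

```python
import numpy as np


def remap_siamese_labels(labels):
    """Return a copy of `labels` where every element equal to 0 is -1.

    Hypothetical helper for illustration; not taken from the DIGITS source.
    """
    labels = np.asarray(labels)
    # np.where broadcasts the scalar -1 across all positions where the
    # condition holds and keeps the original value elsewhere.
    return np.where(labels == 0, -1, labels)


original = np.array([1, 0, 0, 1])
remapped = remap_siamese_labels(original)
```

Here `original` is left unchanged and `remapped` holds the labels with 0 replaced by -1.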

* [Packaging] Make nginx site easier to customize

* Do not restore global_step or optimizer variables

* Add TB link

* Update GAN network

* Dynamically select inference form

* TF inference: convert images to float

* Update GAN z-gen network

* Small Update model view layout

* Add GAN plug-ins

* Fix documentation typo: train.txt and test.txt were swapped and shown
in the wrong folders for the mnist and cifar10 data sets.

* Update GAN plug-in to create CelebA dataset

* Document a cuDNN workaround for text example (NVIDIA#1422)

* Add ability to show input in ImageOutput extension

* Add all data to raw data view extension

* Add model for CelebA dataset

* Update GAN data plug-in

* Update all losses in one session

* Remove conversion to .png in GAN data plug-in

* Correct shebang for prepare_pascal_voc_data.sh (NVIDIA#1450)

* [Docs] Document workaround for torch+hdf5 error

* Fix typo in ModelStore.md

* Fix typo in medical-imaging/README.md

* TF Slim Lenet example

Divide input by 255

* Update GAN data plug-in

* Fix TF model snapshot

* Reduce scheduler delays to speed up inference

* Update GAN plugins

* Fix TF tests

* Add API to LmdbReader (used by gan_features.py)

* Save animated gif

* Add GAN walk-through

* Update GAN walkthrough with embeddings video

* Fix GAN view for list encoding

* Fix bash lint with shellcheck

* Fix bugs when visiting nested image folder

* Add animation task to GAN plugins

* Fix shellcheck-related bug in PPA upload script

* Add view task to see image attributes

* Copy labels.txt inside the dataset

Move import to the top

* Fix Distribution Graph

Move backwards-compatibility to setstate

* Fix typo in Sunnybrook plug-in

* Add comments to GAN models

* Update README

* Fix GAN features script

* Fix a bug introduced when fixing shellcheck lint

* GAN app

* Fix another shellcheck-related bug

* Fix table formatting in README.md

Fix table formatting

* Fix DIGITS inference

* Adjust GAN window size automatically

* Add attributes to GAN app

* Move gandisplay.py

* Remove wxpython 3.0 selection

* Fix call to model

* Clamp distance values from segmentation boundaries before being
converted to uint8. That was causing banding in the image because of
wrapping at V % 256
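
The banding described in the entry above comes from integer wrap-around when values above 255 are cast to uint8. A minimal NumPy sketch of the difference between casting directly and clamping first is given here; the sample values and the helper name `to_uint8_clamped` are assumptions made for illustration and are not taken from the DIGITS code.

```python
import numpy as np


def to_uint8_clamped(values):
    """Clamp values to [0, 255] before converting to uint8.

    Hypothetical helper for illustration; not taken from the DIGITS source.
    """
    return np.clip(values, 0, 255).astype(np.uint8)


# Integer distances, some of which exceed the uint8 range.
distances = np.array([0, 100, 255, 256, 300], dtype=np.int64)

# Casting directly keeps only the low 8 bits, i.e. V % 256, so values
# just above 255 become dark again and produce visible bands.
wrapped = distances.astype(np.uint8)

# Clamping first saturates large distances at 255 instead.
clamped = to_uint8_clamped(distances)
```

With these sample values, 256 and 300 wrap to 0 and 44 in `wrapped`, while both saturate at 255 in `clamped`.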

* lint

* [Docs] 5.0 debs and Ubuntu 16.04 support

* Adding disclaimer

* Display the filename of the image that caused the exception while
loading.

* Ported DIGITS to using tensorflow 1.1.0.

* Ported DIGITS to using tensorflow 1.1.0.
Got master branch working

* Fix softmax visualization by scaling to image range

* added the official store image and updated the documentation

* added the official store image and updated the documentation (NVIDIA#1650)

* [TravisCI] Add `git fetch --unshallow` for DIST

Useful for TravisCI builds in forks.

* updated gitignore

* first cherrypick for installation scripts

* Tf install experimental (#2)

* Fix visualization when palette is None (NVIDIA#1177)

The palette may be `None` when working with grayscale labels.
Fix NVIDIA#1147

* Bugfix for customizing previous models (NVIDIA#1202)

* [Packaging] Disable tests (NVIDIA#1227)

* [Tests] Skip if extension not installed (NVIDIA#1263)

* [Docs] Fix spelling errors in comments

* [Docs] Add note about torch pkg and cusparse (NVIDIA#1303)

* [Caffe] Fix batch accumulation bug (NVIDIA#1307)

* Use official NVIDIA model store by default (NVIDIA#1308)

* Mark v5.0.0

* [Packaging] Pull latest docker image before build

* Add .pgm to list of supported image file formats

* Restrict usage of cmap to labels DB in generic dataset exploration

fix NVIDIA#1322

* Update Object Detection example doc (NVIDIA#1323)

* [TravisCI] Cache local OpenBLAS build

This fixes a Torch bug we've been having on Travis for a while now.

We had only been building OpenBLAS from source when there was no cached
torch build present on the build machine. That meant you could get a
cached build of Torch which was built against one version of OpenBLAS,
but the system actually installed an older version. This led to memory
corruption and segmentation faults.

* [Tests] Skip if extension not installed (part 2) (NVIDIA#1337)

* [TravisCI] Install all plugins by default

Also test no plugins

* [Tests] Skip if extension not installed (NVIDIA#1337)

* [Docs] Update model store documentation (NVIDIA#1346)

TODO: add a screenshot of the official model store once approved

* Add steps to specify the Python layer file (NVIDIA#1347)

* [Docs] Install minimal boost libs for caffe

* Remove the selenium walkthrough

* Update copyright year for 2017

* Add a few missing copyright notices

* Fix Siamese example

Broadcast -1 into all elements that equal 0 in original label.

* Fix Siamese example (NVIDIA#1405)

Broadcast -1 into all elements that equal 0 in original label.

* [Packaging] Make nginx site easier to customize

* Fix documentation typo: train.txt and test.txt were swapped and shown
in the wrong folders for the mnist and cifar10 data sets.

* Document a cuDNN workaround for text example (NVIDIA#1422)

* Correct shebang for prepare_pascal_voc_data.sh (NVIDIA#1450)

* [Docs] Document workaround for torch+hdf5 error

* Fix typo in ModelStore.md

* Fix typo in medical-imaging/README.md

* Fix bash lint with shellcheck

* Fix bugs when visiting nested image folder

* Fix shellcheck-related bug in PPA upload script

* Copy labels.txt inside the dataset

Move import to the top

* Fix Distribution Graph

Move backwards-compatibility to setstate

* Fix typo in Sunnybrook plug-in

* Fix a bug introduced when fixing shellcheck lint

* Fix another shellcheck-related bug

* Fix table formatting in README.md

Fix table formatting

* Clamp distance values from segmentation boundaries before being
converted to uint8. That was causing banding in the image because of
wrapping at V % 256

* lint

* [Docs] 5.0 debs and Ubuntu 16.04 support

* WIP lint fix

* Linted most of what I can lint prior to asking for context

* updated the model store urls in the readme

* added debugs in build scripts to understand the point of failure

* added travis wait to install openblas

* removed tensorflow from the build process to see if it affects openblas

* removed suppressing log contents

* added set -x

* fixed control

* re-enabling tensorflow to see if travis builds

* updated the version of numpy to ensure a stable build for travis wrt
open issue 8653 on numpy github

* forcing numpy to v 1.8.1

* added the official store image and updated the documentation (NVIDIA#1650)

* [TravisCI] Add `git fetch --unshallow` for DIST

Useful for TravisCI builds in forks.

* Got travis script to work for tensorflow installation

* removed the open blas stuff that somehow made it into here

* embarrassing merge residue

* force install specific numpy version because 1.13 was being installed

* asdf

* trying changing the tensorflow install

* reordered the installation order to see if it builds due to TF using
numpy 1.13 now

* Cleaning installation to work with Numpy 1.13 upgrade

removed the open blas stuff that somehow made it into here

embarrassing merge residue

force install specific numpy version because 1.13 was being installed

asdf

trying changing the tensorflow install

reordered the installation order to see if it builds due to TF using
numpy 1.13 now

* Tf example (#3)

* initial work on autoencoder TF example

* Moved the example files to their proper location

* attempting to get autoencoder to work

* autoencoder work

* validated tensorflow autoencoder example

* updated gitignore

* disabled comments in the segmentation-model.lua script to prevent
crashing

* committing the changes made to binary segmentation tf

* adding work to do something else

* I am seriously way too tired to write this commit message, it's just
random bits of stuff

* got binary seg and siamese working

* started to work on the regression network

* milestone

* got regression for TF working

* Got fine tuning to work in TF

* changed the code to the format that is wanted by tim and greg

* Finished all the work for examples

initial work on autoencoder TF example

Moved the example files to their proper location

attempting to get autoencoder to work

autoencoder work

validated tensorflow autoencoder example

updated gitignore

disabled comments in the segmentation-model.lua script to prevent
crashing

committing the changes made to binary segmentation tf

adding work to do something else

I am seriously way too tired to write this commit message, it's just
random bits of stuff

got binary seg and siamese working

rebase

rebase

started to work on the regression network

milestone

got regression for TF working

Got fine tuning to work in TF

changed the code to the format that is wanted by tim and greg

got fine tuning working

* Some small fixes

* changes WRT PR

trying renaming the weights

tested renaming variables

* fixed api problem for multi gpus

* changes to example documentation

* git removed installing tests

* updated most of linting

* Removed unused block of code as per suggestion by Greg

* Removing spaces...

* Script changes for tensorflow (#1)

* Basic Tensorflow Support

Added some initial tf tools

Implemented UI

Fixes for tensorflow 0.10

Removed tf-slim as it's not part of the 0.10 master

Added the lmdb reader with a tf.cond that needs replacement

Implemented train and val separation with templating

Fixed issue with dequeueing both runners by pulling both graphs

Implemented training and validation rhythm

Added support for both png and jpg and added 16 bit support

Implemented mean subtraction - but needs rework to load as constant

Added an optimized implementation of mean subtraction

Further optimized the mean loading by using a shared constant

Wrapped the data loader in a factory to easily support more data types

Implemented cropping

Implemented floating point support. Implemented separate LMDB database.
Implemented regression support. Added some brief nosetests. Need to
invoke accuracy only on classification though.

Implemented variable restoration. Needs thorough testing

Implemented inferencing, not entirely polished

Moved some code into functions, started on modularization a bit

Implemented digits custom helper functions

Implemented custom printing ops

Implemented autoencoder

total rewrite of summaries

Implemented output to console from scalar summaries

Fixes for summary outputs: only simple scalar values are parsed to
console

Implemented binary segmentation and necessary fixes

Some updates on binary seg

Implemented all possible optimizers and started work on learning rate
shaper

Started work on the lr policies

Fixes for learning_rates, implemented optimizers, tested variable
summary output to UI

Implemented and tested all learning rates and optimizers

Introduces new model definition and improvements in loss handling and
graph layout

Major refactoring of main code. Implemented new model description.
Implemented and tested inferencing. Implemented and tested
weight/snapshot loading.

All-round minor updates and fixes

Fixes in summary cumulator and implemented an RNN model

Fixes for mean subtraction in tf and tf-ui, implemented data order
selection in image-view extension

Implemented support for mean files in png, jpg, and binaryproto format, the
latter being the default that DIGITS will provide.

Added support for runtime statistics and some allround fixes

Added static tensorboard style network visualization for tensorflow.
Added output of traces (no vis yet). Added a loader while waiting for
network vis. Minor syntax cleanups.

Implemented alexnet standard network

Pulled in updates for travis build and added tensorflow install

Added two more files for Mr Travis

Implemented tensorflow configuration

Added tf config to doc

Fixes for ubuntu deployment of tf.

Moved tf tools

Fixes for tf ubuntu

Fixes for tf ubuntu

Some fixes and updates for TF in Travis

Fix in network viz test

Implemented default single-gpu support and some nosetests

Fixes for inference

Added siamese network, bugfixes, minor features, some utility tf
functions

Added siamese network and example png

Better error-ui format for network viz

Added an alternative simpler siamese network that doesn't need a separate
db, minor error update

Preliminary version of hdf5 implemented

Implemented fine-tuning by renaming variables

Implemented visualisation of variables and the activations of the Ops
they belong to.

Fix in inf vis naming

Fixes in visualisation shapes and naming

Implemented softmax upon classification

Implemented all nosetests for tf classification, and many allround
bugfixes

Implemented generic nosetests - some need work

Fix for travis to find python exe

Implemented a better file format deducer, and implemented a bare minimal
TFRecord-reader

Added top_n accuracy shortcut

Implemented on-line data augmentation for TF, 5 types. Some minor
bugfixes. Need to do something with image whitening though during
validation and inf..

Added tensorflow data augmentation test

Minor fixes and improvements from linter

Implemented minimal and bare multigpu and fixes to get it running for
greg

Preliminary version of tfrecord writer for classification

Some changes to optimize dataloading for tfr

More fixes for tfrecords

Fix generic data loading

Minor breaking changes but updates in namescoping

Implemented new model structure. Improvements to multi-gpu handling.
Updates to namespaces. Implemented accounting for regularization. Many
allround updates

Implemented proper visualisation for gpu devices

Minor updates and converted alexnet and vgg16 to new format

Fix in tfrecord shape

WIP on timeline traces

Finalized support for tensorflow timeline traces

Fixed alexnet for tf

Fix merge errors

Minify tf-graph-basic.build.js

* bAbI data plug-in

Add utils

Add inference form to bAbI dataset

Allow inference without answer

Allow unknown words in bAbI data plug-in

Fix bAbI plugin Lint errors

* Tensorflow integration updates

Use TFRecords for TF inference

TF: Don't rescale inputs

Fix some TF classification tests

Remove unnecessary print

Fix TF imports when uninstalled

Fix mean image scale

Fix generic model tests

Fix Torch single image inference

Fix inference

TMP TF Lint

Revert changes in digits-lint script

Lint: ignore tensorflow standard examples

More Lint fixes

* Add gradient hook

* Add memn2n model

* Update memn2n with gradient hooks

* GAN example

* Make batch size variable

* Training/inference paths

* Small update to TF 0.12

* Snapshot names, float inference, restore all vars

* Do not restore global_step or optimizer variables

* Add TB link

* Update GAN network

* Dynamically select inference form

* TF inference: convert images to float

* Update GAN z-gen network

* Small Update model view layout

* Add GAN plug-ins

* Update GAN plug-in to create CelebA dataset

* Add ability to show input in ImageOutput extension

* Add all data to raw data view extension

* Add model for CelebA dataset

* Update GAN data plug-in

* Update all losses in one session

* Remove conversion to .png in GAN data plug-in

* TF Slim Lenet example

Divide input by 255

* Update GAN data plug-in

* Fix TF model snapshot

* Reduce scheduler delays to speed up inference

* Update GAN plugins

* Fix TF tests

* Add API to LmdbReader (used by gan_features.py)

* Save animated gif

* Add GAN walk-through

* Update GAN walkthrough with embeddings video

* Fix GAN view for list encoding

* Add animation task to GAN plugins

* Add view task to see image attributes

* Add comments to GAN models

* Update README

* Fix GAN features script

* GAN app

* Fix DIGITS inference

* Adjust GAN window size automatically

* Add attributes to GAN app

* Move gandisplay.py

* Remove wxpython 3.0 selection

* Fix call to model

* Adding disclaimer

* Ported DIGITS to using tensorflow 1.1.0.

* Ported DIGITS to using tensorflow 1.1.0.
Got master branch working

* updated gitignore

* first cherrypick for installation scripts

* Tf install experimental (#2)

* Fix visualization when palette is None (NVIDIA#1177)

The palette may be `None` when working with grayscale labels.
Fix NVIDIA#1147

* Bugfix for customizing previous models (NVIDIA#1202)

* [Packaging] Disable tests (NVIDIA#1227)

* [Tests] Skip if extension not installed (NVIDIA#1263)

* [Docs] Fix spelling errors in comments

* [Docs] Add note about torch pkg and cusparse (NVIDIA#1303)

* [Caffe] Fix batch accumulation bug (NVIDIA#1307)

* Use official NVIDIA model store by default (NVIDIA#1308)

* Mark v5.0.0

* [Packaging] Pull latest docker image before build

* Add .pgm to list of supported image file formats

* Restrict usage of cmap to labels DB in generic dataset exploration

fix NVIDIA#1322

* Update Object Detection example doc (NVIDIA#1323)

* [TravisCI] Cache local OpenBLAS build

This fixes a Torch bug we've been having on Travis for a while now.

We had only been building OpenBLAS from source when there was no cached
torch build present on the build machine. That meant you could get a
cached build of Torch which was built against one version of OpenBLAS,
but the system actually installed an older version. This led to memory
corruption and segmentation faults.

* [Tests] Skip if extension not installed (part 2) (NVIDIA#1337)

* [TravisCI] Install all plugins by default

Also test no plugins

* [Tests] Skip if extension not installed (NVIDIA#1337)

* [Docs] Update model store documentation (NVIDIA#1346)

TODO: add a screenshot of the official model store once approved

* Add steps to specify the Python layer file (NVIDIA#1347)

* [Docs] Install minimal boost libs for caffe

* Remove the selenium walkthrough

* Update copyright year for 2017

* Add a few missing copyright notices

* Fix Siamese example

Broadcast -1 into all elements that equal 0 in original label.

* Fix Siamese example (NVIDIA#1405)

Broadcast -1 into all elements that equal 0 in original label.

* [Packaging] Make nginx site easier to customize

* Fix documentation typo: train.txt and test.txt were swapped and shown
in the wrong folders for the mnist and cifar10 data sets.

* Document a cuDNN workaround for text example (NVIDIA#1422)

* Correct shebang for prepare_pascal_voc_data.sh (NVIDIA#1450)

* [Docs] Document workaround for torch+hdf5 error

* Fix typo in ModelStore.md

* Fix typo in medical-imaging/README.md

* Fix bash lint with shellcheck

* Fix bugs when visiting nested image folder

* Fix shellcheck-related bug in PPA upload script

* Copy labels.txt inside the dataset

Move import to the top

* Fix Distribution Graph

Move backwards-compatibility to setstate

* Fix typo in Sunnybrook plug-in

* Fix a bug introduced when fixing shellcheck lint

* Fix another shellcheck-related bug

* Fix table formatting in README.md

Fix table formatting

* Clamp distance values from segmentation boundaries before being
converted to uint8. That was causing banding in the image because of
wrapping at V % 256

* lint

* [Docs] 5.0 debs and Ubuntu 16.04 support

* WIP lint fix

* Linted most of what I can lint prior to asking for context

* updated the model store urls in the readme

* added debugs in build scripts to understand the point of failure

* added travis wait to install openblas

* removed tensorflow from the build process to see if it affects openblas

* removed suppressing log contents

* added set -x

* fixed control

* re-enabling tensorflow to see if travis builds

* updated the version of numpy to ensure a stable build for travis wrt
open issue 8653 on numpy github

* forcing numpy to v 1.8.1

* added the official store image and updated the documentation (NVIDIA#1650)

* [TravisCI] Add `git fetch --unshallow` for DIST

Useful for TravisCI builds in forks.

* Got travis script to work for tensorflow installation

* Cleaning installation to work with Numpy 1.13 upgrade

removed the open blas stuff that somehow made it into here

embarrassing merge residue

force install specific numpy version because 1.13 was being installed

asdf

trying changing the tensorflow install

reordered the installation order to see if it builds due to TF using
numpy 1.13 now

* Tf example (#3)

* initial work on autoencoder TF example

* Moved the example files to their proper location

* attempting to get autoencoder to work

* autoencoder work

* validated tensorflow autoencoder example

* updated gitignore

* disabled comments in the segmentation-model.lua script to prevent
crashing

* committing the changes made to binary segmentation tf

* adding work to do something else

* I am seriously way too tired to write this commit message, it's just
random bits of stuff

* got binary seg and siamese working

* started to work on the regression network

* milestone

* got regression for TF working

* Got fine tuning to work in TF

* changed the code to the format that is wanted by tim and greg

* Finished all the work for examples

initial work on autoencoder TF example

Moved the example files to their proper location

attempting to get autoencoder to work

autoencoder work

validated tensorflow autoencoder example

updated gitignore

disabled comments in the segmentation-model.lua script to prevent
crashing

committing the changes made to binary segmentation tf

adding work to do something else

I am seriously way too tired to write this commit message, it's just
random bits of stuff

got binary seg and siamese working

rebase

rebase

started to work on the regression network

milestone

got regression for TF working

Got fine tuning to work in TF

changed the code to the format that is wanted by tim and greg

got fine tuning working

* Some small fixes

* changes WRT PR

trying renaming the weights

tested renaming variables

* fixed api problem for multi gpus

* git removed installing tests

* updated most of linting

* Removed unused block of code as per suggestion by Greg

* Removing spaces...

* Tf documentation (#4)

* Worked on Tensorflow docs

* milestone

* changed some typos

* added documentation on how to specify which weights to train

* removed the open blas stuff that somehow made it into here

* embarrassing merge residue

* force install specific numpy version because 1.13 was being installed

* asdf

* trying changing the tensorflow install

* changed docs for freezing variables

* added more to the documentation

* capitalized some letters

* fixed api problem for multi gpus

* fixes to docs WRT PR

* changes WRT PR comments

* added the cudnn versioning problem with tf

* added images for tensorflow image

* updated dl for tensorflow to 1.2

* updated pip command

* Greg gan work (#3)

GAN support for DIGITS

* Tensorflow Work

* Fix visualization when palette is None (NVIDIA#1177)

The palette may be `None` when working with grayscale labels.
Fix NVIDIA#1147

* Bugfix for customizing previous models (NVIDIA#1202)

* [Packaging] Disable tests (NVIDIA#1227)

* [Tests] Skip if extension not installed (NVIDIA#1263)

* [Docs] Fix spelling errors in comments

* [Docs] Add note about torch pkg and cusparse (NVIDIA#1303)

* [Caffe] Fix batch accumulation bug (NVIDIA#1307)

* Use official NVIDIA model store by default (NVIDIA#1308)

* Mark v5.0.0

* [Packaging] Pull latest docker image before build

* bAbI data plug-in

Add utils

Add inference form to bAbI dataset

Allow inference without answer

Allow unknown words in bAbI data plug-in

Fix bAbI plugin Lint errors

* Tensorflow integration updates

Use TFRecords for TF inference

TF: Don't rescale inputs

Fix some TF classification tests

Remove unnecessary print

Fix TF imports when uninstalled

Fix mean image scale

Fix generic model tests

Fix Torch single image inference

Fix inference

TMP TF Lint

Revert changes in digits-lint script

Lint: ignore tensorflow standard examples

More Lint fixes

* Add .pgm to list of supported image file formats

* Restrict usage of cmap to labels DB in generic dataset exploration

fix NVIDIA#1322

* Update Object Detection example doc (NVIDIA#1323)

* [TravisCI] Cache local OpenBLAS build

This fixes a Torch bug we've been having on Travis for a while now.

We had only been building OpenBLAS from source when there was no cached
torch build present on the build machine. That meant you could get a
cached build of Torch which was built against one version of OpenBLAS,
but the system actually installed an older version. This led to memory
corruption and segmentation faults.

* [Tests] Skip if extension not installed (part 2) (NVIDIA#1337)

* [TravisCI] Install all plugins by default

Also test no plugins

* [Tests] Skip if extension not installed (NVIDIA#1337)

* Add gradient hook

* Add memn2n model

* [Docs] Update model store documentation (NVIDIA#1346)

TODO: add a screenshot of the official model store once approved

* Add steps to specify the Python layer file (NVIDIA#1347)

* [Docs] Install minimal boost libs for caffe

* Update memn2n with gradient hooks

* Remove the selenium walkthrough

* GAN example

* Make batch size variable

* Training/inference paths

* Small update to TF 0.12

* Snapshot names, float inference, restore all vars

* Update copyright year for 2017

* Add a few missing copyright notices

* Fix Siamese example

Broadcast -1 into all elements that equal 0 in original label.

* Fix Siamese example (NVIDIA#1405)

Broadcast -1 into all elements that equal 0 in original label.

* [Packaging] Make nginx site easier to customize

* Do not restore global_step or optimizer variables

* Add TB link

* Update GAN network

* Dynamically select inference form

* TF inference: convert images to float

* Update GAN z-gen network

* Small Update model view layout

* Add GAN plug-ins

* Fix documentation typo: train.txt and test.txt were swapped and shown
in the wrong folders for the mnist and cifar10 data sets.

* Update GAN plug-in to create CelebA dataset

* Document a cuDNN workaround for text example (NVIDIA#1422)

* Add ability to show input in ImageOutput extension

* Add all data to raw data view extension

* Add model for CelebA dataset

* Update GAN data plug-in

* Update all losses in one session

* Remove conversion to .png in GAN data plug-in

* Correct shebang for prepare_pascal_voc_data.sh (NVIDIA#1450)

* [Docs] Document workaround for torch+hdf5 error

* Fix typo in ModelStore.md

* Fix typo in medical-imaging/README.md

* TF Slim Lenet example

Divide input by 255

* Update GAN data plug-in

* Fix TF model snapshot

* Reduce scheduler delays to speed up inference

* Update GAN plugins

* Fix TF tests

* Add API to LmdbReader (used by gan_features.py)

* Save animated gif

* Add GAN walk-through

* Update GAN walkthrough with embeddings video

* Fix GAN view for list encoding

* Fix bash lint with shellcheck

* Fix bugs when visiting nested image folder

* Add animation task to GAN plugins

* Fix shellcheck-related bug in PPA upload script

* Add view task to see image attributes

* Copy labels.txt inside the dataset

Move import to the top

* Fix Distribution Graph

Move backwards-compatibility to setstate

* Fix typo in Sunnybrook plug-in

* Add comments to GAN models

* Update README

* Fix GAN features script

* Fix a bug introduced when fixing shellcheck lint

* GAN app

* Fix another shellcheck-related bug

* Fix table formatting in README.md

Fix table formatting

* Fix DIGITS inference

* Adjust GAN window size automatically

* Add attributes to GAN app

* Move gandisplay.py

* Remove wxpython 3.0 selection

* Fix call to model

* Clamp distance values from segmentation boundaries before being
converted to uint8. That was causing banding in the image because of
wrapping at V % 256

* lint

* [Docs] 5.0 debs and Ubuntu 16.04 support

* Adding disclaimer

* Display the filename of the image that caused the exception while
loading.

* Ported DIGITS to using tensorflow 1.1.0.

* Ported DIGITS to using tensorflow 1.1.0.
Got master branch working

* Fix softmax visualization by scaling to image range

* added the official store image and updated the documentation

* added the official store image and updated the documentation (NVIDIA#1650)

* [TravisCI] Add `git fetch --unshallow` for DIST

Useful for TravisCI builds in forks.

* updated gitignore

* first cherrypick for installation scripts

* Tf install experimental (#2)

* Fix visualization when palette is None (NVIDIA#1177)

The palette may be `None` when working with grayscale labels.
Fix NVIDIA#1147

* Bugfix for customizing previous models (NVIDIA#1202)

* [Packaging] Disable tests (NVIDIA#1227)

* [Tests] Skip if extension not installed (NVIDIA#1263)

* [Docs] Fix spelling errors in comments

* [Docs] Add note about torch pkg and cusparse (NVIDIA#1303)

* [Caffe] Fix batch accumulation bug (NVIDIA#1307)

* Use official NVIDIA model store by default (NVIDIA#1308)

* Mark v5.0.0

* [Packaging] Pull latest docker image before build

* Add .pgm to list of supported image file formats

* Restrict usage of cmap to labels DB in generic dataset exploration

fix NVIDIA#1322

* Update Object Detection example doc (NVIDIA#1323)

* [TravisCI] Cache local OpenBLAS build

This fixes a Torch bug we've been having on Travis for a while now.

We had only been building OpenBLAS from source when there was no cached
torch build present on the build machine. That meant you could get a
cached build of Torch which was built against one version of OpenBLAS,
but the system actually installed an older version. This led to memory
corruption and segmentation faults.

* [Tests] Skip if extension not installed (part 2) (NVIDIA#1337)

* [TravisCI] Install all plugins by default

Also test no plugins

* [Tests] Skip if extension not installed (NVIDIA#1337)

* [Docs] Update model store documentation (NVIDIA#1346)

TODO: add a screenshot of the official model store once approved

* Add steps to specify the Python layer file (NVIDIA#1347)

* [Docs] Install minimal boost libs for caffe

* Remove the selenium walkthrough

* Update copyright year for 2017

* Add a few missing copyright notices

* Fix Siamese example

Broadcast -1 into all elements that equal 0 in original label.

* Fix Siamese example (NVIDIA#1405)

Broadcast -1 into all elements that equal 0 in original label.

* [Packaging] Make nginx site easier to customize

* Fix documentation typo. train.txt and test.txt were swapped and shown
in the wrong folders for mnist and cifar10 data sets.

* Document a cuDNN workaround for text example (NVIDIA#1422)

* Correct shebang for prepare_pascal_voc_data.sh (NVIDIA#1450)

* [Docs] Document workaround for torch+hdf5 error

* Fix typo in ModelStore.md

* Fix typo in medical-imaging/README.md

* Fix bash lint with shellcheck

* Fix bugs when visiting nested image folder

* Fix shellcheck-related bug in PPA upload script

* Copy labels.txt inside the dataset

Move import to the top

* Fix Distribution Graph

Move backwards-compatibility to setstate

* Fix typo in Sunnybrook plug-in

* Fix a bug introduced when fixing shellcheck lint

* Fix another shellcheck-related bug

* Fix table formatting in README.md

Fix table formatting

* Clamp distance values from segmentation boundaries before being
converted to uint8. That was causing banding in the image because of
wrapping at V % 256

* lint

* [Docs] 5.0 debs and Ubuntu 16.04 support

* WIP lint fix

* Linted most of what I can lint prior to asking for context

* updated the model store urls in the readme

* added debugs in build scripts to understand the point of failure

* added travis wait to install openblas

* removed tensorflow from the build process to see if it affects openblas

* removed suppressing log contents

* added set -x

* fixed control

* re-enabling tensorflow to see if travis builds

* updated the version of numpy to ensure a stable build for travis wrt
to open issue 8653 on numpy github

* forcing numpy to v 1.8.1

* added the official store image and updated the documentation (NVIDIA#1650)

* [TravisCI] Add `git fetch --unshallow` for DIST

Useful for TravisCI builds in forks.

* Got travis script to work for tensorflow installation

* removed the open blas stuff that somehow made it into here

* embarrassing merge residue

* force install specific numpy version because 1.13 was being installed

* asdf

* trying changing the tensorflow install

* reordered the installation order to see if it builds due to TF using
numpy 1.13 now

* Cleaning installation to work with Numpy 1.13 upgrade

removed the open blas stuff that somehow made it into here

embarrassing merge residue

force install specific numpy version because 1.13 was being installed

asdf

trying changing the tensorflow install

reordered the installation order to see if it builds due to TF using
numpy 1.13 now

* Tf example (#3)

* initial work on autoencoder TF example

* Moved the example files to their proper location

* attempting to get autoencoder to work

* autoencoder work

* validated tensorflow autoencoder example

* updated gitignore

* disabled comments in the segmentation-model.lua script to prevent
crashing

* committing the changes made to binary segmentation tf

* adding work to do something else

* I am seriously wayy too tired to write this commit message, it's just
random bits of stuff

* got binary seg and siamese working

* started to work on the regression network

* milestone

* got regression for TF working

* Got fine tuning to work in TF

* changed the code to the format that is wanted by tim and greg

* Finished all the work for examples

initial work on autoencoder TF example

Moved the example files to their proper location

attempting to get autoencoder to work

autoencoder work

validated tensorflow autoencoder example

updated gitignore

disabled comments in the segmentation-model.lua script to prevent
crashing

committing the changes made to binary segmentation tf

adding work to do something else

I am seriously wayy too tired to write this commit message, it's just
random bits of stuff

got binary seg and siamese working

rebase

rebase

started to work on the regression network

milestone

got regression for TF working

Got fine tuning to work in TF

changed the code to the format that is wanted by tim and greg

got fine tuning working

* Some small fixes

* changes WRT PR

trying renaming the weights

tested renaming variables

* fixed api problem for multi gpus

* changes to example documentation

* git removed installing tests

* updated most of linting

* Removed unused block of code as per suggestion by Greg

* Removing spaces...

* Tf documentation (#4)

* Worked on Tensorflow docs

* milestone

* changed some typos

* added into the documentation for how to specify which weights to train

* removed the open blas stuff that somehow made it into here

* embarrassing merge residue

* force install specific numpy version because 1.13 was being installed

* asdf

* trying changing the tensorflow install

* changed docs for freezing variables

* added more to the documentation

* capitalized some letters

* fixed api problem for multi gpus

* fixes to docs WRT to PR

* changes WRT to PR comments

* added the cudnn versioning problem with tf

* added images for tensorflow image

* updated dl for tensorflow to 1.2

* updated pip command

fixed linting

removed debug lines in scripts

* cleaning up residues from the travis script

* fixed a broken link

* cleaned up more residue

* somehow openblas made it through merge

* lint fix

* updated documentation for using tensorboard

* added warnings for using tensorboard not on chrome

* changed to using bootbox.alert()

* initial commit for googlenet implementation

* milestone on inception module

* finished googlenet inference

* finished googlenet and refactored a bit of other networks

* committing to test this at the office

* fixed googlenet to get it working

* somehow a bad version went through

* switching to documentation

* removed softmax before loss in googlenet

* initial prototype fix

* updated gitignore

* updated googlenet with the best working model and add a note about the
auxiliary branches

* lint

* initial prototype fix

* lint

* removed debug prints

* lint

* Tf gans review (NVIDIA#9)

* initial commit for googlenet implementation

* milestone on inception module

* finished googlenet inference

* finished googlenet and refactored a bit of other networks

* committing to test this at the office

* fixed googlenet to get it working

* somehow a bad version went through

* switching to documentation

* removed softmax before loss in googlenet

* initial prototype fix

* updated gitignore

* a typo made it through

* edited the gan examples to be compatible with TF 1.2

* lint

* added to test optimizers other than sgd

* pointed the celeba dataset to its main page

* removed a tf-events file

* Googlenet implementation

initial commit for googlenet implementation

milestone on inception module

finished googlenet inference

finished googlenet and refactored a bit of other networks

committing to test this at the office

fixed googlenet to get it working

somehow a bad version went through

switching to documentation

removed softmax before loss in googlenet

initial prototype fix

updated gitignore

updated googlenet with the best working model and add a note about the
auxiliary branches

* Updated tensorboard documentation

updated documentation for using tensorboard

added warnings for using tensorboard not on chrome

changed to using bootbox.alert()

* Fixed linting

lint

initial prototype fix

lint

removed debug prints

lint

* Tf gans review (NVIDIA#9)

* initial commit for googlenet implementation

* milestone on inception module

* finished googlenet inference

* finished googlenet and refactored a bit of other networks

* committing to test this at the office

* fixed googlenet to get it working

* somehow a bad version went through

* switching to documentation

* removed softmax before loss in googlenet

* initial prototype fix

* updated gitignore

* a typo made it through

* edited the gan examples to be compatible with TF 1.2

* lint

* added to test optimizers other than sgd

* pointed the celeba dataset to its main page

* merged rebase changes from development repo

* removing ADAM tests for caffe and torch due to incompatibility

* readded adam tests but commented out torch due to tuning issues

* set version to 6.0

fixed linting and version number

reverting back to 5.1-dev for version
ethantang95 committed Jul 6, 2017
1 parent 327887a commit fe66926
Showing 369 changed files with 1,918 additions and 1,212 deletions.
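
One fix listed in the commit message above clamps distance values from segmentation boundaries before they are converted to uint8, because out-of-range values wrapped at V % 256 and produced banding. The sketch below illustrates that difference in plain Python. It is not the DIGITS code: the function names and the sample distances are hypothetical, and the modulo stands in for an unchecked cast to uint8.

```python
def to_uint8_wrapping(values):
    """Mimic an unchecked cast to uint8: values wrap modulo 256."""
    return [int(v) % 256 for v in values]


def to_uint8_clamped(values, low=0, high=255):
    """Clamp each value into [low, high] first, so nothing can wrap."""
    return [int(min(max(v, low), high)) for v in values]


# Hypothetical distances (in pixels) from a segmentation boundary.
distances = [0, 100, 255, 256, 300, 511, 512]

wrapped = to_uint8_wrapping(distances)
clamped = to_uint8_clamped(distances)

print(wrapped)  # [0, 100, 255, 0, 44, 255, 0]
print(clamped)  # [0, 100, 255, 255, 255, 255, 255]
```

With these inputs the wrapped list drops back to 0 at 256 and again at 512, which is the banding the commit message describes; the clamped list saturates at 255 instead.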
13 changes: 13 additions & 0 deletions .gitignore
@@ -17,3 +17,16 @@ TAGS
 /build/
 /dist/
 *.egg-info/
+
+#Intellij files
+.idea/
+
+#vscode
+.vscode/
+
+#.project
+.project
+/.project
+
+#.tb
+.tb/
11 changes: 5 additions & 6 deletions .travis.yml
@@ -1,4 +1,4 @@
-# Copyright (c) 2015-2016, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2015-2017, NVIDIA CORPORATION. All rights reserved.

 os: linux
 dist: trusty
@@ -10,7 +10,6 @@ env:
 global:
 - CAFFE_ROOT=~/caffe
 - TORCH_ROOT=~/torch
-# Fixes for Torch and OpenBLAS
 - OMP_NUM_THREADS=1
 - OPENBLAS_MAIN_FREE=1
 - secure: "WSqrE+PQm76DdoRLRGKTK6fRWfXZjIb0BWCZm3IgHgFO7OE6fcK2tBnpDNNw4XQjmo27FFWlEhxN32g18P84n5PvErHaH65IuS9Nv6FkLlPXZlVqGNxbPmEA4oTkD/6Y6kZyZWZtLh2+/1ijuzQAPnIy/4BEuL8pdO+PsoJ9hYM="
@@ -20,6 +19,7 @@ env:
 - DIGITS_TEST_FRAMEWORK=torch
 - DIGITS_TEST_FRAMEWORK=tensorflow
 - DIGITS_TEST_FRAMEWORK=none
+- DIGITS_TEST_FRAMEWORK=none WITH_PLUGINS=false

 matrix:
 include:
@@ -43,6 +43,7 @@ matrix:
 - dput
 - gnupg
 install:
+- git fetch --unshallow
 - git remote add nvidia-digits-upstream https://github.com/NVIDIA/DIGITS.git # for forks
 - git fetch nvidia-digits-upstream --tags
 - pip install twine
@@ -130,13 +131,11 @@ install:
 - echo "backend:agg" > ~/.config/matplotlib/matplotlibrc
 - ./scripts/travis/install-caffe.sh $CAFFE_ROOT
 - if [ "$DIGITS_TEST_FRAMEWORK" == "torch" ]; then travis_wait ./scripts/travis/install-torch.sh $TORCH_ROOT; else unset TORCH_ROOT; fi
-- pip install -r ./requirements.txt --force-reinstall
 - if [ "$DIGITS_TEST_FRAMEWORK" == "tensorflow" ]; then travis_wait ./scripts/travis/install-tensorflow.sh; fi
 - pip install -r ./requirements.txt
 - pip install -r ./requirements_test.txt
 - pip install -e .
-- pip install -e ./plugins/data/imageGradients
-- pip install -e ./plugins/view/imageGradients
+- if [ "$WITH_PLUGINS" != "false" ]; then find ./plugins/*/* -maxdepth 0 -type d | xargs -n1 pip install -e; fi

 script:
 - ./digits-test -v
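
The plugin-install step in the install section above can be read as: expand every entry two levels below ./plugins, keep only the directories, and run `pip install -e` once per directory unless WITH_PLUGINS is set to "false". A rough Python equivalent follows, for illustration only. The function names are hypothetical, the temporary directory tree is made up for the demo, and the commands are built but never executed.

```python
import glob
import os
import tempfile


def find_plugin_dirs(root):
    """Return the directories exactly two levels below <root>/plugins, sorted.

    Approximates `find ./plugins/*/* -maxdepth 0 -type d`: the glob expands
    to every entry two levels down and the filter keeps only directories.
    Note: os.path.isdir follows symlinks, which `find -type d` does not.
    """
    pattern = os.path.join(root, "plugins", "*", "*")
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


def pip_commands(plugin_dirs, with_plugins=None):
    """Build one `pip install -e <dir>` command per plugin, like `xargs -n1`.

    Mirrors the shell test `[ "$WITH_PLUGINS" != "false" ]`: plugins are
    skipped only when the value is exactly the string "false".
    """
    if with_plugins == "false":
        return []
    return [["pip", "install", "-e", d] for d in plugin_dirs]


# Demo on a throwaway tree that imitates the two plugin paths in the diff.
with tempfile.TemporaryDirectory() as root:
    for sub in (("data", "imageGradients"), ("view", "imageGradients")):
        os.makedirs(os.path.join(root, "plugins", *sub))
    # A plain file at the same depth is ignored by the directory filter.
    open(os.path.join(root, "plugins", "data", "README.md"), "w").close()

    dirs = find_plugin_dirs(root)
    demo_commands = pip_commands(dirs, os.environ.get("DEMO_WITH_PLUGINS"))
    skipped = pip_commands(dirs, with_plugins="false")

for cmd in demo_commands:
    print(" ".join(cmd))
```

Discovering the plugin directories rather than listing them means a newly added plugin is picked up by the build without another edit to the CI file, which appears to be the motivation for replacing the two hard-coded imageGradients lines.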

2 changes: 1 addition & 1 deletion LICENSE
@@ -1,4 +1,4 @@
-Copyright (c) 2014-2016, NVIDIA CORPORATION. All rights reserved.
+Copyright (c) 2014-2017, NVIDIA CORPORATION. All rights reserved.

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
5 changes: 4 additions & 1 deletion README.md
@@ -4,11 +4,13 @@

 DIGITS (the **D**eep Learning **G**PU **T**raining **S**ystem) is a webapp for training deep learning models.

+The currently supported frameworks are: Caffe 1, Torch, and Tensorflow
+
 # Installation

 | Installation method | Supported platform[s] | Available versions | Instructions |
 | --- | --- | --- | --- |
-| Deb packages | Ubuntu 14.04 | [14.04 repo](http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1404/x86_64) | [docs/UbuntuInstall.md](docs/UbuntuInstall.md) |
+| Deb packages | Ubuntu 14.04, 16.04 | [14.04 repo](http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1404/x86_64), [16.04 repo](http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64) | [docs/UbuntuInstall.md](docs/UbuntuInstall.md) |
 | Docker | Linux | [DockerHub tags](https://hub.docker.com/r/nvidia/digits/tags/) | [nvidia-docker wiki](https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS) |
 | Source | Ubuntu 14.04, 16.04 | [GitHub tags](https://github.com/NVIDIA/DIGITS/releases) | [docs/BuildDigits.md](docs/BuildDigits.md) |

@@ -18,6 +20,7 @@ Once you have installed DIGITS, visit [docs/GettingStarted.md](docs/GettingStart

 Then, take a look at some of the other documentation at [docs/](docs/) and [examples/](examples/):

+* [Getting started with TensorFlow](docs/GettingStartedTensorflow.md)
 * [Getting started with Torch](docs/GettingStartedTorch.md)
 * [Fine-tune a pretrained model](examples/fine-tuning/README.md)
 * [Train an autoencoder network](examples/autoencoder/README.md)
2 changes: 1 addition & 1 deletion digits-devserver
@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2016-2017, NVIDIA CORPORATION. All rights reserved.

 set -e

6 changes: 3 additions & 3 deletions digits-lint
@@ -1,13 +1,13 @@
 #!/bin/bash
-# Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2016-2017, NVIDIA CORPORATION. All rights reserved.

 set -e

 echo "=== Checking for Python lint ..."
 if which flake8 >/dev/null 2>&1; then
-python2 `which flake8` .
+python2 `which flake8` --exclude ./examples,./digits/standard-networks/tensorflow,./digits/jobs .
 else
-python2 -m flake8 .
+python2 -m flake8 --exclude ./examples,./digits/standard-networks/tensorflow,./digits/jobs .
 fi

 echo "=== Checking for JavaScript lint ..."
2 changes: 1 addition & 1 deletion digits-test
@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright (c) 2014-2016, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2014-2017, NVIDIA CORPORATION. All rights reserved.

 set -e

240 changes: 0 additions & 240 deletions digits-walkthrough

This file was deleted.

2 changes: 1 addition & 1 deletion digits/__init__.py
@@ -1,4 +1,4 @@
-# Copyright (c) 2014-2016, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2014-2017, NVIDIA CORPORATION. All rights reserved.
 from __future__ import absolute_import

 from .version import __version__
2 changes: 1 addition & 1 deletion digits/__main__.py
@@ -1,4 +1,4 @@
-# Copyright (c) 2014-2016, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2014-2017, NVIDIA CORPORATION. All rights reserved.

 import argparse
 import os.path
2 changes: 1 addition & 1 deletion digits/config/__init__.py
@@ -1,4 +1,4 @@
-# Copyright (c) 2015-2016, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2015-2017, NVIDIA CORPORATION. All rights reserved.
 from __future__ import absolute_import

 # Create this object before importing the following imports, since they edit the list
2 changes: 1 addition & 1 deletion digits/config/caffe.py
@@ -1,4 +1,4 @@
-# Copyright (c) 2015-2016, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2015-2017, NVIDIA CORPORATION. All rights reserved.
 from __future__ import absolute_import

 import imp
2 changes: 1 addition & 1 deletion digits/config/gpu_list.py
@@ -1,4 +1,4 @@
-# Copyright (c) 2015-2016, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2015-2017, NVIDIA CORPORATION. All rights reserved.
 from __future__ import absolute_import

 from . import option_list