- TensorRT 5.0 GA
- TensorFlow with GPU support
- PyCUDA
- Python3
- CMake (>= 3.8)
The steps below assume that the PB file is located at <path to this project>/data and is named se-resnext.pb.
$ cd <path to this project>/data
$ python3 <path to uff-converter-tf>/convert_to_uff.py <your PB file> -p preprocess.py
For instance:
$ cd data
$ python3 /usr/local/lib/python3.5/dist-packages/uff/bin/convert_to_uff.py se-resnext.pb -p preprocess.py
# or
$ python3 /usr/lib/python3.5/dist-packages/uff/bin/convert_to_uff.py se-resnext.pb -p preprocess.py
You should get a UFF file in the data folder; for the example above it would be named se-resnext.uff.
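The two example paths above differ only in where the `uff` package was installed. As a minimal sketch, assuming the converter always sits at `uff/bin/convert_to_uff.py` inside one of the directories on `sys.path` (as in both examples), a small helper can locate it. The function name `find_convert_to_uff` is made up for illustration and is not part of the `uff` package.

```python
import os
import sys


def find_convert_to_uff(search_roots=None):
    """Return the first uff/bin/convert_to_uff.py found under the given roots, or None.

    Assumption: the converter lives at <root>/uff/bin/convert_to_uff.py,
    matching the two example paths in this README.
    """
    if search_roots is None:
        # Default to the interpreter's import path, skipping empty entries.
        search_roots = [p for p in sys.path if p]
    for root in search_roots:
        candidate = os.path.join(root, "uff", "bin", "convert_to_uff.py")
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    path = find_convert_to_uff()
    print(path or "convert_to_uff.py not found on sys.path")
```

If a path is printed, it can be used in place of `<path to uff-converter-tf>/convert_to_uff.py` in the command above.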
$ cd <path to this project>/verification
$ python3 tf_sample.py
$ cd <path to this project>/verification
$ python3 trt_sample.py
The results of the two scripts should be the same.
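If the outputs need to be compared programmatically rather than by eye, a check along the following lines may help. This is only a sketch: it assumes each script's softmax output has been collected into a plain list of floats, which this README does not state, and the name `compare_outputs` and the tolerance of `1e-4` are arbitrary choices for illustration. A tolerance is used in case small floating-point differences appear between the two runtimes.

```python
def compare_outputs(tf_probs, trt_probs, abs_tol=1e-4):
    """Compare two softmax vectors given as equal-length lists of floats.

    Hypothetical helper: how the vectors are obtained from tf_sample.py
    and trt_sample.py is left to the reader.
    """
    if not tf_probs or len(tf_probs) != len(trt_probs):
        raise ValueError("outputs must be non-empty and of equal length")
    # Largest element-wise absolute difference between the two vectors.
    max_abs_diff = max(abs(a - b) for a, b in zip(tf_probs, trt_probs))
    # Index of the highest-probability class in each vector.
    top1_tf = max(range(len(tf_probs)), key=tf_probs.__getitem__)
    top1_trt = max(range(len(trt_probs)), key=trt_probs.__getitem__)
    return {
        "same_top1": top1_tf == top1_trt,
        "max_abs_diff": max_abs_diff,
        "within_tol": max_abs_diff <= abs_tol,
    }


if __name__ == "__main__":
    # Made-up numbers, only to show the call.
    print(compare_outputs([0.1, 0.7, 0.2], [0.1, 0.70001, 0.19999]))
```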
$ cd <path to this project>/data
$ <path to TensorRT>/trtexec --uff=<your UFF file> --output=softmax --uffInput=<input name>,3,224,224 --batch=<batch size>
For instance:
$ cd data
$ /usr/src/tensorrt/bin/trtexec --uff=se-resnext.uff --output=softmax --uffInput=tf_feed_image,3,224,224 --batch=32
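When trying several batch sizes, it can be convenient to assemble the `trtexec` command from its parts. The sketch below uses only the flags shown above (`--uff`, `--output`, `--uffInput`, `--batch`); the function name `build_trtexec_command` and its default values are assumptions made for illustration.

```python
import shlex


def build_trtexec_command(trtexec_path, uff_file, input_name,
                          output_name="softmax",
                          input_shape=(3, 224, 224),
                          batch_size=1):
    """Build the trtexec argument list used in this README (hypothetical helper)."""
    # --uffInput expects "<input name>,C,H,W".
    dims = ",".join(str(d) for d in input_shape)
    return [
        trtexec_path,
        "--uff={}".format(uff_file),
        "--output={}".format(output_name),
        "--uffInput={},{}".format(input_name, dims),
        "--batch={}".format(batch_size),
    ]


if __name__ == "__main__":
    cmd = build_trtexec_command(
        "/usr/src/tensorrt/bin/trtexec",
        "se-resnext.uff",
        "tf_feed_image",
        batch_size=32,
    )
    # Print a copy-pastable shell command.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from inside the data folder to launch the benchmark directly.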
$ mkdir <path to this project>/build
$ cd <path to this project>/build
$ cmake ..
$ make -j2
$ cd <path to this project>/build
# The first run is slow because it generates the TensorRT engine binary
$ ./trt_se_resnext
# The second run reuses the TensorRT engine binary, so initialization is much faster
$ ./trt_se_resnext
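The speed-up on the second run comes from a build-once, reuse-later cache. The Python sketch below illustrates that pattern only; it is not the project's C++ code. The cache file name and `build_fn` are hypothetical stand-ins for wherever the executable stores its engine and for the slow TensorRT build and serialization step.

```python
import os


def load_or_build_engine(cache_path, build_fn):
    """Return (engine_bytes, was_cached).

    build_fn is a hypothetical callable that performs the slow build
    and returns the serialized engine as bytes.
    """
    if os.path.isfile(cache_path):
        # Fast path: reuse the engine serialized by an earlier run.
        with open(cache_path, "rb") as f:
            return f.read(), True
    # Slow path: build once, then persist for later runs.
    engine_bytes = build_fn()
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(engine_bytes)
    # Rename into place so an interrupted write does not leave a partial cache file.
    os.replace(tmp_path, cache_path)
    return engine_bytes, False
```

Deleting the cached file forces a rebuild on the next run, which is generally needed after changing the model, since the cached engine was built from the old one.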
- CUDA 10
- cuDNN 7.3.1
- TensorRT 5.0 GA
- TensorFlow container 18.11-py3 from NGC
- Ubuntu 16.04
- CMake 3.8