
Add FP16 support to batchedNMSPlugin #1002

Merged · 5 commits · Jan 14, 2021

Conversation

@pbridger (Contributor)

Add support for float16 boxes and scores (but not mixed precision between boxes and scores).

Gives inference results within expected numerical tolerances using SSD300 and the COCO2017 validation set, as detailed here: https://paulbridger.com/posts/tensorrt-object-detection-quantized/.

Despite the "#if CUDA_ARCH >= 800" guard, this has only been tested on the Turing architecture, not on Ampere.
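The tolerance claim above can be illustrated with a small check (a hypothetical sketch, not the PR's actual validation harness; the score values and `atol` bound are invented for illustration):

```python
# Hypothetical sketch: verify that detection scores from an FP16 path match
# an FP32 reference within tolerance. Not the plugin's actual test code.
import numpy as np

def scores_match(ref_fp32, test_scores, atol=1e-3):
    """float16 has ~10 mantissa bits, so 1e-3 is a loose bound for scores in [0, 1]."""
    return np.allclose(ref_fp32, np.asarray(test_scores, dtype=np.float32), atol=atol)

ref = np.array([0.91, 0.507, 0.133], dtype=np.float32)   # invented scores
half = ref.astype(np.float16)                            # simulate the FP16 path
print(scores_match(ref, half))                           # → True
```

Rounding a score in [0, 1] to float16 introduces at most a few times 1e-4 of error, so a well-behaved FP16 path stays comfortably inside this bound.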

@rajeevsrao (Collaborator)

Thanks @pbridger. Can you please sign your commits with `git commit --amend -s`?
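For reference, the sign-off flow looks like this (a sketch: `-s` appends a `Signed-off-by` trailer to the commit message, and the rebase variant covers the case where several commits need one, this PR having 5):

```shell
# Add a Signed-off-by trailer to the most recent commit, keeping its message
git commit --amend -s --no-edit

# If several commits need sign-off, rebase over them (here: the last 5)
git rebase --signoff HEAD~5

# Rewritten commits must be force-pushed to update the PR branch
git push --force-with-lease
```

`--force-with-lease` is preferred over plain `--force` because it refuses to overwrite remote commits you haven't seen.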

@pranavm-nvidia (Collaborator)

@rajeevsrao Does this need to be integrated into master as well?

@rajeevsrao (Collaborator)

> @rajeevsrao Does this need to be integrated into master as well?

Yes, but we will cherry-pick it internally first, test and release it to master via a 21.0x container, and then merge this change.

@rajeevsrao (Collaborator)

Thanks @pbridger. Will merge this commit once the corresponding cherry-pick for master is posted along with the 21.02 container update.

@rajeevsrao rajeevsrao merged commit 7ca28ec into NVIDIA:release/7.2 Jan 14, 2021
@pbridger (Contributor, Author)

Cool! Many thanks @rajeevsrao and @pranavm-nvidia for all your work to include this.
