Hi, is this really 2.5x faster than LoFTR? With the `opt` fp16 setting I measure around 70 ms, while LoFTR takes around 180 ms, on a GTX 2080 Super with CUDA 11.7.
Are there any other parameters I can tune to optimize further?
Hi, this result matches expectations (180 ms / 70 ms ≈ 2.57x). For further tuning, you can enable Flash Attention (`cfg.LOFTR.COARSE.NO_FLASH = False`). In addition, if latency is not critical and the input image resolution is low (you can check this by whether GPU utilization stays near 100% during the forward pass), you can increase the batch size to 2^N (N >= 1) to get a roughly proportional gain in throughput.
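As a minimal sketch of the two suggestions above (Flash Attention via the config flag, plus batched fp16 inference), something like the following could be used. The import paths, constructor signature, and output key names are assumptions based on the LoFTR-style repository layout and may need to be adapted; only `cfg.LOFTR.COARSE.NO_FLASH` is taken from the reply itself.

```python
import torch
from src.config.default import get_cfg_defaults  # assumed config entry point
from src.loftr import LoFTR                       # assumed model class

cfg = get_cfg_defaults()
cfg.LOFTR.COARSE.NO_FLASH = False  # enable Flash Attention (flag from the reply above)

matcher = LoFTR(config=cfg).eval().cuda()  # exact construction may differ per repo version

# Batch B image pairs together (B = 2**N, N >= 1) when latency is not critical
# and the GPU is under-utilized at batch size 1.
B = 4
img0 = torch.rand(B, 1, 480, 640, device="cuda")  # grayscale images, placeholder data
img1 = torch.rand(B, 1, 480, 640, device="cuda")

with torch.no_grad(), torch.autocast("cuda", dtype=torch.float16):  # fp16 inference
    batch = {"image0": img0, "image1": img1}
    matcher(batch)  # LoFTR-style models write results into `batch` in place
    # Matched keypoints / confidences (key names assumed from the LoFTR convention):
    kpts0, kpts1, conf = batch["mkpts0_f"], batch["mkpts1_f"], batch["mconf"]
```

Doubling the batch size roughly doubles throughput only while the GPU has headroom; once utilization is already near 100% at batch size 1, batching mainly adds latency.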