Speed comparison #22

Open
yohnyang opened this issue Jun 7, 2024 · 1 comment

Comments

yohnyang commented Jun 7, 2024

Hi, is this really 2.5x faster than LoFTR? With the opt fp16 settings I measure about 70 ms, versus about 180 ms for LoFTR, on GTX2080super + cuda117.

Are there any other parameters I can tune to speed it up further?

Contributor

wyf2020 commented Jun 7, 2024

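Hi, this result is as expected (180 ms / 70 ms = 2.57x). To optimize further, you can enable Flash Attention (cfg.LOFTR.COARSE.NO_FLASH=False). In addition, if latency is not critical and the input image resolution is low, you can set batch size = 2^N (N >= 1) to get a multiplicative increase in throughput. To judge whether the resolution is low enough, check GPU utilization during forward: if it is not close to 100%, the GPU has spare capacity that a larger batch can use.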
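For anyone reproducing these numbers, here is a minimal sketch of the arithmetic and of a timing loop, using only the Python standard library. The helper names (`speedup`, `throughput_pairs_per_s`, `median_latency_ms`) and the `run_once` callable are hypothetical and are not part of this repository; `run_once` is assumed to wrap a single forward pass and to block until the GPU has finished.

```python
import time
from statistics import median
from typing import Callable


def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """Ratio of baseline latency to optimized latency."""
    return baseline_ms / optimized_ms


def throughput_pairs_per_s(batch_size: int, batch_latency_ms: float) -> float:
    """Image pairs processed per second, given the latency of one batched forward pass."""
    return batch_size * 1000.0 / batch_latency_ms


def median_latency_ms(run_once: Callable[[], None], warmup: int = 3, iters: int = 10) -> float:
    """Median wall-clock time of `run_once` in milliseconds.

    `run_once` is a hypothetical callable wrapping one forward pass. For GPU
    work it must wait for the device to finish (for example by calling
    torch.cuda.synchronize() inside it); otherwise only the kernel launch
    time is measured.
    """
    for _ in range(warmup):  # warm-up runs are not timed
        run_once()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return median(samples)


# Latencies reported in this thread: LoFTR ~180 ms, this model ~70 ms.
print(f"speedup: {speedup(180.0, 70.0):.2f}x")  # speedup: 2.57x
print(f"batch 1 throughput: {throughput_pairs_per_s(1, 70.0):.1f} pairs/s")  # 14.3 pairs/s
```

To check whether a larger batch pays off, time the forward pass at batch sizes 1, 2, 4, and so on with `median_latency_ms` and compare `throughput_pairs_per_s` across them. Throughput should keep rising until the GPU is saturated.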
